david > 18-12-2020, 01:06 AM
Quote:the Curve-Line System is an intentional feature of the text design, and the text of the Voynich Manuscript is a highly artificial system.
Quote:4.2.3 Results
Currier language A
There are 4,040 invalid words in a text of 10,645. That is 37.95% of the total.
The body of 4,040 invalid words is comprised of 1,328 distinct words, a ratio of 32.87 (to 2 dp).
Of these 1,328 distinct words, 793 (59.71% to 2dp) are ungrammatical by one inconsistency, 404 (30.42% to 2dp) are ungrammatical by two inconsistencies, 102 (7.68% to 2dp) by three inconsistencies, 26 by four inconsistencies and 3 by five inconsistencies.
Currier language B
There are 5,870 invalid words in a text of 20,969. That is 27.99% of the total.
The body of 5,870 invalid words is comprised of 1,877 distinct words, a ratio of 31.98 (to 2 dp).
Of these 1,877 distinct words, 1,116 (59.46% to two decimal places) are ungrammatical by one inconsistency, 606 (32.29%) are ungrammatical by two inconsistencies, 128 (6.82% to 2dp) are ungrammatical by three inconsistencies, 24 by four inconsistencies and 3 by five inconsistencies.
Total
Out of a total of 31,614 words tested, 9,910 are invalid. That is 31.35% of the total. The total of unique aberrant words across the whole corpus has not been tested.
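For anyone who wants to re-check the arithmetic in the quoted results, here is a minimal Python sketch that recomputes the percentages from the raw counts. The counts are copied from the quote above; the dictionary layout and the helper name `pct` are just choices made for this illustration, not anything from the paper.

```python
# Counts copied from the quoted results; no new data is introduced here.
results = {
    "A": {"invalid": 4040, "total": 10645, "distinct": 1328,
          "by_inconsistencies": [793, 404, 102, 26, 3]},
    "B": {"invalid": 5870, "total": 20969, "distinct": 1877,
          "by_inconsistencies": [1116, 606, 128, 24, 3]},
}

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 dp."""
    return round(100 * part / whole, 2)

for lang, r in results.items():
    # The per-inconsistency counts should add up to the distinct-word count.
    assert sum(r["by_inconsistencies"]) == r["distinct"]
    print(lang, "invalid share:", pct(r["invalid"], r["total"]))
    print(lang, "distinct/invalid:", pct(r["distinct"], r["invalid"]))
    print(lang, "breakdown:",
          [pct(n, r["distinct"]) for n in r["by_inconsistencies"]])

# Combined figures across both Currier languages.
total_invalid = sum(r["invalid"] for r in results.values())
total_words = sum(r["total"] for r in results.values())
print("overall:", total_invalid, "of", total_words,
      "=", pct(total_invalid, total_words))
```

The internal `assert` is a quick consistency check: in both languages the five per-inconsistency counts sum exactly to the stated number of distinct invalid words.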
Quote:Manually skimming through the list of non-conforming words, David noted that almost half had “l” as the first letter. Looked like a good place to start, so I tested word-conformance rate across different beginnings of all words in the manuscript text. Most were above 90%, but there were exceptions: words starting with “l” were about 14.7% conforming and those starting with “r” were 40.8% conforming.
Could this be explained by the idea that “l” and “r” can be prefixed to a word arbitrarily? Turns out that words with these prefixes are otherwise conforming to the CLS without them, confirming my suspicion.
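The test described in that quote could look roughly like the sketch below. It is not the quoted author's code: the real CLS conformance check is not given in this thread, so `conforms` is passed in as a parameter, and the `VALID` set, `toy_conforms` and the sample words are hypothetical stand-ins that only exist to make the example run.

```python
from collections import defaultdict

def conformance_by_first_glyph(words, conforms):
    """Return {first glyph: share of words starting with it that conform}."""
    counts = defaultdict(lambda: [0, 0])  # glyph -> [conforming, total]
    for w in words:
        if not w:
            continue
        counts[w[0]][1] += 1
        if conforms(w):
            counts[w[0]][0] += 1
    return {g: ok / total for g, (ok, total) in counts.items()}

def rescued_by_stripping(words, conforms, prefixes=("l", "r")):
    """Share of non-conforming prefixed words that conform without the prefix."""
    candidates = [w for w in words
                  if len(w) > 1 and w[0] in prefixes and not conforms(w)]
    if not candidates:
        return 0.0
    return sum(conforms(w[1:]) for w in candidates) / len(candidates)

# Hypothetical stand-in for the real CLS check: membership in a toy word set.
VALID = {"daiin", "chedy", "chol", "kaiin", "ol"}

def toy_conforms(word):
    return word in VALID

sample = ["daiin", "chol", "lchedy", "rol", "lkaiin", "lshey"]
rates = conformance_by_first_glyph(sample, toy_conforms)
rescued = rescued_by_stripping(sample, toy_conforms)
print(rates)    # per-glyph conformance on the toy sample
print(rescued)  # share of l-/r- words that conform once the prefix is dropped
```

With a real conformance function and the full transliteration plugged in, the first function would give the per-initial-glyph rates and the second would measure how many of the "l"/"r" words become conforming once the first glyph is dropped.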
Quote:Three aberrant glyphs which only have medium or high conformity to the proposed CLS system.
However, these three aberrant glyphs conform to very specific rules, and seem to be part of specific ngrams that occur due to some as-yet-unidentified, but very specific, reason.
- “o” is aberrant 44.51% of the time, when it appears in the following bigrams: “ol”, “or” and (rarely) “lo”, “ro” (where “ro” could be a confusion for “lo”).
- “l” is aberrant 26.83% of the time, when it appears in the following bigrams: “lo” (see rule 1), “ly”, “ld”. Furthermore, these last two bigrams always appear in the following trigrams: “oly”, “aly”, “old”, “ald”.
- “r” is aberrant 15.76% of the time, when it appears in the following bigrams: “ro” (see rule 1), “ry”, “ra”. These last two bigrams are almost always part of the following trigrams: “ara”, “ora”, “ary”, “ory”.
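The bigram and trigram contexts listed above can be tallied with something like the following sketch. It only counts contexts; deciding whether a glyph is "aberrant" in a given position needs the CLS rules, which are not reproduced here. The function names and the toy word list are assumptions made for illustration, and `^` is used as an arbitrary marker for a word-initial bigram.

```python
from collections import Counter

def bigrams_containing(words, glyph):
    """Count every bigram in `words` that contains `glyph`."""
    counts = Counter()
    for w in words:
        for i in range(len(w) - 1):
            bg = w[i:i + 2]
            if glyph in bg:
                counts[bg] += 1
    return counts

def trigram_contexts(words, bigram):
    """Count the trigrams formed by the preceding glyph plus `bigram`."""
    counts = Counter()
    for w in words:
        start = w.find(bigram)
        while start != -1:
            if start > 0:
                counts[w[start - 1:start + 2]] += 1
            else:
                counts["^" + bigram] += 1  # bigram starts the word
            start = w.find(bigram, start + 1)
    return counts

# Toy word list, not taken from the manuscript transliteration.
toy_words = ["oly", "daly", "qokold", "ald", "ly"]
l_bigrams = bigrams_containing(toy_words, "l")
ly_contexts = trigram_contexts(toy_words, "ly")
ld_contexts = trigram_contexts(toy_words, "ld")
print(l_bigrams)
print(ly_contexts)
print(ld_contexts)
```

Run over the real word list, any key in `trigram_contexts(words, "ly")` other than “oly” and “aly” would be a counterexample to the "always appear in the following trigrams" claim.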
VViews > 18-12-2020, 10:42 AM
Anton > 18-12-2020, 06:23 PM
Emma May Smith > 19-12-2020, 10:32 PM