(Today, 12:04 AM) oshfdk Wrote: My intuition tells me that no natural language would behave in the way the Voynich Manuscripts k words behave, simply because this is not the way language sequences are constructed. No elements of any language, be it words, syllables or phonemes, would produce a near perfect independently uniform distribution of popular prefixes and suffixes for any fixed central sequence.
There are two separate properties here. For each core,
P1. The word types with that core are all possible combinations of prefixes and suffixes.
P2. The prefixes and suffixes are chosen independently when composing the text.
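As a rough illustration of Property 1, the sketch below measures, for each core, what fraction of the prefix-suffix grid is actually attested as a word type. It assumes every word type has already been split into a (prefix, core, suffix) triple. That split, the function name `grid_coverage`, and the toy input are assumptions made for the example, not an established parsing of Voynichese. The grid is built only from the prefixes and suffixes seen with that core, so a value of 1.0 means P1 holds for that core in this narrow sense.

```python
from collections import defaultdict


def grid_coverage(word_types):
    """Return {core: attested fraction of the prefix x suffix grid}.

    word_types is an iterable of (prefix, core, suffix) triples, one per
    distinct word type. The decomposition itself is assumed to be given.
    """
    by_core = defaultdict(set)
    for prefix, core, suffix in word_types:
        by_core[core].add((prefix, suffix))

    coverage = {}
    for core, pairs in by_core.items():
        prefixes = {p for p, _ in pairs}
        suffixes = {s for _, s in pairs}
        # Attested combinations divided by all combinations of the
        # prefixes and suffixes that occur with this core.
        coverage[core] = len(pairs) / (len(prefixes) * len(suffixes))
    return coverage


# Toy input (hypothetical split of a few word types into triples).
toy_types = [
    ("o", "t", "edy"), ("o", "t", "eedy"),
    ("y", "t", "edy"), ("y", "t", "eedy"),
    ("o", "k", "al"), ("o", "k", "ar"), ("y", "k", "ar"),
]

if __name__ == "__main__":
    for core, value in sorted(grid_coverage(toy_types).items()):
        print(core, value)
```

Note that this only looks at which word types exist, not at how often they occur. Frequencies are what Property 2 is about.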
Property 1 could be true, for example, if the text is encrypted with a codebook cipher. In fact, [link], that could also make the distribution of word type lengths nicely fit a binomial function choose(n,k), like that of the VMS (and of Vietnamese...).
Property 1 certainly does not hold for European languages in plaintext (or under simple substitution ciphers). But monosyllabic languages come closer to satisfying it. Typically, the set of potential syllables (those that are allowed by phonetic constraints) has only a few thousand members. On the other hand, the size of any language's vocabulary -- the number of terms whose meaning cannot be deduced from the grammar, but must be explicitly listed in a dictionary -- seems to be something like a "linguistic universal": several tens of thousands.
Monosyllabic languages generally cope with that mismatch by (1) having many homophones - words like English "to", "two", and "too", with very different meanings but the same sound, that can be distinguished by context; (2) making massive use of compounds - combinations of two or more syllables with specific meanings that are only vaguely related to the meanings of the parts, like English "typewriter"; and (3) using a large percentage of the potential syllables.
Joining all possible prefixes (initial consonants), cores (initial glides and main vowels), and suffixes (final glides and final consonants) of Mandarin would give [link]. However, each core has a limited subset of compatible prefixes and suffixes. Taking these restrictions into account reduces the number of potential syllables to fewer than 2900 (not sure about the exact number). Of those, about 1300 (more than 44%) are actually used in Mandarin. That is, Property 1 is much closer to being satisfied by the Mandarin words with a given core than it is by all English words that contain "t".
Property 2 is normally not seen in natural languages, because meanings are assigned to individual prefix-core-suffix combinations; any text will use only a "random" subset of the possible combinations, and the frequencies of those combinations will be "random" too. Even the monosyllabic languages that come close to satisfying Property 1 will usually fail Property 2. That Shennong Bencao file that I posted uses only ~630 of the ~1300 meaningful Mandarin syllables, and surely it does not satisfy either property.
For the same reason, Property 2 is also not expected in a natural language text encoded with a codebook cipher, not even one that satisfies Property 1.
But, in spite of what one may think glancing at the colored tables, Voynichese does not satisfy Property 2 either. The frequencies of word types with a given core are not simply the prefix frequencies times the suffix frequencies. The deviations from independence are smaller than what we see in the SBJ file, but they exist and are significant. For example, in the Starred Parags section, I count
56 otedy    56 oteedy
 2 ytedy    12 yteedy
47 okal     44 okar
 0 ykal      6 ykar
Quote: in the Voynich MS most popular suffixes after some central characters can be chosen independently of the prefix. The prefix doesn't appear to affect which suffixes you can use.
So this claim seems to be false.
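For anyone who wants to check the counts above, here is a minimal sketch of one way to test them against Property 2. It treats each group of four counts as a 2x2 table (rows are the prefixes o and y, columns are the two suffixes), computes the counts expected if prefix and suffix were independent, and runs a two-sided Fisher exact test. The choice of test and the function names are assumptions of this sketch, not necessarily how the significance was judged originally.

```python
from math import comb


def expected_counts(table):
    """Counts expected under independence: row total * column total / N."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[r * k / n for k in cols] for r in rows]


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2x2 table of counts.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one. Weights are
    kept as exact integers until the final division.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    lo = max(0, col1 - row2)
    hi = min(row1, col1)

    def weight(x):
        # Number of ways to get x in the top-left cell given the margins.
        return comb(row1, x) * comb(row2, col1 - x)

    w_obs = weight(a)
    total = comb(row1 + row2, col1)
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= w_obs)
    return extreme / total


# Counts quoted above for the Starred Parags section.
t_table = [[56, 56], [2, 12]]   # otedy, oteedy / ytedy, yteedy
k_table = [[47, 44], [0, 6]]    # okal, okar / ykal, ykar

if __name__ == "__main__":
    for name, table in (("t core", t_table), ("k core", k_table)):
        print(name, expected_counts(table), fisher_exact_two_sided(table))
```

If Property 2 held, the observed counts would be close to the expected ones and the p-values would not be small. Two tables are of course only a spot check, not a test of the whole text.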
Granted, Voynichese seems to be somewhat closer to satisfying Property 2 than Mandarin is. But there are all those possible explanations that I listed for why this may be happening.
All the best,