So here's some insight. I had an idea that the four bigrams ar, or, al and ol may stand for low numbers such as 1, 2, 3, 4, or for 1 together with the lowest primes, such as 1, 2, 3, 5; in any case, for some numbers under 10. My reasons were more in the nature of a guess. In particular, there is the aror in f116v, for which being a quantifier is one of the strongest possibilities, and there is also the aralarar in the center of the diagram of f57v. Within the set of the four central vords, this one clearly stands out in terms of morphology, which is why I thought it might be something numeric. I expect basic numbers to be represented by short vords, which is why I think bigrams are appropriate for that purpose.
As for composites such as arar or oral, whether they would also stand for numbers and, if so, by what principle they are generated, is a tricky question, and I put it aside for now. The interested reader will find some additional considerations in that direction under the spoiler.
Basically, there are three possibilities:
a) positional, e.g. if ar = 1 and or = 3, then aror = 13;
b) arithmetic, e.g. if ol = 2 and or = 3, then olor = either 2+3 = 5 or 2x3 = 6;
c) cipher/nomenclator-driven, e.g. alor = 348 notwithstanding what al and or are - just because some underlying algorithm works that way or the author wishes so.
Option b) with multiplication looks unlikely: if one of the four bigrams stood for 1, we would not expect to see it concatenated with itself. E.g. if ar = 1, then arar = 1x1 = 1 would add nothing. However, all four self-concatenations do exist.
Option b) with addition is less unlikely, and it could be implemented in various ways. E.g. if ar = 1, or = 2, al = 3, ol = 4, then or and arar both mean 2; al, aror and orar all mean 3; and so on. However, it is not this ambiguity that makes the option unlikely, but rather the huge frequency gap between the bigrams and their quad-gram concatenations. The most frequent concatenation is olor with a count of 31, far behind even the rarest of the four bigrams, al, with its count of 260.
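To make the ambiguity of the additive reading concrete, here is a minimal sketch that groups the single bigrams and all two-bigram concatenations by their sum. The values ar = 1, or = 2, al = 3, ol = 4 are just the example assignment used above, not a proposed decipherment, and the function name is made up for illustration.

```python
from collections import defaultdict
from itertools import product

# Example assignment from the post; purely illustrative, not a decipherment.
VALUES = {"ar": 1, "or": 2, "al": 3, "ol": 4}


def additive_readings(values):
    """Group single bigrams and two-bigram concatenations by their sum."""
    by_sum = defaultdict(list)
    # Single bigrams read as their own value.
    for bigram, value in values.items():
        by_sum[value].append(bigram)
    # Concatenations read as the sum of their two parts.
    for first, second in product(values, repeat=2):
        by_sum[values[first] + values[second]].append(first + second)
    return dict(sorted(by_sum.items()))


readings = additive_readings(VALUES)
for total, vords in readings.items():
    print(total, vords)
```

Under this assignment every number from 2 to 7 has more than one spelling, and only 1 and 8 are unambiguous.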
So perhaps there are other simple vords for low numbers, even if these four stand for numbers indeed.
Positional option a) looks OK to me in general, but those would be two-digit numbers in the decimal system, and it might be difficult to explain why they would have counts as high as 31, as olor does. Numbers such as 12 or 24 can well be mentioned more or less frequently, but then it is not altogether clear why their reverses, 21 or 42 (orol), would be mentioned 15 times, and, say, 11 (olol) 18 times. The positional combinations of the digits 1 to 4 also produce numbers that are really not expected at all, something like 43 or 23, and yet all possible 4-grams do occur, and the lowest frequency count is as high as 5 (for oror and alal).
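The positional reading can be sketched the same way. In the snippet below, ol = 1 and or = 2 follow the 12/21/11 example above, while ar = 3 and al = 4 are arbitrary fillers I chose only to complete the mapping. The counts are the ones quoted in this post; the other concatenations are left without a count rather than guessed.

```python
from itertools import product


def positional_readings(digits):
    """Read every two-bigram concatenation as a two-digit decimal number."""
    return {
        first + second: 10 * digits[first] + digits[second]
        for first, second in product(digits, repeat=2)
    }


# Hypothetical assignment: ol = 1, or = 2 as in the 12/21/11 example;
# ar = 3 and al = 4 are arbitrary placeholders.
DIGITS = {"ol": 1, "or": 2, "ar": 3, "al": 4}

# Counts quoted in the post; the remaining 4-grams are not listed there.
QUOTED_COUNTS = {"olor": 31, "olol": 18, "orol": 15, "oror": 5, "alal": 5}

readings = positional_readings(DIGITS)
for vord, number in sorted(readings.items(), key=lambda item: item[1]):
    print(vord, number, QUOTED_COUNTS.get(vord, "count not quoted"))
```

Swapping in a different digit assignment only requires changing DIGITS; the 16 resulting numbers always include unexpected ones such as 43 or 23.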
As a side note, looking (in f57v) at the row of four supposed digits in succession, and assuming that digit concatenation may imply positional notation, I naturally thought that it might stand for a year, perhaps the then-current year. Three identical digits immediately suggest the year 1411 (meaning ar = 1, al = 4), in line with the MS dating. Interestingly, in [link not available] one may also imagine a disguised "1411". However, the idea that, among the four bigrams, ar stands for 1 is not supported by the frequency counts. As we discovered in the numerals frequency thread, one is the most frequent numeral, and for this role we would expect ol with its prevailing count of 538; or (366), ar (352) and al (260) follow.
This is where I moved on to the idea of concatenation, taking some form of German plain text as the working hypothesis (this may also work for any other language where "one" and the indefinite article are the same word). If this principle of concatenation is valid, we would expect quite a number of occurrences of ol- as a prefix and -ol as a suffix, because in German that is the case for the trigram "ein". And we do find them, but the question is the behaviour pattern. I compared the behaviour of ol with the behaviour of "ein" in the medieval German cookbook referenced by Koen. The results were as follows.
In the cookbook, "ein" occurs in 6,0% of words, while ol in the Voynich occurs in 14,7% of vords (if we take Q20 only, that drops to 10,4%). Distinct matches (the whole word) are 3,2% for "ein" versus 1,4% for ol. 2,2% of words in the cookbook begin with "ein-" (distinct matches excluded here and below), while 2,8% of vords begin with ol-. So far so good, it's more or less on par. But words ending with "-ein" are very few in number, only 0,2%, while in the VMS a huge 8,1% of vords end with -ol.
But if we look, in the cookbook, at words ending with "-en" instead, they make up 7,6% of all words! Words starting with "en-" are, on the contrary, very rare: only 0,1%. In other words, words rarely start with "en-" and often end with "-en". It is the opposite with "ein": words start with "ein-" much more often than they end with "-ein". Accordingly, if we combine "ein" and "en", then:
- words starting either with "ein-" or "en-" are 2,3% in total
- words ending either with "-ein" or "-en" are 7,8% in total
This is on par with the behaviour of ol as prefix (2,8%) and suffix (8,1%).
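For anyone who wants to repeat this kind of comparison on their own transcription or text, here is a minimal sketch of the counting. It assumes percentages are taken over running words (tokens) rather than distinct word forms, which is my guess at the simplest reading of the figures above. The function name and the toy word list are made up for illustration; they are not the cookbook data.

```python
def affix_profile(words, tokens):
    """Percentages of words that contain, equal, start with, or end with
    any of the given tokens. Prefix and suffix counts exclude distinct
    (whole-word) matches, as in the figures quoted in the post."""
    if not words:
        raise ValueError("empty word list")
    if isinstance(tokens, str):
        tokens = (tokens,)
    tokens = tuple(tokens)

    def pct(count):
        return round(100 * count / len(words), 1)

    return {
        # Each word is counted once even if it contains several tokens.
        "contains": pct(sum(any(t in w for t in tokens) for w in words)),
        "distinct": pct(sum(w in tokens for w in words)),
        # str.startswith / str.endswith accept a tuple of alternatives.
        "prefix": pct(sum(w.startswith(tokens) and w not in tokens for w in words)),
        "suffix": pct(sum(w.endswith(tokens) and w not in tokens for w in words)),
    }


# Toy word list, invented for the example only.
TOY = ["ein", "einem", "klein", "essen", "und", "salz", "ein", "wol", "machen", "daz"]

ein_only = affix_profile(TOY, "ein")
ein_or_en = affix_profile(TOY, ("ein", "en"))
print(ein_only)
print(ein_or_en)
```

Passing several tokens at once counts each word only once, so the combined figure never double counts a word that matches both "ein" and "en".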
Interestingly, "en" occurs in 14,7% of words in the cookbook (a nice coincidence with the respective count for ol), and if we combine that with "ein", that would be 14,7 + 6,0 = 20,7% of words, or rather somewhat lower than that, because there are words containing both "en" and "ein" at the same time. Even if as high as 20,7%, this is closer to the 14,7% of ol than the 6,0% of pure "ein" is. As for topping up the distinct matches, "en" contributes nothing: it has no distinct matches.
Hence my idea of a dialect which does not distinguish between "en" and "ein" - at least in prefixes and suffixes.
... One may wonder: if concatenation holds true and ol = ein/en (or one, etc.), then what would olol be? "Einein" or "oneone"? It makes even less sense for the other quad-grams. At least in developed languages we have no words like "twotwo" or "fourthree". This suggests that concatenation as a vord-forming mechanism is not universal, and that there are rules, or maybe markers, according to which it is or is not triggered.
You see that all this is but a handful of raw ideas which require further development and verification, for which I haven't time at the moment. Normally they would not be published, but since you asked, here you are...