Rene: what does the last bit about labels mean for the mathematically disabled?

Possible differences between labels and paragraphs might help us understand what's going on.
(25-02-2017, 08:12 AM)ReneZ Wrote: Without enumerating them all, the main one (in my opinion) is the pairing of characters. There are several pairs that can be substituted for each other almost arbitrarily:
ch / Sh
k / t
f / p
l / r
o / qo (at word start)
However, the frequencies of the resulting words are not evenly distributed, so it does not look like something arbitrary.
It seems to me that by expecting evenly distributed frequencies you presuppose a homogeneous text, and that whether, for instance, [l] can be replaced with [r] or [k] with [t] does not depend on the context. The text is not homogeneous, and whether a glyph can be substituted does depend on the context. Therefore it is indeed not possible to say that [k] is always a substitute for [t]. But this observation doesn't exclude statements like: [kch] is a possible substitute for [tch].
1) The text is not homogeneous. If we compare the word frequencies for Quires 1, 2 and 3 (all Herbal and Currier A), we can see nonuniform distributions. For instance, the word [sho] is the 7th most frequent word in Quire 1 and the 9th most frequent word in Quire 3, but only the 37th most frequent word in Quire 2.
The 20 most frequent word types in Quire 1:
daiin (56), chol (50), chor (47), shol (31), cthy (24),
s (23), sho (21), dain (18), chy (16), dar (14),
dy (14), shor (14), cthol (14), dol (13), or (12),
y (11), shey (11), chaiin (11), chey (10), dal ( 9)
The 20 most frequent word types in Quire 2:
daiin (61), dy (34), chol (28), cthy (27), chor (20),
chy (16), shor (14), s (13), dain (13), cthor (12),
y (11), qotchy (9), shol ( 9), kchy ( 9), otchy ( 9),
dor ( 8), qoty (8), dchy ( 8), shy ( 8), chody ( 7)
The 20 most frequent word types in Quire 3:
daiin (70), chol (37), chor (30), dar (22), s (21),
cthy (17), ol (17), or (17), sho (16), dy (15),
shy (15), y (14), shol (14), qokchy (14), dal (12),
qotchy (12), chy (11), otchy (10), dor ( 9), cthor ( 9)
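The rank comparison for [sho] in point 1 can be reproduced from the three lists above. This is only a minimal Python sketch: it assumes the lists are ordered by frequency exactly as posted, and the names `TOP20` and `rank_of` are made up for illustration. Since only the top 20 of each quire are listed, the sketch can confirm that [sho] is missing from the Quire 2 top 20, but not its exact rank of 37.

```python
# Top-20 word types per quire, in the order given in the post above.
TOP20 = {
    "Quire 1": ("daiin chol chor shol cthy s sho dain chy dar "
                "dy shor cthol dol or y shey chaiin chey dal").split(),
    "Quire 2": ("daiin dy chol cthy chor chy shor s dain cthor "
                "y qotchy shol kchy otchy dor qoty dchy shy chody").split(),
    "Quire 3": ("daiin chol chor dar s cthy ol or sho dy "
                "shy y shol qokchy dal qotchy chy otchy dor cthor").split(),
}


def rank_of(word, ordered_words):
    """Return the 1-based position of word in the list, or None if absent."""
    try:
        return ordered_words.index(word) + 1
    except ValueError:
        return None


for quire, words in TOP20.items():
    print(quire, "sho ->", rank_of("sho", words))
```

Ties (for example the three words with 17 occurrences in Quire 3) are ranked here in the order in which the post lists them.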
2) Whether a glyph is replaceable depends on the context. If we order the words by their similarity, we can see that the frequencies are related for only some similar words. For instance, in all three quires the words ending in [-ol] and [-or] are more frequent than words ending in [-al] and [-ar], with [dal] and [dar] as exceptions. Also, the frequencies for the words with a group [tch] match the frequencies for their [kch] counterparts, even though such groups are more common in Quires 2 and 3 than in Quire 1. On the other hand, this is not true for words with [cth] or [ckh], since words with [ckh] are rare. If you now compare the statistics for [l]/[r] and [t]/[k] for the whole manuscript, they also include the statistics for the words that don't match your expectations, like [dal], [dar] and words with [cth] vs. words with [ckh].
The 20 most frequent word types in Quire 1 as grid including similar word types:
aiin ( 3) daiin (56) chaiin (11) shaiin ( 4) cthaiin ( 4) ckhaiin ( 1)
ain (--) dain (18) chain (--) shain ( 1) cthain (--) ckhain (--)
ar ( 3) dar (14) char ( 8) shar ( 3) cthar ( 7) ckhar ( 1)
al ( 3) dal ( 9) chal ( 4) shal (--) cthal ( 2) ckhal ( 1)
ol ( 9) dol (13) chol (50) shol (31) cthol (14) ckhol (--)
or (12) dor ( 7) chor (47) shor (14) cthor ( 6) ckhor (--)
y (11) dy (14) chy (16) shy ( 6) cthy (24) ckhy ( 8)
o ( 3) do ( 7) cho ( 7) sho (21) ctho ( 1) ckho ( 1)
ey (--) dey ( 1) chey (10) shey (11)
eey ( 1) deey (--) cheey ( 3) sheey ( 7)
tchol (--) otchol ( 5) qotchol ( 1) qokchol (--) okchol ( 1) kchol ( 1)
tchor ( 4) otchor ( 5) qotchor (--) qokchor ( 1) okchor ( 2) kchor ( 3)
tchy ( 2) otchy ( 4) qotchy ( 3) qokchy ( 3) okchy ( 1) kchy ( 4)
tcho (--) otcho ( 1) qotcho ( 3) qokcho ( 1) okcho (--) kcho ( 1)
The same word grid for Quire 2:
aiin ( 5) daiin (61) chaiin ( 5) shaiin ( 2) cthaiin ( 2) ckhaiin ( 1)
ain (--) dain (13) chain (--) shain (--) cthain (--) ckhain (--)
ar ( 1) dar ( 5) char ( 1) shar ( 1) cthar ( 1) ckhar (--)
al (--) dal ( 3) chal (--) shal (--) cthal (--) ckhal (--)
ol ( 2) dol ( 3) chol (28) shol ( 9) cthol ( 2) ckhol (--)
or ( 5) dor ( 8) chor (20) shor (14) cthor (12) ckhor (--)
y (11) dy (34) chy (16) shy ( 8) cthy (27) ckhy ( 3)
o ( 1) do (--) cho ( 1) sho ( 5) ctho ( 2) ckho (--)
ey (--) dey (--) chey ( 4) shey ( 6)
eey (--) deey (--) cheey (--) sheey (--)
tchol ( 2) otchol ( 3) qotchol ( 2) qokchol ( 2) okchol ( 1) kchol ( 4)
tchor ( 2) otchor ( 5) qotchor ( 5) qokchor ( 3) okchor ( 4) kchor ( 2)
tchy ( 3) otchy ( 9) qotchy ( 9) qokchy ( 6) okchy ( 5) kchy ( 9)
tcho (--) otcho ( 2) qotcho ( 1) qokcho ( 1) okcho (--) kcho (--)
The same word grid for Quire 3:
aiin ( 2) daiin (70) chaiin ( 1) shaiin ( 3) cthaiin ( 2) ckhaiin (--)
ain (--) dain ( 6) chain (--) shain ( 2) cthain ( 1) ckhain (--)
ar ( 6) dar (22) char ( 5) shar ( 1) cthar ( 5) ckhar (--)
al (--) dal (12) chal ( 2) shal (--) cthal ( 3) ckhal (--)
ol (17) dol ( 7) chol (37) shol (14) cthol ( 9) ckhol ( 1)
or (17) dor ( 9) chor (30) shor ( 6) cthor ( 9) ckhor ( 1)
y (14) dy (15) chy (11) shy (15) cthy (17) ckhy ( 3)
o ( 3) do ( 1) cho ( 6) sho (16) ctho ( 2) ckho (--)
ey (--) dey (--) chey ( 8) shey ( 2)
eey (--) deey ( 1) cheey ( 4) sheey ( 2)
tchol ( 3) otchol ( 5) qotchol ( 4) qokchol ( 8) okchol ( 1) kchol ( 5)
tchor ( 2) otchor ( 1) qotchor ( 1) qokchor ( 2) okchor ( 3) kchor ( 4)
tchy ( 5) otchy (10) qotchy (12) qokchy (14) okchy ( 4) kchy ( 4)
tcho ( 1) otcho (--) qotcho (--) qokcho (--) okcho ( 1) kcho (--)
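The [tch] versus [kch] comparison from point 2 can be checked directly against the last four rows of each grid. The following Python sketch simply sums those rows per quire. The counts are copied from the grids with "--" entered as 0, and the structure `GRIDS` and the function `group_totals` are illustrative names, not part of any existing tool.

```python
# Last four rows of each grid. Keys are the word endings; each tuple holds
# the counts for the plain, o- and qo- prefixed forms ("--" entered as 0).
GRIDS = {
    "Quire 1": {
        "tch": {"ol": (0, 5, 1), "or": (4, 5, 0), "y": (2, 4, 3), "o": (0, 1, 3)},
        "kch": {"ol": (1, 1, 0), "or": (3, 2, 1), "y": (4, 1, 3), "o": (1, 0, 1)},
    },
    "Quire 2": {
        "tch": {"ol": (2, 3, 2), "or": (2, 5, 5), "y": (3, 9, 9), "o": (0, 2, 1)},
        "kch": {"ol": (4, 1, 2), "or": (2, 4, 3), "y": (9, 5, 6), "o": (0, 0, 1)},
    },
    "Quire 3": {
        "tch": {"ol": (3, 5, 4), "or": (2, 1, 1), "y": (5, 10, 12), "o": (1, 0, 0)},
        "kch": {"ol": (5, 1, 8), "or": (4, 3, 2), "y": (4, 4, 14), "o": (0, 1, 0)},
    },
}


def group_totals(grid):
    """Sum all counts for the [tch] group and for the [kch] group."""
    return {group: sum(sum(row) for row in rows.values())
            for group, rows in grid.items()}


totals = {quire: group_totals(grid) for quire, grid in GRIDS.items()}
for quire, result in totals.items():
    print(quire, result)
```

With the counts as posted, the totals are 28 [tch] against 18 [kch] in Quire 1, 43 against 37 in Quire 2, and 44 against 46 in Quire 3.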
(25-02-2017, 08:12 AM)ReneZ Wrote: For one, there are several underlying assumptions.
One is, that the text has to be read from left to right. This actually makes essentially no difference. (And as a matter of fact, entropy is independent of the reading direction).
Another is, that the word spaces are real. This is a more complicated question, but if one were to assume that the word spaces are not real, one runs into many other problems.
I did not think that either of these were assumptions. Although we could be wrong, there is evidence that both of these are true: the text runs left to right and spaces are real.
(I actually no longer believe that spaces are as real as we would consider them today. They may separate morphemes but not whole words. That is, grammatical words may span two or more textual words, with the spaces representing a meaningful break to the writer, such as morpheme boundaries. The spaces are thus still important.)
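The quoted remark that entropy is independent of the reading direction can be illustrated with a small Python sketch. It computes the Shannon entropy of the n-gram distribution for a toy sample and for the same sample reversed. The sample string is just a few frequent Quire 1 words strung together, and `block_entropy` is an illustrative helper, not a standard function.

```python
from collections import Counter
from math import log2


def block_entropy(text, n):
    """Shannon entropy in bits of the empirical n-gram distribution of text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    counts = Counter(grams)
    return -sum(c / total * log2(c / total) for c in counts.values())


# Toy sample only: frequent word types from Quire 1, not a real transcription.
sample = "daiin chol chor shol cthy s sho dain chy dar dy shor cthol dol or"

for n in (1, 2, 3):
    forward = block_entropy(sample, n)
    backward = block_entropy(sample[::-1], n)
    print(n, round(forward, 6), round(backward, 6))
```

Reversing the text turns every n-gram into its mirror image, so the n-gram counts, and hence the block entropies, are the same in both directions up to floating-point rounding. Conditional entropies estimated as differences of block entropies therefore agree as well.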
(24-02-2017, 08:09 PM)Emma May Smith Wrote: (24-02-2017, 02:32 PM)ReneZ Wrote: The character y is interesting in yet another way.
It is very frequent, and it seems to behave almost normally.
However, when one makes a list of the vocabulary of Voynichese words, one finds that more than one third of all words end with a y.
I just cannot see how any natural language would behave in this way.
In some languages all or almost all words end in a vowel, and the Voynich text may only have two clear vowels in o and y.
The question whether the frequent appearance of y at ends of words (about one third of all words) is abnormal clearly deserves a closer look.
Pending that, I should tone down the importance of this observation.
(24-02-2017, 08:09 PM)Emma May Smith Wrote: (24-02-2017, 04:04 PM)ReneZ Wrote: Here's an example of a herbal page with labels:
The labels, consisting of three Greek letters, are in fact numbers.
The labels we see in the Voynich MS could be verbosely encoded numbers.
Greek letters were regularly used as numbers (as you know). But this is not verbose encoding, rather it is a separate use of the same characters, switching from phonemic to morphographic.
Indeed, and this is also used in other writing systems.
The point I tried to make is that these labels in clm 337 are all one, two or three characters long.
A verbose encoding of these would result in a lower entropy and longer words, and the Voynich MS labels could in theory be such an encoding.
Since there are (I believe) fewer than 13,824 different words in the MS, and 13,824 is 24 * 24 * 24, every Voynich word can in principle be represented by a triplet of Greek characters. Or Latin, of course.
Only a small subset of these also represent a valid number. (Leaving aside the point of the three 'lost' letters).
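The counting argument (24 * 24 * 24 = 13,824 possible triplets) can be made concrete with a short Python sketch that maps a word index to a triplet of Greek letters and back. It only illustrates the arithmetic: numbering the Voynich word types from 0 upward is a hypothetical choice, and `to_triplet` and `from_triplet` are illustrative names.

```python
GREEK = "αβγδεζηθικλμνξοπρστυφχψω"  # the 24 letters of the Greek alphabet
BASE = len(GREEK)


def to_triplet(index):
    """Write a word index (0 .. 13,823) as three Greek letters, base 24."""
    if not 0 <= index < BASE ** 3:
        raise ValueError("index does not fit into three letters")
    first, rest = divmod(index, BASE * BASE)
    second, third = divmod(rest, BASE)
    return GREEK[first] + GREEK[second] + GREEK[third]


def from_triplet(triplet):
    """Recover the word index from a three-letter triplet."""
    first, second, third = (GREEK.index(ch) for ch in triplet)
    return (first * BASE + second) * BASE + third


print(BASE ** 3)          # number of distinct triplets
print(to_triplet(0))      # first triplet
print(to_triplet(13823))  # last triplet
```

Any vocabulary of fewer than 13,824 word types fits into this scheme, which is all the argument requires. Which of these triplets would also read as a valid Greek number is a separate question and is not modelled here.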
Even before I started seriously looking at the VMS text, I had a "same but different" feeling about the VMS labels.
Since many old languages use letters for numbers (the difference being indicated by context or with a line or dot), it's entirely possible that this kind of same-but-different characteristic (the same glyphs in the same general formats but not necessarily offering an obvious interpretation when compared with other text) might come about if different parts of the same manuscript used a system like Hebrew, Greek, or other languages in which the letter-forms do double duty as letters or numbers.
I'm not saying this is how the VMS is constructed but there is a clear precedent that could create this kind of pattern.
(05-03-2017, 09:47 AM)ReneZ Wrote: Indeed, and this is also used in other writing systems.
The point I tried to make is that these labels in clm 337 are all one, two or three characters long.
A verbose encoding of these would result in a lower entropy and longer words, and the Voynich MS labels could in theory be such an encoding.
Since there are (I believe) fewer than 13,824 different words in the MS, and 13,824 is 24 * 24 * 24, every Voynich word can in principle be represented by a triplet of Greek characters. Or Latin, of course.
Only a small subset of these also represent a valid number. (Leaving aside the point of the three 'lost' letters).
I'm open to the possibility that the Voynich text is written in a code. I find it a much better solution than a cipher, to be truthful.