01-04-2021, 08:28 AM
01-04-2021, 12:17 PM
It's been a pun on my part, for sure 

10-04-2021, 10:40 PM
Here's an idea I had that may incorporate elements of Anton's analysis of the predictable repetitive occurrence of line-initial characters in a particular order, Rene's analysis of entropy per character within each vord, a linked post about "scinderatio fonorum", a linked post about Alberti's techniques, and several linked comments arguing that we must address problematic issues of the internal structure of individual vords as well as of the order of the vords and lines themselves:
Rene's analysis points out that the 1st and 2nd characters or units of Voynich vords have less entropy (that is, are more predictable and contain less information) than European natural languages, but from the 3rd character or unit onwards Voynich vords actually have more entropy (are less predictable and contain more information) than European natural languages. (It is important to note that this analysis used the Cuva transcription rather than EVA. Cuva treats certain well-known sequences such as EVA [aiin] as single units.)
So here's my idea: What if the Voynich ms text employed a 2-character or 2-unit shift in every vord, in the following way:
The text was encrypted by shifting the last 2 characters or units of each vord forward and attaching them to the beginning of the following vord.
To decipher this encryption, shift the first 2 characters or units of each vord backward and attach them to the end of the preceding vord.
Important: The original last 2 characters or units of the last vord of each line could have been simply deleted, and most strikingly, the first 2 characters or units of the first vord of each line could simply be any made-up filler or null characters!
This would actually explain a lot of the suspiciously predictable patterns of line-initial characters, such as Anton's discovery presented in this thread of the predictable repetitive occurrence of line-initial characters in a particular order. If these are all just filler characters with null values that the author was free to choose as he saw fit without regard to their meaning, then it would explain why he would have been able to always arrange the (meaningless) line-initial characters in the same predictable order within each paragraph.
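For anyone who wants to experiment, the shift described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the unit list in `UNIT_RE` is a rough stand-in for Cuva rather than the real transcription, the function names are made up for this post, and `qo` is used as an arbitrary line-initial filler.

```python
import re

# Assumed unit inventory (a rough stand-in for Cuva, not the real thing):
# a few multi-glyph sequences count as one unit, everything else is one glyph.
UNIT_RE = re.compile(r"aiin|ckh|cth|cph|cfh|ch|sh|.")

def units(vord):
    """Split a vord into units according to UNIT_RE."""
    return UNIT_RE.findall(vord)

def shift_forward(vords, n=2, filler=("q", "o")):
    """Encrypt one line (n must be at least 1): move the last n units of
    each vord to the front of the following vord. The first vord gets the
    filler units, and the last n units of the last vord are dropped."""
    out, carry = [], list(filler)
    for vord in vords:
        u = units(vord)
        out.append("".join(carry + u[:-n]))
        carry = u[-n:]
    return out

def shift_backward(vords, n=2):
    """Decipher one line: move the first n units of each vord to the end
    of the preceding vord. The first n units of the line are discarded as
    nulls; vords left empty are dropped."""
    toks = [units(v) for v in vords]
    out = []
    for i, u in enumerate(toks):
        tail = toks[i + 1][:n] if i + 1 < len(toks) else []
        out.append("".join(u[n:] + tail))
    return [v for v in out if v]

line = ["qokeedy", "qotedy", "qokeedy", "qokeedy", "qokeey"]
print(shift_backward(line))
```

Note that the two functions are not exact inverses in general: vords shorter than `n` units, and units that merge or split when vords are rejoined, can break the round trip.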
Geoffrey
11-04-2021, 12:28 PM
Hmm... The idea sounds interesting, but I doubt that it would substantially change the second-order character entropy, because this kind of transformation essentially just shifts spaces to other positions and nothing more. Furthermore, supposing that line-initial characters are fillers does not explain why the writer would generate fillers in sorted order always for the first two lines, almost always for the first three lines, and far less often for the subsequent lines.
I have an idea that what we see as lines in the VMS (at least in the Herbal section) may be half-lines of the original text instead, which were subsequently interleaved.
All in all, I would not advise drawing broad conclusions yet from the observations I described above; what I looked at are just k-initial paragraphs, which are a minority of all paragraphs (although a substantial one), and only in the first six quires, which are of course a minority of the whole text.
11-04-2021, 12:53 PM
Here are a few examples of what the original text could have looked like, before its encryption by shifting the last 2 characters or units (using Cuva, not EVA) of each vord forward and attaching them to the beginning of the following vord to create the actual Voynich ms text. In other words, I created the text below by shifting the first 2 characters or units of each vord backward and attaching them to the end of the preceding vord. Importantly, I also deleted the first 2 characters or units of each line, which could have been null filler characters according to this idea. 
f1v, 1st para:
[sycha daiinol ol tcheycha rcfha ram..]
[eeaycha ror och ydch olk odyok odarcho dy..]
[ckhy ckho ckhy shy dk sheeycthy ko tchodyda l..]
[lcho keodair da mso cheycho kody..]
f20r, 1st para:
[chodycho pyche eyqo tcholqo toeeydch or choiin..]
[deycthe ycho tolod aiirqo tchyctho dycho dchy..]
[teeycho cho daiinsho qo chyche ytch eodalda ral..]
[olol te eyot olchey..]
f77r, last para:
[lshe dyqo eedyqo kaiinchcph eyqo llt aiinshe dyqo l..]
[aiinche eyshe ckheylsh eeyqy kaiinshe edylaiin ..]
[ol ee edyok aiinshe eolqo taiinsho dyqo ty..]
[sh qo lcheyqo tedyqo talrain chl rar ol..]
[keedyqo tedyqo keedyqo keedyqo keeyraiin al ..]
[eeol che edyqo taiinqo teedyqo tedyraiin ..]
[eyqe pchedyqo lche edyqo kearche eylo lydy ..]
[kaiinche slch earda lchr sl s ain ol raiin lo d..]
[chl chpsheeyta lche olda mar oteydaiin ..]
[eeyda lsaiir olsa lda lotain da ryda lo..]
[darol oky..]
I chose the last passage because it contains the famously repetitive [qokeedy qotedy qokeedy qokeedy qokeey ...] line. It is still repetitive, but now as [keedyqo tedyqo keedyqo keedyqo ...].
Note: I treated [ee] as 2 separate units for this exercise. Perhaps they should be treated as a single unit.
I deliberately avoided imposing my own ideas (well, that I took from Koen and ran with) about verbose cipher units such as [od], [ok], [ot] on this analysis. I stuck to the traditional interpretation of [o] as its own single unit everywhere.
Observations:
We find a much greater variety of vord-initial and vord-final characters and sequences in the text above! This looks much more like a European natural language pattern, at least in this respect.
However, this transformation produces a different, less blatant but still present restriction on positional character/unit occurrence: the restriction has shifted to the antepenultimate (third-to-last) character of each vord! If a vord still has 3 or more characters/units after this transformation, its antepenultimate character is in the great majority of cases [y], [n], [l], or [r]. I have never heard of anyone studying the distribution of antepenultimate letters of words in any natural language, but I doubt it could be as restricted as this. For example, German "bleiben", whose antepenultimate letter is "b", could hardly be written if antepenultimate letters were restricted to such a small set of very frequent letters.
Still, a big advantage of this approach is that [aiin], for example, can now occur at any position in the vord: initial, second, medial, or final. In this respect it now behaves much more like an actual unit in an actual natural language. Likewise, [e], [o], and [a] can now occur at any position in the vord: initial or final as well as second or other medial positions. Again this seems much more like actual natural language.
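The antepenultimate restriction noted above is easy to tally rather than eyeball. Below is a minimal sketch, again assuming a rough stand-in unit list (`UNIT_RE`) rather than real Cuva; the function name is made up for illustration. With this unit list a vord ending in [aiin] plus two more units is counted under `aiin`, not under [n].

```python
import re
from collections import Counter

# Assumed unit inventory (a rough stand-in for Cuva, not the real thing).
UNIT_RE = re.compile(r"aiin|ckh|cth|cph|cfh|ch|sh|.")

def antepenultimate_counts(vords):
    """Count the third-to-last unit of every vord that has at least 3 units."""
    counts = Counter()
    for vord in vords:
        u = UNIT_RE.findall(vord)
        if len(u) >= 3:
            counts[u[-3]] += 1
    return counts

# One transformed line from the f77r passage above.
sample = "keedyqo tedyqo keedyqo keedyqo keeyraiin al".split()
print(antepenultimate_counts(sample).most_common())
```

Running the same count over a plain-text word list of a natural language would give the comparison that seems to be missing so far.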
11-04-2021, 06:20 PM
Continuing the k-initial analysis: through quires 1-8, out of 234 paragraphs 39 begin with k (17%). Of these 39, 33 paragraphs (or 85%) sustain the "sorted" order of line-initial glyphs for the first three lines. The proposed order is: k t y d ch o l q s sh
EDIT: Taking two more quires into account, it's 39 of 242 = 16% of paragraphs (Q9 and Q10 have no k-initial paragraphs).
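A check of this kind could be automated roughly as follows. This is a sketch under assumptions, not the procedure actually used for the counts above: "sorted" is taken to mean that the line-initial glyphs never step backward in the proposed order, repeats are allowed, and a glyph outside the list counts as a mismatch. The function names and the toy input are invented for illustration.

```python
# Proposed order of line-initial glyphs, as quoted in the post above.
ORDER = ["k", "t", "y", "d", "ch", "o", "l", "q", "s", "sh"]

def follows_order(initials, order=ORDER):
    """True if the glyphs never step backward in `order`.
    Assumption: repeats are allowed; glyphs outside `order` break it."""
    ranks = []
    for glyph in initials:
        if glyph not in order:
            return False
        ranks.append(order.index(glyph))
    return all(a <= b for a, b in zip(ranks, ranks[1:]))

def share_in_order(paragraphs, lines=3, order=ORDER):
    """Fraction of paragraphs whose first `lines` line-initial glyphs
    follow the order."""
    hits = sum(follows_order(p[:lines], order) for p in paragraphs)
    return hits / len(paragraphs)

# Made-up toy input (not manuscript data): one list of line-initial
# glyphs per paragraph.
toy = [["k", "d", "o", "y"], ["k", "o", "d"], ["k", "y", "q"]]
print(share_in_order(toy))
```

Passing a different `order` list makes it easy to compare alternative orderings on the same paragraphs.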
11-05-2021, 06:57 AM
Hello Anton,
This is a very relevant observation!!
Have you investigated any further?
These vertical patterns, and their absence, may unfold into many interesting conjectures.
Quick edit: Ps: I am not new here, I just never write anything given the lack of anything constructive to add.
11-05-2021, 07:35 PM
(11-05-2021, 06:57 AM)Lordadef Wrote: Have you investigated any further?
Yes, I have... I checked all k-initial paragraphs in all quires except Q20 (which I will investigate shortly). I also have some checks to do for the cases where the supposed order does not hold. There clearly are such cases that can't be written off as a scribal error or a mistaken paragraph border, but I have an idea that they might be correlated with some other stuff happening in the line...
15-05-2021, 08:22 PM
In the meantime I can share the following chart.
[attachment=5534]
This shows the percentage of k-initial paragraphs per quire, with some additional breakdowns by sections and Currier languages below. The Currier language information is taken from [link].
The counts for the chart were done manually against the available scans.
Overall, there are 77 k-initial paragraphs out of a total of 750 paragraphs, or 10.3% on average. On the other hand, according to the VQP data, k-initial vords (1155) represent only ~3% of the approx. 38k vords in the VMS. In other words, paragraph-initial vords begin with k much more often than the average vord does.
As can be seen from the chart, there is no strong correlation between the percentage of k-initial paragraphs and the thematic section (herbal, astro etc.) or the Currier language.
16-05-2021, 01:17 AM
According to the same VQP data, there are 116 lines starting with k. This means that 66.4% of those fall onto paragraph starts.
Following that, here's some quick data that I can share right now.
Below is the breakdown of line-initial characters in k-initial paragraphs:
k (paragraph-initial included) - 80
o - 79
d - 70
q - 60
y - 58
s - 32
sh - 29
t - 26
ch - 22
p - 7
l - 4
r - 1
and one further occurrence falls on the strange character that looks like the English letter "e" (see [link], p4).
The sum of the above is 469, so I am still missing one of the 470 total lines somewhere, but I'm too sleepy to look for it.
However, if we consider only the first three lines of paragraphs, then the breakdown is as follows:
k (paragraph-initial included) - 78
d - 38
o - 30
y - 26
q - 23
t - 9
ch - 8
sh - 7
s - 6
l - 2
r - 0
p - 0
In other words, the second and the third line begin mostly with one of four characters - d, o, y and q (78% of 2nd and 3rd lines vs 68% of all lines except the 1st).
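The two percentages can be reproduced from the lists above. The snippet below is just a worked check; it assumes that each of the 77 k-initial paragraphs contributes exactly one first line, and it uses the stated total of 470 lines rather than the 469 that the list adds up to.

```python
# Line-initial counts in k-initial paragraphs, copied from the two lists above.
all_lines = {"k": 80, "o": 79, "d": 70, "q": 60, "y": 58, "s": 32,
             "sh": 29, "t": 26, "ch": 22, "p": 7, "l": 4, "r": 1}
first_three = {"k": 78, "d": 38, "o": 30, "y": 26, "q": 23, "t": 9,
               "ch": 8, "sh": 7, "s": 6, "l": 2, "r": 0, "p": 0}

FIRST_LINES = 77    # assumption: one first line per k-initial paragraph
TOTAL_LINES = 470   # total number of lines stated in the post

def doyq(counts):
    """Number of lines beginning with d, o, y or q."""
    return sum(counts[g] for g in ("d", "o", "y", "q"))

# Share among 2nd and 3rd lines: drop the first lines from the first-three total.
share_2nd_3rd = doyq(first_three) / (sum(first_three.values()) - FIRST_LINES)
# Share among all lines except the first of each paragraph.
share_rest = doyq(all_lines) / (TOTAL_LINES - FIRST_LINES)

print(f"{share_2nd_3rd:.0%} vs {share_rest:.0%}")
```

That is 117 of 150 second and third lines against 267 of 393 non-first lines.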
One passing observation is that [a] appears to be exceedingly rare in line-initial position. It never occurs there in k-initial paragraphs, and the overall count is only 25 (VQP), even though there are 1962 a-initial vords (more than there are k-initial vords).
Considering the order proposed earlier, k t y d ch o l q r s sh p, 72.7% of k-initial paragraphs match this order. Moving sh to the position between ch and o increases this figure to 74.0%. Furthermore, if we treat a gallows as a symbol "resetting" the order (only t occurs within the first three lines along with k), the figure rises to 81.8%.
