10-09-2016, 05:26 PM
I recently remarked in another thread that there is some relationship between [daiin] and [aiin], and between words beginning [d] and [a] in general. Now that I've had time to check my notes, I realize that I was somewhat wrong, and would like to present a correction and an explanation here. I hope that others will comment on my observations, which I present below in condensed form.
Problem
Words beginning [a] are common in the Voynich text. There are nearly 2,000 word tokens beginning [a]. But at the start of lines they are rare. Only around 25 word tokens beginning [a] are found at the start of lines. The most common word type beginning [a] is [aiin], which has 0 occurrences at the start of a line. Most other words beginning [a] behave similarly.
The statistics for the first characters of words at the beginning of lines diverge from the main text as a whole. The characters [y, d, p, s, t, f] occur more often, the rest less often, than in the main text. For [p, f] and maybe partially [t], the cause is the well-known phenomenon of Grove Words. For [y, d, s] the cause is unknown.
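As a rough illustration of how such a comparison could be made, here is a minimal Python sketch. It assumes a transliteration in which each manuscript line is a string of space-separated words; the function names and the three sample lines are invented for illustration and are not taken from any real transcription file.

```python
from collections import Counter

def first_char_counts(lines):
    """Count word-initial characters, split by line-initial vs. other positions."""
    line_initial = Counter()
    elsewhere = Counter()
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip empty lines
        line_initial[words[0][0]] += 1
        for word in words[1:]:
            elsewhere[word[0]] += 1
    return line_initial, elsewhere

def share(counter, char):
    """Fraction of counted words that begin with the given character."""
    total = sum(counter.values())
    return counter[char] / total if total else 0.0

# Toy lines, invented for illustration only.
sample = [
    "saiin chedy aiin okaiin",
    "daiin shey aiin ar",
    "ychedy qokeedy aiin",
]
initial, rest = first_char_counts(sample)
print("line-initial [a] share:", share(initial, "a"))
print("elsewhere [a] share:", share(rest, "a"))
```

Run over a full transliteration, the two counters would give the kind of per-character comparison described above.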
Although we should not expect the text of the manuscript to be completely flat and the same throughout, we should still seek to explain variations. That is where we are most likely to learn something new. So we should presume there is a reason behind the lack of words beginning [a] at the start of lines, and that the reason is discoverable.
Argument
A hypothesis explaining both the lack of words beginning [a] and the high occurrence of words beginning [y, d, s] could be that those characters are added to the beginning of [a] words.
It is unlikely that [y] is added to the beginning of [a] words, practically because [ya] strings are rare, and theoretically because [y] and [a] are likely very similar characters.
The character [d] could be added, but words beginning [da] are not hugely overrepresented at the start of lines, though there is some tendency toward this in Quire 20. It could be partially responsible for our observations.
The character [s] is the best fit for this role. Of around 1090 word tokens beginning [s], about 470, or 43%, occur at the start of lines. For word tokens beginning [sa], about 190 of 510, or 37%, occur at the start of lines. These are obviously more common at the start of lines than we should expect, with an excess of around 120. Their occurrence in the main text suggests that they are also valid words normally.
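The percentages above can be checked with a few lines of Python. This is only a sketch: the counts are the approximate figures quoted in the post, and the `excess` helper takes a `baseline_rate` argument that is an assumption of the sketch, since the post does not state which baseline the excess was measured against.

```python
def line_start_share(at_line_start, total):
    """Fraction of tokens that occur at the start of a line."""
    return at_line_start / total

s_share = line_start_share(470, 1090)   # word tokens beginning [s]
sa_share = line_start_share(190, 510)   # word tokens beginning [sa]
print(f"[s]: {s_share:.0%}, [sa]: {sa_share:.0%}")  # prints "[s]: 43%, [sa]: 37%"

def excess(observed, total, baseline_rate):
    """Observed line-initial count minus the count expected at baseline_rate.

    baseline_rate is a hypothetical input: the share of all word tokens
    that one would expect at line start if position did not matter.
    """
    return observed - total * baseline_rate
```

With a baseline measured from the whole text, `excess(190, 510, baseline_rate)` would give the surplus of line-initial [sa] words under that assumption.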
Conclusion
The lack of words beginning [a] at the start of lines may be caused by an unknown process which adds [s] to their beginning. This would also explain the high number of words beginning [sa] in that position. The same process may cause [s] to be added to words beginning [o], as words beginning [o] are less common at the start of lines and those beginning [so] more common.
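One way to probe this idea is to strip the leading [s] from line-initial words beginning [sa] or [so] and see whether the remainder is attested elsewhere in the text. The Python sketch below does that under the same assumption of space-separated transliterated lines; the function name, its parameters, and the sample lines are hypothetical, and a match would be suggestive rather than proof of the process.

```python
def stripped_forms_attested(lines, prefix="s", next_chars=("a", "o")):
    """For line-initial words like [sa...] or [so...], report whether the
    word without its leading prefix also occurs in a non-initial position."""
    body_vocab = set()
    candidates = []
    for line in lines:
        words = line.split()
        if not words:
            continue
        body_vocab.update(words[1:])  # words not at the start of a line
        first = words[0]
        rest = first[len(prefix):]
        if first.startswith(prefix) and rest and rest[0] in next_chars:
            candidates.append((first, rest))
    # Check only after the whole vocabulary has been collected.
    return [(word, rest, rest in body_vocab) for word, rest in candidates]

# Toy lines, invented for illustration only.
sample = [
    "saiin chedy aiin",
    "sol daiin ol",
    "sar chedy okaiin",
]
for word, rest, attested in stripped_forms_attested(sample):
    print(word, "->", rest, "attested elsewhere:", attested)
```

Changing `prefix` to "d" would apply the same check to the [d] case.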
If the existence of a process of this kind is accepted, we would look to generalize it to explain the high occurrence of words beginning [y, d] at the start of lines too. The character [d] is especially interesting as it has already been implicated in the lack of words beginning [a].