09-10-2025, 01:53 PM
31-10-2025, 11:46 AM
I would like to share some further investigation I have made on vertical pairs. In a previous post in this thread I showed that repeats of the first characters of line-first words were fewer than expected. I thought it would be instructive now to get some precise measures. For each of the main work groups
Bio B2
Herbal A1
Herbal B2
Pharma A1
Stars B3
Text B2
I obtained the total number of repeats (of first characters of line-first words). I then ran 1000 simulations in which the first characters were drawn at random according to the frequencies of the line-first-word first characters for that work group, which gave 1000 values of simulated repeats. By the central limit theorem these values can be expected to be approximately normally distributed, and I got the following results.
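For anyone who wants to try something similar, here is a minimal sketch of that kind of simulation. It is not the exact code behind the tables. It assumes a "repeat" means a character that equals the one on the line directly above it, and that the characters for a work group are held in one flat list (paragraph boundaries are ignored here). The names `count_repeats` and `simulate_repeats` are made up for illustration.

```python
import random
from collections import Counter

def count_repeats(chars):
    """Count positions where a character equals the one on the previous line."""
    return sum(1 for above, below in zip(chars, chars[1:]) if above == below)

def simulate_repeats(chars, n_sims=1000, seed=0):
    """Redraw the characters from their observed frequencies and count
    repeats in each simulated sequence (assumed setup, see text above)."""
    rng = random.Random(seed)
    freq = Counter(chars)
    symbols = list(freq)
    weights = [freq[s] for s in symbols]
    results = []
    for _ in range(n_sims):
        # Each simulated sequence has the same length as the real one.
        simulated = rng.choices(symbols, weights=weights, k=len(chars))
        results.append(count_repeats(simulated))
    return results
```

Calling `simulate_repeats(first_chars)` on the list of line-first-word first characters for one work group would give the 1000 simulated repeat counts to compare against `count_repeats(first_chars)`.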
[attachment=11891]
The final column gives the number of standard deviations by which the actual number of repeats differs from the expected number. All sections of the manuscript show the same behaviour, and it is statistically significant: the numbers of repeats are far below expected. In particular, Herbal A1 has 41 repeats when 131 were expected. Anything over 4 standard deviations is a huge distance.
I was then curious to see if there was any similar behaviour with first-word last-character pairs. I repeated the investigation and got
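As a rough illustration of how a standard-deviation figure like this can be computed from the simulated values, here is a small sketch. The function name `z_score` is hypothetical, and the numbers in the usage line are toy values, not taken from the tables.

```python
from statistics import mean, stdev

def z_score(observed, simulated):
    """Distance of the observed count from the simulated mean,
    in units of the simulated (sample) standard deviation."""
    mu = mean(simulated)
    sigma = stdev(simulated)
    return (observed - mu) / sigma

# Toy values only: simulated mean is 10, sample standard deviation is 2.
print(z_score(4, [8, 10, 12]))  # -3.0
```

A negative value means fewer repeats than expected, a positive value means more.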
[attachment=11892]
Only Stars B3 shows something odd this time: repeats occurred more often than expected (205 observed, 163 expected).
For second and later words, repeats are generally as expected, with the exception, once again, of Stars B3 for second words.
[attachment=11893]
[attachment=11894]
This is all tricky. I am at a loss for an explanation, especially for Stars B3.
Throughout, I only took paragraph text from the GC transliteration. In case you are curious, here are the first-word last-character lists for Stars B3.
[attachment=11895]
