I don't remember reading about how the k/t gallows sequence compares to random and human-generated pseudo-random sequences. It is likely that I missed something, so please point me to any existing analysis.
There is a blog post by Emma May Smith, but it does not discuss the reduplication statistics.
--------------------------------------------------
TLDR: The k/t gallows sequence is strongly biased toward reduplication, especially toward long runs of the same gallows in a row.
What is interesting is that human-generated pseudo-random sequences show a "tendency to overalternate between outcomes", which is documented in psychology studies and the literature they reference.
Why do we see the opposite tendency in the VMS?
--------------------------------------------------
EVA-k is more frequent than EVA-t. In the ZL transliteration, in paragraphs only, there are:
k = 10096, probability(k) = pk = 62%
t = 6063, probability(t) = pt = 38%
In the few cases of ambiguity, I kept the first alternative: for example, [k:t] is read as k.
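For anyone who wants to reproduce the counts, here is a minimal sketch of how the k/t sequence could be extracted. It assumes the transliteration has already been reduced to plain EVA text (no locus tags or comments), and that every k and t character is counted, including those inside compound gallows; neither assumption is confirmed above. The function name `kt_sequence` is made up for illustration.

```python
import re

# Matches an alternative-reading group such as [k:t] and captures
# the first alternative (the rule used above: [k:t] = k).
ALTERNATIVE = re.compile(r"\[([^:\]]*):[^\]]*\]")

def kt_sequence(eva_text):
    """Return the string of k/t characters in reading order.

    Assumption: eva_text is plain EVA text with metadata already removed.
    """
    resolved = ALTERNATIVE.sub(r"\1", eva_text)
    return "".join(ch for ch in resolved if ch in "kt")

if __name__ == "__main__":
    sample = "qokeedy.o[k:t]ain.otedy"  # made-up sample, not a real line
    seq = kt_sequence(sample)
    print(seq, seq.count("k"), seq.count("t"))
```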
With perfectly random independent draws, in any window of n letters of the k/t sequence, the probability of having:
n times k is pk^n
n times t is pt^n
The expected number of windows of n identical letters in the k/t sequence is that probability multiplied by the number of windows, k+t-n+1 (here k and t stand for the counts given above):
For k: (k+t-n+1)*pk^n
For t: (k+t-n+1)*pt^n
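The formulas above can be checked with a few lines of Python. This is only a sketch of the calculation as described; the function name `expected_windows` is mine, and the values are truncated to integers for printing.

```python
def expected_windows(count_k, count_t, n):
    """Expected number of length-n windows that are all k, and all t,
    assuming independent draws with the observed k/t frequencies."""
    total = count_k + count_t
    pk = count_k / total
    pt = count_t / total
    windows = total - n + 1          # number of rolling windows of length n
    return windows * pk ** n, windows * pt ** n

if __name__ == "__main__":
    for n in range(2, 13):
        ek, et = expected_windows(10096, 6063, n)
        print(n, int(ek), int(et))   # int() truncates toward zero
```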
The expected numbers are ek, et (rounded down), the actual numbers are ak, at:
 n     ek  <   ak      et  <   at
 2   6307    6956    2274    2924
 3   3940    4938     853    1554
 4   2461    3593     320     870
 5   1538    2686     120     501
 6    960    2029      45     300
 7    600    1565      16     184
 8    375    1214       6     117
 9    234     950       2      76
10    146     752       0      50
11     91     595       0      32
12     57     465       0      21
Note: if you count kk... and tt... sequences with a text editor like Notepad++, the numbers will be lower, because after each match the search resumes after the matched pattern (non-overlapping matches) instead of doing a "rolling window" search.
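The difference between the two ways of counting can be illustrated with a short sketch. The function names are made up, and the non-overlapping count uses Python's `str.count`, which I am using here as a stand-in for how a typical editor search behaves.

```python
def count_rolling(seq, letter, n):
    """Count every window of length n made only of `letter` (overlaps allowed)."""
    target = letter * n
    return sum(1 for i in range(len(seq) - n + 1) if seq[i:i + n] == target)

def count_non_overlapping(seq, letter, n):
    """Count matches the way a find-next search does: resume after each match."""
    return seq.count(letter * n)

if __name__ == "__main__":
    seq = "kkkktkk"  # toy sequence, not real data
    print(count_rolling(seq, "k", 2))          # 4
    print(count_non_overlapping(seq, "k", 2))  # 3
```

In the toy sequence, the run kkkk contains three overlapping kk windows but only two non-overlapping matches, which is why the editor count comes out lower.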