I think I found something very interesting...
What happens if we stop looking at the whole manuscript as one long string, and instead ask where the long-range signal actually lives?
Up to now, we knew this: if we shuffle words globally in Currier, the long-range MI gap is large and clearly significant. That already sets Voynich apart from normal language corpora, where word shuffle barely changes the tail.
But that still leaves an open question. Is the gap coming from inside lines? Inside paragraphs? Or from the larger structure?
So I ran three different shuffles:
- Shuffle words globally (baseline).
- Shuffle words only inside each line.
- Shuffle words only inside each paragraph.
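A minimal sketch of how these three schemes could be implemented. It assumes the text is already parsed into a nested list (paragraphs, then lines, then words); the function names and that data layout are illustrative assumptions, not the code behind the numbers in this post.

```python
import random

def shuffle_global(paragraphs, rng):
    """Shuffle all words across the whole text, keeping line and paragraph sizes."""
    words = [w for para in paragraphs for line in para for w in line]
    rng.shuffle(words)
    it = iter(words)
    # Refill the original layout with the shuffled words.
    return [[[next(it) for _ in line] for line in para] for para in paragraphs]

def shuffle_within_lines(paragraphs, rng):
    """Shuffle words inside each line; no word ever leaves its line."""
    return [[rng.sample(line, len(line)) for line in para] for para in paragraphs]

def shuffle_within_paragraphs(paragraphs, rng):
    """Shuffle words inside each paragraph; no word ever leaves its paragraph."""
    out = []
    for para in paragraphs:
        words = [w for line in para for w in line]
        rng.shuffle(words)
        it = iter(words)
        out.append([[next(it) for _ in line] for line in para])
    return out
```

Passing a seeded `random.Random` as `rng` makes each shuffle reproducible, which helps when averaging the gap over several shuffle runs.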
Before showing the numbers, a quick clarification of what I calculated:
The tail gap means the average difference between the mutual information of the original text and the mutual information after word shuffle, measured at long distances (here d = 60–100). In simple terms, it tells us how much long-range structure disappears when we randomize word order.
The normalized gap is the same quantity divided by H1, the basic entropy of the character distribution. This just rescales the gap so that differences in alphabet size or overall entropy do not distort the comparison.
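The two definitions above can be sketched in code. This is a hedged illustration, not the exact pipeline used here: it assumes MI is estimated with a simple plug-in (frequency count) estimator between characters that are d positions apart, that H1 is taken from the original text, and that the text is passed in as a plain string. The function names are hypothetical.

```python
import math
from collections import Counter

def entropy(seq):
    """Plug-in Shannon entropy H1 (in bits) of the symbol distribution."""
    n = len(seq)
    counts = Counter(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def mutual_information(seq, d):
    """Plug-in MI (in bits) between the symbols at positions i and i + d."""
    n = len(seq) - d
    if n <= 0:
        return 0.0
    pairs = Counter(zip(seq, seq[d:]))  # joint counts of (x_i, x_{i+d})
    left = Counter(seq[:n])             # marginal counts of x_i
    right = Counter(seq[d:])            # marginal counts of x_{i+d}
    return sum(
        c / n * math.log2(c * n / (left[a] * right[b]))
        for (a, b), c in pairs.items()
    )

def tail_gap(original, shuffled, d_min=60, d_max=100):
    """Mean MI(original) - MI(shuffled) over d_min..d_max, raw and divided by H1."""
    ds = range(d_min, d_max + 1)
    gap = sum(
        mutual_information(original, d) - mutual_information(shuffled, d)
        for d in ds
    ) / len(ds)
    return gap, gap / entropy(original)
```

The plug-in estimator is biased upward on finite samples, but that bias largely cancels in the difference between original and shuffled text, since both have the same length and symbol counts.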
Here is the summary for Currier (punctuation removed, MI normalized by H1).
| Shuffle scheme | Tail gap (MI raw − shuffle) | Normalized gap (÷ H1) |
| --- | --- | --- |
| Global token shuffle | 0.00250 | 0.00094 |
| Shuffle within lines | ≈ 0 | ≈ 0 |
| Shuffle within paragraphs | ≈ 0 | ≈ 0 |
The result is very clear.
1. When words are shuffled only inside lines, the long-range gap disappears.
2. When words are shuffled only inside paragraphs, the gap also disappears.
In other words, the signal is not generated inside lines. It is not generated inside paragraphs either. The long-range gap only appears when the global order of the text is disturbed.
To check this further, I shuffled entire lines as intact blocks, and then entire paragraphs as intact blocks.
| Shuffle scheme | Tail gap (MI raw − shuffle) | Normalized gap (÷ H1) |
| --- | --- | --- |
| Shuffle order of lines (lines kept intact) | 0.00251 | 0.00094 |
| Shuffle order of paragraphs (paragraphs kept intact) | 0.00062 | 0.00023 |
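A rough sketch of the two block shuffles, again assuming the text is a nested list (paragraphs, then lines, then words). Two details are assumptions on top of the description: lines are permuted across the whole text rather than within each paragraph, and they are then regrouped into paragraphs of the original sizes. The function names are hypothetical.

```python
import random

def shuffle_line_order(paragraphs, rng):
    """Permute whole lines across the text; words inside a line keep their order."""
    lines = [list(line) for para in paragraphs for line in para]
    rng.shuffle(lines)
    it = iter(lines)
    # Regroup the shuffled lines into paragraphs of the original sizes.
    return [[next(it) for _ in para] for para in paragraphs]

def shuffle_paragraph_order(paragraphs, rng):
    """Permute whole paragraphs; everything inside a paragraph stays untouched."""
    return rng.sample(paragraphs, len(paragraphs))
```

In both functions the content of each block is left as it was and only the sequence of blocks changes, so any remaining gap reflects ordering above the block level.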
Shuffling the order of lines gives almost exactly the same gap as the global word shuffle (0.00251 vs 0.00250).
Shuffling the order of paragraphs reduces the gap strongly, to 0.00062, about a quarter of the global value.
This tells us something important: The long-range MI gap in Currier is not a micro-level effect. It does not come from word order inside lines. It does not come from local syntactic structure. It is mainly a macro-structural effect tied to how paragraphs are arranged across the manuscript, and likely also how larger sections are organized.
If we compare this to natural language corpora, we see a difference. In normal texts, global word shuffle does not produce a large long-range gap in the first place. Here, the gap appears only when we break the global block structure of the manuscript.
So a cautious way to phrase it would be:
Inside lines and inside paragraphs, Voynich does not produce the long-range anomaly. The anomaly emerges at the level of paragraph sequencing and higher-level organization.
This is more consistent with block-level non-stationarity than with some kind of long-distance “interaction” between glyphs. It also suggests that the long-range anomaly in Voynich is not driven by local word-order constraints inside paragraphs, which makes it less consistent with a purely local sequential generator such as Torsten Timm’s model. Instead, the effect appears to be tied to the block-level organization of the manuscript.
One small side note. When punctuation and separators are removed from Currier, the gap actually increases, even after normalizing by entropy. That means the effect is not caused by dots or special symbols. Those elements were diluting the signal, not creating it.
Taken together, this suggests that the long-range behavior of the Voynich manuscript is primarily a property of its large-scale structure, not of its internal line-level syntax.