(19-05-2026, 10:03 PM)Jorge_Stolfi Wrote: Except that
- You need a seed text (a "visible source") to start the process.
- The modify step must respect the complex word structure.
- You need dice to choose which word to copy and how to modify it.
- The word distribution must be invariant under the modify step.
Without the last precaution, in particular, the word distribution will evolve along the document in ways that we simply don't see.
Your proposed gibberish generation method is anything but simple. It is as complex as the Voynichese "language", with all its statistical and structural peculiarities.
All the best, --stolfi
I agree with most of this, but I think the conclusion is too strong.
Yes, a visible source is needed. That is exactly the point: the manuscript itself repeatedly behaves as if nearby source material matters. Copy-modify from visible sources is not being denied; it is the mechanism being tested. When you first learned your native language, you needed a seed. Every 'language' does. If the Voynich is ever solved, there will be a seed involved.
The only real disagreement is over what the modifier has to know. I do not think the scribe needed a full grammar, a cipher table, or a stochastic model of Voynichese. They only needed to avoid producing forms that obviously violate the local word structure. That can be done with a very small set of adjacency habits: which glyphs can follow which glyphs in prefix, middle, and ending positions; i.e. my ledger proposal. Whether that habit was learned mentally or written down as a small crib is secondary.
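To make the size of that ledger concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a finished model: each EVA character is treated as one glyph (which is debatable for "ch" and "sh"), the "prefix" slot is the first glyph pair of a word, the "ending" slot is the last pair, and everything else is "middle". The names `slot`, `build_ledger` and `allowed` are hypothetical, and the four seed words are just a toy list.

```python
from collections import defaultdict

def slot(i, n):
    """Name the slot of the glyph pair starting at index i in a word of n glyphs."""
    if i == 0:
        return "prefix"
    if i == n - 2:
        return "ending"
    return "middle"

def build_ledger(words):
    """Record which glyph pairs have been seen in each slot of the visible words."""
    ledger = defaultdict(set)
    for w in words:
        for i in range(len(w) - 1):
            ledger[slot(i, len(w))].add((w[i], w[i + 1]))
    return dict(ledger)

def allowed(word, ledger):
    """True if every adjacent glyph pair of `word` is already in the ledger for its slot."""
    return all(
        (word[i], word[i + 1]) in ledger.get(slot(i, len(word)), set())
        for i in range(len(word) - 1)
    )

# Toy seed list (an assumption for illustration only).
seed = ["daiin", "chedy", "qokeedy", "okaiin"]
ledger = build_ledger(seed)
print(allowed("okeedy", ledger))  # True: every pair was seen in the same slot
print(allowed("ydeeko", ledger))  # False: "yd" never opens a seed word
```

The whole "knowledge" of the modifier here is three small sets of glyph pairs. Whether a real scribe's habits were this crude or somewhat richer is exactly what would need testing.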
The “dice” point is also stronger in a computer simulation than in a manuscript-production model. A program needs explicit random choices because it has no human attention, no fatigue, no visual preference, and no habit. A scribe does not need literal dice to choose a nearby word, copy part of it, extend it, or vary it. Human choice supplies the irregularity that a program has to fake with randomization.
The distribution problem is the serious objection. But I do not think the Voynich requires invariant distribution page after page. In fact, the scribal and Currier divisions argue against perfect invariance. What we see is bounded drift: strong local continuity, abrupt regime changes, and then stabilization within a new regime. That is exactly what I would expect if different operators inherited the method but not the same internal weighting of forms.
For example, in internal-bigram profile, Scribe 1 and Scribe 3 correlate more closely with each other than Scribe 1 and Scribe 2 do. Here is a refined and expanded version of the table from my previous posts:
| comparison | Pearson | cosine |
| S1 all vs S3 all | 0.692 | 0.723 |
| S1 all vs S2 all | 0.532 | 0.575 |
| S2 all vs S3 all | 0.943 | 0.948 |
| S1 herbal vs S3 late herbal | 0.691 | 0.723 |
| S1 herbal vs S2 herbal | 0.639 | 0.675 |
| S2 herbal vs S3 late herbal | 0.732 | 0.765 |
| S1 pharma vs S3 pharma | 0.645 | 0.683 |
| S1 herbal vs S3 recipes | 0.615 | 0.651 |
| S2 herbal vs S3 recipes | 0.946 | 0.951 |
That is not “anything goes” drift. It suggests continuity of method with different weighting. Scribe 2 is much more sharply reweighted toward forms of the "ed" bigram type, while Scribe 3 appears closer to Scribe 1 in some respects, especially when separated by section.
So I would phrase my position this way: self-citation alone is underspecified. But self-citation constrained by local glyph-adjacency habits and visible-source copying is not nearly as complex as the Voynich language itself. It is a small production method that can produce complex-looking output because the visible source already carries much of the structure forward.
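As a rough illustration of how small that production method is, here is a toy sketch. It is not a claim about what the scribes actually did. The assumptions are mine: the "visible source" is the last few words already written (`window`), a modification is a single-glyph substitution, insertion or deletion, the adjacency habits are just the glyph pairs seen in the seed by slot, and Python's `random` stands in for the human choice that a scribe would supply without dice. All names are hypothetical.

```python
import random

def pairs(word):
    """Yield (slot, glyph_pair) for each adjacent pair; slot is prefix, middle or ending."""
    n = len(word)
    for i in range(n - 1):
        slot = "prefix" if i == 0 else "ending" if i == n - 2 else "middle"
        yield slot, word[i:i + 2]

def generate(seed_words, n_new, window=8, rng=None, max_tries=200):
    """Extend seed_words by n_new words, each copied or modified from a nearby word."""
    rng = rng or random.Random(0)
    habits = {p for w in seed_words for p in pairs(w)}      # allowed (slot, pair) set
    glyphs = sorted({g for w in seed_words for g in w})     # glyph inventory from the seed
    text = list(seed_words)
    for _ in range(n_new):
        for _ in range(max_tries):
            source = rng.choice(text[-window:])             # pick a visible nearby word
            i = rng.randrange(len(source) + 1)              # position to modify
            op = rng.choice(["copy", "substitute", "insert", "delete"])
            if op == "copy":
                cand = source
            elif op == "insert":
                cand = source[:i] + rng.choice(glyphs) + source[i:]
            elif op == "substitute" and i < len(source):
                cand = source[:i] + rng.choice(glyphs) + source[i + 1:]
            elif op == "delete" and i < len(source):
                cand = source[:i] + source[i + 1:]
            else:
                continue                                    # position past the end, retry
            # Keep the candidate only if it violates no adjacency habit.
            if len(cand) >= 2 and all(p in habits for p in pairs(cand)):
                text.append(cand)
                break
        else:
            text.append(rng.choice(text[-window:]))         # give up and copy unchanged
    return text

if __name__ == "__main__":
    print(" ".join(generate(["daiin", "chedy", "qokeedy", "okaiin"], 20)))
```

Note what this sketch does not do: nothing in it keeps the word distribution stable over a long run, so it does not answer the invariance objection by itself. It only shows that the copy, modify and check steps need very little machinery.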
Does this negate the possibility that the Voynich has meaning? I don't think so. It may well be some mnemonic system, or even an Asian language, as you suspect. But if the method I'm demonstrating works well enough, it may give some insight into how that meaning was encoded.