27-06-2025, 10:34 AM
Correction
Originally, I described the transformation used as a homophonic cipher, but that label is misleading. What I actually applied was a form of multi-character substitution, where each letter in the original word is replaced by a randomly chosen variant (e.g., a0, a1, a2), simulating a kind of randomized expansion at the character level. This isn't a true homophonic cipher in the historical sense — which typically replaces plaintext characters with multiple possible cipher symbols without increasing the total character count. My version expanded the text significantly and altered its structure.
Despite the naming inaccuracy, the method did reproduce an entropy curve similar to the Voynich CUVA profile, especially in the characteristic “bump” around n=3–6. The results still support the hypothesis that some kind of structured substitution — possibly at the syllable or morph level — could account for the entropy behavior in the Voynich manuscript. However, any conclusions should be interpreted with this clarification in mind.
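For anyone who wants to try the transformation described in this correction, here is a minimal sketch in Python. It is not the exact script I used. The function name `expand_substitute`, the uniform random choice of variant, the fixed seed, and the decision to leave spaces and punctuation untouched are all assumptions made for illustration.

```python
import random
import string

def expand_substitute(text, n_variants=3, seed=0):
    """Replace each letter with a randomly chosen variant such as a0, a1, a2.

    Assumption: variants are picked uniformly and independently per letter.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    out = []
    for ch in text.lower():
        if ch in string.ascii_lowercase:
            # One plaintext letter becomes two characters: letter + variant digit
            out.append(f"{ch}{rng.randrange(n_variants)}")
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)

sample = "de docta ignorantia"
encoded = expand_substitute(sample, n_variants=3)
print(encoded)
print(len(sample), "->", len(encoded))
```

Every letter turns into two characters, so the output is nearly twice as long as the input. That is the expansion that separates this method from a classical homophonic cipher.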
See also my other post, which shows the entropy bump by comparing the MS in EVA and in CUVA against natural-language texts.
Maybe by accident, I’ve pulled on a thread worth following — I’ll keep exploring what really generates the bump.
------------------------------------------
In this experiment, I tried to simulate how different historical ciphers affect the entropy profile of a text and to compare the results with the Voynich CUVA (explained by René Zandbergen). The idea was to test whether the statistical behavior of the Voynich text, especially its distinctive “entropy bump”, could emerge from known cipher types.
Method
I took the Latin text De Docta Ignorantia and applied 10 classical cipher transformations likely known or possible in the 15th century:
- Syllabic substitution
- Homophonic cipher
- Caesar cipher
- Grammatical expansion
- Transposition cipher
- Contextual substitution
- Polyalphabetic cipher
- Cardano grille
- Relative-position encoding
For each transformed text I computed the n-gram entropy for increasing n, then plotted these values against the Voynich CUVA section.
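The post does not say which entropy definition produces the curves, so the sketch below is only one plausible reading. It assumes character-level Shannon block entropy over overlapping n-grams, plus the conditional entropy obtained as the difference between consecutive block entropies. The function names and the `max_n` parameter are hypothetical.

```python
import math
from collections import Counter

def block_entropy(text, n):
    """Shannon entropy in bits of the distribution of overlapping n-grams."""
    total = len(text) - n + 1
    if total <= 0:
        return 0.0
    counts = Counter(text[i:i + n] for i in range(total))
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def conditional_entropy_curve(text, max_n=10):
    """h_n = H_n - H_(n-1): uncertainty of a character given the previous n-1."""
    curve = []
    prev = 0.0  # H_0 is taken as zero
    for n in range(1, max_n + 1):
        h = block_entropy(text, n)
        curve.append(h - prev)
        prev = h
    return curve

text = "in principio erat verbum et verbum erat apud deum " * 20
for n, h in enumerate(conditional_entropy_curve(text, max_n=6), start=1):
    print(n, round(h, 3))
```

On a finite text the conditional values shrink toward zero as n grows, simply because long n-grams stop repeating. When comparing curves, keep the text lengths similar so that this effect does not pass for a difference between ciphers.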
![[Image: uLOSZCq.png]](https://i.imgur.com/uLOSZCq.png)
This graph shows that most cipher types produce entropy curves that drop steeply after n=3–5, while the Voynich text declines gradually and smoothly. This is already unusual.
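A useful sanity check when reproducing a comparison like this: a Caesar cipher only relabels letters, so the n-gram frequency profile, and therefore any n-gram entropy, should be identical to the plaintext at every n. The sketch below checks this on a short sample. The helper names and the sample sentence are assumptions for illustration, not part of my original script.

```python
from collections import Counter
import string

def caesar(text, shift=3):
    """Shift each lowercase letter by `shift` positions; leave other characters alone."""
    letters = string.ascii_lowercase
    table = str.maketrans(letters, letters[shift:] + letters[:shift])
    return text.lower().translate(table)

def ngram_count_profile(text, n):
    """Sorted n-gram frequencies, ignoring which n-gram each count belongs to."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return sorted(counts.values())

plain = "in principio erat verbum et verbum erat apud deum"
cipher = caesar(plain)

# Identical profiles at every n imply identical entropy at every n.
same = all(
    ngram_count_profile(plain, n) == ngram_count_profile(cipher, n)
    for n in range(1, 6)
)
print(cipher)
print(same)  # True
```

If a Caesar curve in your own run differs from the plaintext curve, the difference comes from preprocessing (case, spaces, punctuation) rather than from the cipher.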
But there's one exception...
Homophonic cipher anomaly
Only the homophonic cipher (3+ variants tested) produces an entropy “bump” that matches the Voynich profile. Specifically, when using a homophonic cipher with 3 or 4 characters per symbol, the entropy curve is smoother and shows a slow decay, similar to the CUVA data.
This raises two hypotheses:
- A system with homophonic encoding of syllables or morphs could recreate a Voynich-like structure.
- The smoothness of the curve may suggest internal rules or language constraints, not just random substitution.
![[Image: 4aEWNbM.png]](https://i.imgur.com/4aEWNbM.png)
Notice how the 3- and 4-character homophonic ciphers almost replicate the Voynich curve — both in shape and range. The 2-character version decays a bit faster but still mimics the bump.
Natural text vs. Voynich
To test if this was just a quirk of De Docta Ignorantia, I took four different natural texts (Latin, French, English):
- Ambrosius Medionalensis In Psalmum David CXVIII Expositio (Latin)
- La reine Margot (French)
- Romeo and Juliet (English)
- De Docta Ignorantia again
Each was encrypted with a 3-character homophonic cipher and compared to Voynich CUVA.
![[Image: kSTbMuI.png]](https://i.imgur.com/kSTbMuI.png)
Interestingly, when using a 3-character homophonic cipher on natural texts (Latin, French, English), the entropy curves become much smoother and more sustained. For several of them, the n-gram entropy remains high up to n=6–7, and only drops significantly past n=8 or n=9.
The curve shapes are now visibly closer to Voynich CUVA, with the most similar being De Docta Ignorantia and Romeo and Juliet. However, the Voynich text still has:
- A slightly smoother and more consistent decay, without sudden drops
- A more gradual “tail” beyond n=9, where the others flatten or zero out (except Romeo and Juliet)
Interpretation
There are two key features that stand out:
- Among the transformations tested, the “Voynich bump” (sustained entropy around n=3–6) is replicated only by homophonic substitution.
- The smoothness of the curve in CUVA suggests an underlying linguistic system — natural or artificially constructed — rather than arbitrary encoding.
It may also support theories that posit an artificial language, a constructed morphology, or template-driven word generation, all of which maintain internal consistency over longer n-grams.