In this post, I’ll walk you through a machine learning approach I used to analyze the Voynich Manuscript using
character-level n-gram language models, a simple but powerful way to measure how predictable a text is. My goal was not to decode the Voynich, but to
compare its statistical structure to that of other known texts — including literary works, religious treatises, and artificially encrypted versions — to see if it behaves like a natural language, a cipher, or something entirely different.
What Are Character-Level N-grams and Perplexity?
Before diving into the results, let’s quickly explain two key concepts:
- Character-level n-grams: These are sequences of n consecutive characters. For example, in the word "language", the 3-grams (trigrams) are: lan, ang, ngu, gua, uag, age.
- An n-gram model learns the probability of seeing a particular character given the previous n-1 characters.
- Perplexity: This is a measure of how well a model predicts a sequence. Low perplexity means the model can easily predict the next character; the text is "regular" or "learnable." High perplexity means the text is less predictable, like a noisy or complex system. It is commonly used to evaluate how well a language model fits a dataset (see the sketch below).
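To make both ideas concrete, here is a minimal sketch in Python. The function names and the add-one smoothing are illustrative choices of mine, not necessarily what the experiment used:

```python
import math
from collections import Counter

def char_ngrams(text, n):
    """All runs of n consecutive characters, e.g. the trigrams of 'language'."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train_ngram_model(text, n):
    """Count each n-gram and its (n-1)-character context in the training text."""
    ngram_counts, context_counts = Counter(), Counter()
    for gram in char_ngrams(text, n):
        ngram_counts[gram] += 1
        context_counts[gram[:-1]] += 1
    return ngram_counts, context_counts

def perplexity(text, n, ngram_counts, context_counts, vocab_size, alpha=1.0):
    """exp(average negative log-probability per character), with add-alpha
    smoothing so n-grams never seen in training still get non-zero probability."""
    log_prob = 0.0
    grams = char_ngrams(text, n)
    for gram in grams:
        p = (ngram_counts[gram] + alpha) / (context_counts[gram[:-1]] + alpha * vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(grams))
```

For example, char_ngrams("language", 3) returns the six trigrams listed above, and a perplexity of 5 means the model is, on average, as uncertain as if it were choosing uniformly among 5 characters at each step.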
The Experiment
I trained simple n-gram models (from 1-gram to 9-gram) on the following types of texts:
- Classical literature (e.g., Romeo and Juliet, La Reine Margot)
- Religious and philosophical texts (e.g., Ambrosius Mediolanensis, De Docta Ignorantia), with dates of composition similar to the manuscript's
- Texts enciphered with a Trithemius-style letter substitution
- The Voynich Manuscript, transcribed using the EVA alphabet
For each text, I split it into a training and validation set, trained n-gram models by character, and computed the perplexity at each n-gram size. I plotted these to visualize the predictability curves.
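The whole pipeline then fits in a short loop, reusing train_ngram_model and perplexity from the sketch above. The 80/20 split and the corpora dictionary are assumptions for illustration:

```python
import matplotlib.pyplot as plt

def perplexity_curve(text, max_n=9, split=0.8):
    """Train on the first 80% of the characters, validate on the remaining 20%,
    and return one perplexity value per n-gram order from 1 to max_n."""
    cut = int(len(text) * split)
    train, valid = text[:cut], text[cut:]
    vocab_size = len(set(text))
    curve = []
    for n in range(1, max_n + 1):
        ngram_counts, context_counts = train_ngram_model(train, n)
        curve.append(perplexity(valid, n, ngram_counts, context_counts, vocab_size))
    return curve

# Assume corpora maps a label to the raw text of each document, e.g.
# {"Voynich (EVA)": ..., "Romeo and Juliet": ..., "De Docta Ignorantia": ...}
for name, text in corpora.items():
    plt.plot(range(1, 10), perplexity_curve(text), label=name)
plt.xlabel("n-gram order (n)")
plt.ylabel("perplexity")
plt.legend()
plt.show()
```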
What Did I Find?
The results were surprising:
- The Voynich Manuscript exhibits strikingly low perplexity at high n-gram orders (n=7 to n=9), much lower than expected for a truly random or strongly encrypted text.
- Its perplexity curve closely resembles that of religious or philosophical medieval texts, such as De Docta Ignorantia and Ambrosius Mediolanensis. These texts also show low perplexity at high n-grams, reflecting strong internal regularity and repetitive patterns.
- In contrast, literary texts like Shakespeare or Dumas show a sharp increase in perplexity for high n-grams, indicating a richer and more unpredictable sequence of characters.
- Artificially encrypted texts produced with the Trithemius-style cipher show consistently high perplexity: because the shift advances at every position, plaintext regularities are scrambled. (A fixed monoalphabetic substitution, by contrast, would leave n-gram statistics, and hence perplexity, unchanged up to a relabeling of characters.)
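For reference, here is roughly what such a transformation looks like. This is my reconstruction of a Trithemius-style progressive shift; the exact cipher used in the experiment may differ:

```python
def trithemius_encrypt(plaintext, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Progressive shift: the character at position i is shifted by i places,
    so identical plaintext n-grams encrypt differently at different positions."""
    out = []
    for i, ch in enumerate(plaintext):
        if ch in alphabet:
            out.append(alphabet[(alphabet.index(ch) + i) % len(alphabet)])
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)

print(trithemius_encrypt("attack at dawn"))  # -> "auvdgp hb nlia"
```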
Interpretation
This suggests something important:
The Voynich Manuscript does not behave like a Trithemius-style cipher or like natural literary language. Instead, it statistically resembles structured, repetitive writing such as liturgical or philosophical works.
This does not mean the text is meaningful, but it does imply that it might have been designed to look structured and formal, mimicking the style of medieval sacred or scholarly texts.
Its internal predictability could arise from:
- Repeated formulas or ritualistic phrases
- A constrained or templated grammar
- Artificial generation using consistent rules (even if meaningless)
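As a toy illustration of that last possibility, consider generating "words" from a fixed template. The syllable inventories below are invented for the example (loosely EVA-flavored) and make no claim about actual Voynichese morphology:

```python
import random

# Invented, purely illustrative inventories -- not real EVA morphology.
PREFIXES = ["qo", "ch", "sh", "o", "da"]
MIDDLES = ["ke", "te", "ee", "ai", "ol"]
SUFFIXES = ["dy", "in", "y", "iin", "ol"]

def fake_word(rng):
    """Every word follows the same prefix + middle + suffix template."""
    return rng.choice(PREFIXES) + rng.choice(MIDDLES) + rng.choice(SUFFIXES)

rng = random.Random(0)
print(" ".join(fake_word(rng) for _ in range(8)))
# Only 125 distinct words exist and all share one skeleton, so high-order
# n-grams of this "text" repeat constantly and its perplexity stays low.
```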
Conclusion
While many have tried to translate the Voynich Manuscript into known languages or decode it with cipher-breaking techniques, this analysis suggests that
a direct translation approach may be futile. The manuscript’s character-level structure mirrors that of repetitive, highly formalized texts rather than expressive natural language or encrypted writing.
Any attempt to decipher it without first understanding its generative rules — or lack thereof — is likely to miss the mark.
That said, its statistical behavior is not unique: other texts from the same era show similar n-gram patterns. So perhaps the Voynich isn't a hoax; it might just be mimicking the structure of sacred or scholarly texts we no longer fully understand.