Hello everyone.
Today I’d like to share one of my ideas about the Voynich script. Remembering my last experience here with you, I decided to approach it through a calligraphic analysis of the text, using an Excel spreadsheet.
After studying many EVA transcriptions and different interpretations of the Voynich manuscript, especially the work of Stephen Bax, I looked into phonetic studies and the attempts to match glyphs with specific sounds. In my opinion, those correspondences often seem sporadic and subjective. I also explored the idea of Gregorian chants derived from the manuscript.
The Voynich text has a remarkable lexical redundancy. With very few glyphs, a massive text was written, yet many pages almost feel like obsessive repetitions of nearly identical words. It gives the impression of something chant-like, melodic, repetitive — almost like a litany, a prayer, or something similar.
This aspect really caught my attention. How is it possible to generate a fully coherent text with so few letters?
I rule out the hypothesis that it was simply a fake or meaningless text created only for profit.
This is where medieval Gregorian chant comes into play. The Voynich text seems to share several similarities with it. For example, medieval Gregorian notation used a four-line staff (the tetragram) and only a few note heights. The notes, often written in square form and called neumes, were attached to the syllables of the sung text. They could appear in different forms that modified how a syllable was sung.
What interests me most is the number of lines and how, with only a few notes, an entire text could be written.
There are also compound neumes like podatus, clivis, torculus, and porrectus — groups of notes forming a single syllable — as well as simple neumes like punctum and virga, where a single note represents one syllable.
If we compare this idea with the Voynich text, the compound neumes could resemble compound glyphs such as ch, sh, ckh, cfh, cph, or cth.
Example of a neume with square annotations
While studying the Voynich script, what immediately stood out to me was the visual shape of the glyphs themselves. They reminded me of medieval Gregorian notation.
At that point, I started analyzing the glyphs and their heights, arranging them into a kind of scale. Of course, the text is written in cursive, so it isn’t as regular as Gregorian notes, but I tried to estimate the heights and see if this perspective could reveal something interesting.
Please ignore the heights marked with a “+” sign, which I used for compound glyphs with ligatures or connected forms. In Excel I couldn’t make the calculations work correctly with the “+” signs, so I had to leave them out. I suspect those glyphs may represent something additional compared to the others, but for now I don’t want to push the theory too far.
Now I’ll explain exactly what I did.
I took the manuscript and assigned one of four heights to each glyph: A (high), M (medium), B (low), and C (short).
This is how I categorized the glyph heights.
I should point out again that the Voynich text is written in cursive. Someone might notice, for example, that in the word “ykal” I classified the glyph “l” as C, even though visually it could seem closer to B. But after checking most instances of that glyph, I noticed that, like the “y” glyph, part of it often extends below the line. In other cases, the “y” glyph appears shorter than expected.
Since I obviously cannot manually retranscribe the entire manuscript using these rules, I decided to catalog the glyphs this way and use the full EVA transcription by Takeshi, despite the errors I found in it. That’s another discussion entirely. What matters here are the calculations I’m about to explain, and a few transcription errors are acceptable for that purpose.
Method
What I did was import the entire EVA transcription of the manuscript into an Excel file.
After that, I needed a way to understand whether this method could reveal any interesting patterns in the text. For example, I wanted to measure how frequently glyph heights changed across entire pages, to see whether there was some regularity or complete chaos — in other words, whether the glyph heights had meaning or were simply decorative or random.
Procedure
I took each EVA page and arranged it line by line into columns.
Then I created a formula that transformed the entire text into the height categories I had chosen, returning each line as a sequence of height letters.
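For anyone who wants to try this step outside Excel, here is a minimal Python sketch of the conversion. Please note that the height table below is a placeholder: only "l" as C comes from the description above, and every other assignment (and the names HEIGHTS and to_heights) is hypothetical and would need to be replaced with the real table.

```python
import re

# HYPOTHETICAL height table. Only "l" -> "C" is taken from the post;
# all other assignments are placeholders for illustration.
HEIGHTS = {
    "cth": "A", "ckh": "A", "cph": "A", "cfh": "A",
    "ch": "M", "sh": "M",
    "t": "A", "k": "A", "p": "A", "f": "A",
    "o": "M", "a": "M", "e": "M", "d": "M",
    "i": "B", "n": "B", "r": "B", "s": "B",
    "y": "C", "l": "C",
}

# Try the longest glyph names first so "ckh" wins over "ch" or "k".
_GLYPH = re.compile(
    "|".join(sorted(map(re.escape, HEIGHTS), key=len, reverse=True))
)

def to_heights(line: str) -> str:
    """Convert one EVA line to height letters; unknown characters are skipped."""
    return "".join(HEIGHTS[g] for g in _GLYPH.findall(line))

print(to_heights("ykal"))
```

Word separators and any character not in the table are simply dropped in this sketch, so each line becomes one continuous sequence of height letters.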
After that, I created additional columns containing the number of A, M, B, and C heights, so I could count how often each height appeared on every line.
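The per-line counts can be sketched in a few lines of Python. This assumes each line has already been converted to a string of height letters; the function name height_counts is my own choice for illustration.

```python
from collections import Counter

def height_counts(seq: str) -> dict:
    """Count how often each height (A, M, B, C) appears in one line."""
    counts = Counter(seq)
    # Always report all four categories, even when one is absent.
    return {h: counts.get(h, 0) for h in "AMBC"}

print(height_counts("CAMC"))
```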
I also created another column to calculate the Runs for each line. By “Runs,” I mean the number of transitions from one height to another between consecutive letters.
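As a small illustration of this definition, here is a Python sketch that counts the transitions in one height sequence. The function name count_transitions is hypothetical.

```python
def count_transitions(seq: str) -> int:
    """Number of positions where a height differs from the one before it."""
    return sum(1 for prev, cur in zip(seq, seq[1:]) if prev != cur)

# "AAMMC" changes height twice: A->M and M->C.
print(count_transitions("AAMMC"))
```

Note that in statistics a "run" usually means a block of identical symbols, so for a non-empty line the classical number of runs would be this transition count plus one. The sketch follows the definition given above.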
Then I created another column for the average Runs per page, combining all the extracted data. This was the result I was most interested in: the average Runs.
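A possible way to compute this per-page value in Python is sketched below. It assumes the average is the plain mean of the per-line transition counts on each page, which is my reading of the description; the page labels and function names are hypothetical.

```python
from statistics import mean

def count_transitions(seq: str) -> int:
    """Number of height changes between consecutive letters."""
    return sum(1 for prev, cur in zip(seq, seq[1:]) if prev != cur)

def average_runs(pages: dict) -> dict:
    """Map each page label to the mean transition count of its lines."""
    return {
        page: mean(count_transitions(line) for line in lines)
        for page, lines in pages.items()
        if lines  # skip pages with no text lines
    }

# Hypothetical input: page label -> list of height sequences.
example = {"page1": ["CAMC", "AAM"], "page2": ["MMMM"]}
print(average_runs(example))
```

Since longer lines naturally contain more transitions, dividing each count by the line length minus one might make pages with different line lengths easier to compare. That normalization is only a suggestion and is not part of the method described here.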
I also generated columns showing the percentage distribution of each height on every page, to see how much the patterns changed from one page to another.
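The percentage distribution for one page could be computed along these lines. This is only a sketch, assuming a page is given as a list of height sequences; the function name is hypothetical.

```python
from collections import Counter

def height_percentages(lines: list) -> dict:
    """Percentage of A, M, B and C over all lines of one page."""
    counts = Counter("".join(lines))
    total = sum(counts[h] for h in "AMBC")
    if total == 0:
        return {h: 0.0 for h in "AMBC"}
    return {h: 100.0 * counts[h] / total for h in "AMBC"}

print(height_percentages(["AAMB", "CCCC"]))
```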
I even tried to create a column for entropy calculations, but I couldn’t manage to make it work.
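In case it helps, here is one way the entropy could be computed in Python instead of Excel. This is a sketch of the standard Shannon entropy of the height distribution, H = -sum(p * log2(p)), which may or may not be the measure originally intended; the function name is hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(seq: str) -> float:
    """Shannon entropy, in bits per symbol, of a height sequence."""
    if not seq:
        return 0.0
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())

# Four equally frequent heights give the maximum of 2 bits.
print(shannon_entropy("AMBC"))
```

With four height categories the value ranges from 0 (only one height is used) to 2 bits (all four heights equally frequent), so it gives a single number for how balanced a line or page is.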
The Excel file I generated contains all these results. At the bottom right are the average Runs for each page. In the center are the percentages of each height category across the entire page.
On the left side, I also made a comparison using a Gregorian chant. I took the notes written on the four lines and converted them into my height categories. I wasn’t interested in identifying the actual notes themselves; I simply took them in sequence. As you can see, the distribution is very balanced, and the average Runs are surprisingly close to those found in the Voynich manuscript.
I could not get similar results with natural language text, even with stylized texts that visually resemble Voynich. Those tend to produce much more chaotic and irregular patterns.
Going back to the Voynich data: if you look at the average Runs on each page, they are all remarkably similar. The percentages of the different height categories also remain very consistent from page to page.
Conclusion
In my opinion, these results suggest that the glyph heights are not random, and that the encoding method could perhaps be related in some way to medieval Gregorian chant notation — though I can’t say exactly how. Still, these are the patterns I found. I’d love to hear your thoughts.