Hello everyone! I'm new to this site and to a lot of the research that has been done on the Voynich Manuscript. However, I found myself with some free time so I thought I'd play with some techniques I used in previous lives (I studied high energy physics and worked in Machine Learning, both a while ago). Since my knowledge and experience aren't super recent, I used the help of Claude to actually do the specific analysis, but to be clear the direction/hypothesis/questions were driven by me (and when I say "we" below I mean me+Claude). I hope this meets the community standards, and if not I'm happy to remove the post and just learn from the community.
All of the work is publicly available, including source code, a pdf writeup, and a step by step guide to what I did.
The work is available here: [link]
The pdf is here and attached to this post: [link]
What this is and isn't: this is not a decipherment, nor was it an attempt to decipher the manuscript. It's an independent replication and methodological extension of existing statistical work asking the question "was this manuscript generated from some kind of natural language?" All code is public and every result comes with a permutation-based null model, so everything is reproducible from scratch.
Two main findings:
1. Immediate word-doubling at ~2x chance (word[i] == word[i+1], within a manuscript line). This is a well-known anomaly, but I tested it more carefully than I'd seen done before, across six languages and registers with formal z-scores throughout. I used some language samples that I thought could potentially contain word doublings naturally, to compare against the manuscript, including:
- Culpeper's Complete Herbal (English, genre-matched): zero doublings, far below its own chance baseline. The hypothesis here was that if the manuscript was some kind of list (recipes, etc) perhaps a contemporaneous similar list could show the same pattern.
- Carmina Burana In Taberna (Latin verse, maximally repetition-saturated, "bibit" repeated 24 times consecutively): zero doublings. The hypothesis here was that perhaps a contemporaneous document that was more poetic in nature could show similar doublings.
- Arabic Al-Baqarah and Yusuf (consonantal): at or below chance. Arabic and Hebrew were included to look at languages whose scripts don't write most vowels, which could show patterns that don't exist in fully spelled-out Latin-script languages because a single written form can have multiple meanings.
- Hebrew Psalms 113-150 / Hallel (liturgical Hebrew poetry): z = -2.87, p = 0.998, significantly SUPPRESSED below chance. The hypothesis here was that perhaps poetry in a language with a different structure could show similar doublings.
- Hebrew Psalms full 150 chapters (19,662 words): z = -3.73, p = 1.000, even more strongly suppressed.
- Sanskrit Rigveda (complete, 135,279 words, IAST romanization): z = +3.04, p = 0.002, elevated above chance, though we flag this needs verification due to potential IAST sandhi encoding artifacts. Same hypothesis as liturgical Hebrew.
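For anyone who wants to poke at the doubling test without opening the repo, here is a minimal sketch of the kind of calculation involved. It is not the repo code: the function names are made up for this post, and the null model (pool all tokens, shuffle, refill the original line lengths) is an assumption about one reasonable way to do the permutation. The p-value is one-sided for "more doublings than chance", which is why a suppressed corpus comes out with p near 1.

```python
import random
import statistics


def count_doublings(lines):
    """Count positions where word[i] == word[i+1] within the same line."""
    return sum(
        1
        for words in lines
        for a, b in zip(words, words[1:])
        if a == b
    )


def doubling_permutation_test(lines, n_perm=1000, seed=0):
    """Compare observed doublings with a shuffled-corpus null.

    Assumed null: pool every token, shuffle, and refill the original
    line lengths, so word frequencies and line structure are preserved.
    """
    rng = random.Random(seed)
    observed = count_doublings(lines)
    pool = [w for words in lines for w in words]
    lengths = [len(words) for words in lines]

    null_counts = []
    for _ in range(n_perm):
        rng.shuffle(pool)
        shuffled, start = [], 0
        for n in lengths:
            shuffled.append(pool[start:start + n])
            start += n
        null_counts.append(count_doublings(shuffled))

    mean = statistics.fmean(null_counts)
    sd = statistics.pstdev(null_counts)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    # One-sided p-value (upper tail), with +1 smoothing.
    p = (1 + sum(c >= observed for c in null_counts)) / (n_perm + 1)
    return {"observed": observed, "null_mean": mean, "z": z, "p": p}
```

Here `lines` is a list of lines, each a list of word tokens. Doublings that straddle a line break are deliberately not counted, matching the "within a manuscript line" definition above.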
The Hebrew liturgical result is the one I find most interesting. Psalm 136 repeats the same phrase 26 times. Psalm 113 opens "הללו יה הללו". And yet the doubling rate is significantly BELOW chance, because real liturgical repetition always inserts at least one different word between repetitions of the same phrase. "הללו יה הללו" is not "הללו הללו." That's also exactly what Carmina Burana does: "bibit hera, bibit herus", same verb, different subject every time. Real repetition, whether poetic or liturgical, doesn't produce literal self-adjacency. Voynichese does, at ~2x chance, stationarily across the full manuscript.
The effect also survived a cross-section permutation test (p = 0.144 for the section gap, consistent with a single stationary process throughout), and a local scramble test showing it depends on line co-membership rather than exact write-order.
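To make the cross-section test concrete, here is a rough sketch of one way such a test can be set up. The statistic used here (highest per-section doubling rate minus lowest) and the function names are assumptions for illustration; the repo's definition of the "section gap" may differ. Section labels are shuffled across lines, so a large p-value means the observed gap is the kind of spread you would get from a single process.

```python
import random
from collections import defaultdict


def doubling_rate_by_section(lines, sections):
    """Return {section: doublings / adjacent pairs} for each section."""
    hits = defaultdict(int)
    pairs = defaultdict(int)
    for words, sec in zip(lines, sections):
        for a, b in zip(words, words[1:]):
            pairs[sec] += 1
            hits[sec] += (a == b)
    return {s: hits[s] / pairs[s] for s in pairs if pairs[s] > 0}


def section_gap(lines, sections):
    """Assumed statistic: max minus min doubling rate across sections."""
    rates = doubling_rate_by_section(lines, sections)
    return max(rates.values()) - min(rates.values())


def section_gap_permutation_test(lines, sections, n_perm=1000, seed=0):
    """Shuffle section labels across lines and compare the gap."""
    rng = random.Random(seed)
    observed = section_gap(lines, sections)
    labels = list(sections)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if section_gap(lines, labels) >= observed:
            exceed += 1
    # Fraction of label shuffles with a gap at least as large as observed.
    return observed, (1 + exceed) / (n_perm + 1)
```

`sections` is a list the same length as `lines`, giving the section label for each line. The sketch assumes at least one line with two or more words.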
2. Word-class ordering asymmetry (new to our analysis). Classifying words by their 2-character suffix (following Stolfi's grammar framework) and measuring directional ordering preferences between classes, both Currier languages show significantly asymmetric ordering: z = 4.72 for Language A, z = 4.77 for Language B, both p < 0.0001, robust to a frequency-matching control. Every other structural test returned null, so this is the first positive, grammar-suggestive result. I want to be careful: positional structure doesn't prove meaningful language. But it is something simple stateless generators don't produce by default, and the effect is equally strong in both Currier languages on the full validated corpus.
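Here is a simplified sketch of what an ordering-asymmetry measurement can look like, in case the description above is too abstract. The statistic, the within-line shuffle null, and all names are assumptions made for this post; the frequency-matching control is not reproduced here. For every pair of suffix classes it compares how often class A precedes class B against the reverse.

```python
import random
import statistics
from collections import Counter


def suffix_class(word, k=2):
    """Class label = last k characters of the word (assumed scheme)."""
    return word[-k:]


def asymmetry(lines, k=2):
    """Share of adjacent cross-class pairs not balanced by the reverse order."""
    counts = Counter()
    for words in lines:
        classes = [suffix_class(w, k) for w in words]
        counts.update(zip(classes, classes[1:]))
    # Unordered pairs of distinct classes seen in either direction.
    pairs = {tuple(sorted(p)) for p in counts if p[0] != p[1]}
    num = sum(abs(counts[(a, b)] - counts[(b, a)]) for a, b in pairs)
    den = sum(counts[(a, b)] + counts[(b, a)] for a, b in pairs)
    return num / den if den else 0.0


def asymmetry_permutation_test(lines, n_perm=1000, seed=0, k=2):
    """Null: shuffle word order within each line, keeping line contents."""
    rng = random.Random(seed)
    observed = asymmetry(lines, k)
    null = []
    for _ in range(n_perm):
        shuffled = [rng.sample(words, len(words)) for words in lines]
        null.append(asymmetry(shuffled, k))
    mean = statistics.fmean(null)
    sd = statistics.pstdev(null)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    p = (1 + sum(v >= observed for v in null)) / (n_perm + 1)
    return observed, z, p
```

The raw statistic is biased upward on small samples (random counts are rarely perfectly balanced), which is the reason to read it only relative to the shuffled null rather than on its own.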
On the corpus: I used the complete FSG transliteration (Reeds 1994, from voynich.nu), validated word-for-word against the source file. In the process we found that several Stars-section folios in our working corpus had been severely truncated (some folios had only 20-25% of their true content). After rebuilding from the full authoritative file the corpus grew from ~14,000 to ~33,600 words (~88% of the manuscript). All findings were re-validated on the full corpus; the word-class asymmetry finding in particular changed meaningfully: what had looked like a gap between Language A and Language B disappeared entirely, with both now showing indistinguishable z-scores.
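The truncation problem is the kind of thing a simple per-folio coverage check can catch. A minimal sketch, assuming both the working corpus and the reference transliteration have already been parsed into dicts mapping folio name to a list of words (the parsing step, the function name, and the 0.9 threshold are all illustrative, not taken from the repo):

```python
def coverage_report(working, reference, threshold=0.9):
    """Return {folio: coverage} for folios whose working copy looks short.

    coverage = words in working corpus / words in reference file.
    """
    flagged = {}
    for folio, ref_words in reference.items():
        if not ref_words:
            continue  # nothing to compare against
        got = working.get(folio, [])
        coverage = len(got) / len(ref_words)
        if coverage < threshold:
            flagged[folio] = round(coverage, 3)
    return flagged
```

A folio that is missing entirely from the working corpus shows up with coverage 0.0. This only compares counts, so it complements a word-for-word comparison rather than replacing it.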
Happy to discuss methodology, findings, or criticisms. Hopefully it's clear what does and doesn't hold up (the Sanskrit result needs verification; an earlier attempt to add corpus data produced fabricated text that was caught and removed before any analysis used it, and is documented in the repo). The GitHub repo has full documentation of every finding including the ones that failed or changed.
Thanks everyone, and looking forward to discussing.