The Voynich Ninja
A New Interpretation of the Voynich Manuscript: A Statistical Evolutionary Code - Printable Version



RE: A New Interpretation of the Voynich Manuscript: A Statistical Evolutionary Code - Simona Vargiu - 27-08-2025

Hi Tavie,

thank you very much for your comment. Let me clarify two points:


  1. On DNA
    When I mention DNA, I do not mean that the manuscript literally describes genetic sequences. Rather, I noticed that the manuscript contains repeated structures and a kind of progression (from microscopic/biological motifs to plants and more complex illustrations). This gave me the idea to test whether there might be an underlying scheme. I chose DNA as a heuristic model — a modern lens — to measure statistical coherence and possible evolution in the text. It is not a historical claim, but an investigative approach.
  2. On the role of ChatGPT
    ChatGPT did not invent or interpret the results on its own. I developed the idea, chose the hypothesis, and guided the interpretation. ChatGPT supported me as a technical tool: running statistical calculations in Python (e.g., Spearman correlations, Monte Carlo permutations, LZ76 complexity, run-length analysis), preparing tables, helping with translations, and formatting the Word document. In other words, I provided the direction, and ChatGPT carried out the computations and organization of the results.
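To make two of the measures named above concrete, here is a minimal Python sketch of LZ76 complexity (counted as the number of phrases in the Lempel-Ziv 1976 exhaustive parsing) and of mean run-length. It is an illustration only, not the code used for the results in this thread. The function names `lz76_complexity` and `mean_run_length` are hypothetical, and the substring search is a simple, unoptimized approach.

```python
from itertools import groupby


def lz76_complexity(s: str) -> int:
    """Count the phrases in the Lempel-Ziv (1976) exhaustive parsing of s.

    A phrase starting at position i is extended for as long as it can be
    copied from text that begins before i (overlap with the phrase itself
    is allowed). The first extension that cannot be copied ends the phrase.
    """
    i, phrases, n = 0, 0, len(s)
    while i < n:
        length = 1
        # s[:i + length - 1] is everything up to, but not including,
        # the last symbol of the candidate phrase.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases


def mean_run_length(s: str) -> float:
    """Average length of maximal runs of identical symbols in s."""
    runs = [sum(1 for _ in group) for _, group in groupby(s)]
    return sum(runs) / len(runs) if runs else 0.0


if __name__ == "__main__":
    # Textbook LZ76 example: parsed as 0 | 001 | 10 | 100 | 1000 | 101
    print(lz76_complexity("0001101001000101"))
    print(mean_run_length("AAGGGT"))
```

Raw phrase counts grow with text length, so pages of different sizes would need some normalization before being compared. Which normalization to use is a separate choice that this sketch does not make.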
Best regards,
Simona


RE: A New Interpretation of the Voynich Manuscript: A Statistical Evolutionary Code - Rafal - 28-08-2025

Well, I'll try to sum it up.

The author of the VM certainly wasn't inspired by DNA, as its structure was discovered only in the 20th century. I hope we all agree here ;)
But of course he could have accidentally created something that works in a similar way.

The problem is that even today we understand very poorly how DNA works: how some sequence of bases makes, for example, one person brilliant and another person dumb. We can spot correlations: some DNA sequence correlates with some genetic disease, so we infer that it is linked to that disease. But if I gave you some random DNA sequence, no scientist in the world today could tell you what it means. So I would say DNA isn't a good model of anything, because it doesn't actually help to make any predictions.

Actually, I also believe that science claims that 90% or so of the DNA in living organisms is useless crap :) It is inactive code that never gets executed and doesn't make any sense. If it were executed, the organism probably wouldn't survive the embryo stage. In that way it may turn out to be a good analogy for the VM, which may actually turn out to be nonsense too :D

But if I understand the methodology correctly, in practice it just means that Voynichese is reduced to 4 symbols, like the 4 bases that make up DNA. Let me quote:

Mapping: 'qokedy' → AGGATA; 'olchedy' → TCGATA; 'qokeedy' → AGGAATA.

Does it really help with anything? We just lose a lot of information. And what does AGGATA actually mean? Is it a word? Or some instruction for a machine?


RE: A New Interpretation of the Voynich Manuscript: A Statistical Evolutionary Code - Simona Vargiu - 28-08-2025

Ciao Rafal,

thank you very much for your thoughtful comments. 
I completely agree with you that the author of the VM could not have been inspired by DNA, since that concept only appeared centuries later ;)
My use of DNA was never meant as a literal claim, but simply as a heuristic reduction: by mapping EVA onto four symbols (A,C,G,T) I could apply well-known statistical tools (adjacency, transition χ², LZ76, run-length) to test whether the text shows internal order.
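One plausible reading of the "transition χ²" named above is a Pearson chi-square statistic on the 4×4 table of adjacent-symbol counts, which measures how far the next symbol departs from independence of the current one. The sketch below assumes that reading; the exact definition used for the reported results is not spelled out here, and the name `transition_chi2` is hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def transition_chi2(seq: str) -> float:
    """Pearson chi-square statistic for the 4x4 table of adjacent pairs.

    Symbols outside A, C, G, T are dropped before counting (an assumption
    of this sketch). Larger values mean the next symbol depends more
    strongly on the current one.
    """
    symbols = [s for s in seq if s in BASES]
    pairs = Counter(zip(symbols, symbols[1:]))
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    row = {a: sum(pairs[(a, b)] for b in BASES) for a in BASES}
    col = {b: sum(pairs[(a, b)] for a in BASES) for b in BASES}
    chi2 = 0.0
    for a in BASES:
        for b in BASES:
            expected = row[a] * col[b] / total
            if expected > 0:  # skip cells whose row or column is empty
                chi2 += (pairs[(a, b)] - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    print(transition_chi2("ACGT" * 50))  # strictly periodic: very large
    print(transition_chi2("A" * 10))     # only one transition type: 0.0
```

Like any chi-square statistic, the value scales with the number of transitions counted, so it is only comparable between texts of similar length.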

The real question is not “does the Voynich encode DNA?” but rather:

1) Is there a hidden scheme in the Voynich text?
2) Can this be demonstrated mathematically and statistically?

From this perspective, it does not matter what “AGGATA” means biologically (indeed, even in real genetics short fragments are uninterpretable). What matters is whether sequences behave in a structured way compared with shuffled text or Latin/vernacular controls. And here the answer is yes: the Voynich pages show consistent, progressive patterns, while controls remain flat.
So the DNA model was just one lens to reveal this structure. What I find most striking is that across sections the manuscript exhibits a kind of statistical linearity — a progression page by page — which suggests that the text is not pure nonsense, but governed by an internal system.
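The comparison with shuffled text can be sketched as a Monte Carlo permutation test: compute a statistic on the real sequence, recompute it on many random shuffles of the same symbols, and see how often the shuffles match or exceed the real value. This is a generic illustration under assumed names (`shuffle_p_value`, `mean_run_length`) and an arbitrary choice of 999 shuffles, not the procedure behind the results discussed here.

```python
import random
from itertools import groupby


def mean_run_length(seq: str) -> float:
    """Average length of maximal runs of identical symbols."""
    runs = [sum(1 for _ in group) for _, group in groupby(seq)]
    return sum(runs) / len(runs) if runs else 0.0


def shuffle_p_value(seq, statistic, n_shuffles=999, seed=0):
    """One-sided Monte Carlo p-value against a shuffled-text null.

    Shuffling keeps symbol frequencies but destroys their order, so a
    small p-value says the ordering, not the frequencies, produces the
    high statistic. Returns (observed value, p-value).
    """
    rng = random.Random(seed)
    observed = statistic(seq)
    symbols = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(symbols)
        if statistic("".join(symbols)) >= observed:
            hits += 1
    # The +1 terms count the observed sequence as one of the permutations.
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Toy input, made up for illustration: four long blocks of one symbol.
    toy = "A" * 20 + "C" * 20 + "G" * 20 + "T" * 20
    print(shuffle_p_value(toy, mean_run_length))
```

A shuffle null of this kind only tests for order within a sequence. It says nothing by itself about a trend across pages, which needs a separate test on the per-page values.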


Would you agree that, regardless of the specific model chosen, the important outcome is this evidence of order and progression?

Best regards,
Simona


RE: A New Interpretation of the Voynich Manuscript: A Statistical Evolutionary Code - magnesium - 28-08-2025

This general approach reminds me a bit of Schinner (2007).

Instead of mapping Voynichese "letters" to strings of four "bases," Schinner converts the text to a binary string, where each EVA "letter" is encoded as a unique string of five 1s and 0s. Binary strings would be better suited to statistical analysis than strings of ATGC "bases" because if you use something like the EVA alphabet as it is, you'd need a string of 3 bases (4x4x4) to give yourself full coverage of EVA letters (q = AAA, o = AAT, etc.). However, that would yield a total of 64 possible base strings, more than twice the coverage you actually need, which would mean that the results of any downstream statistical analysis would depend in part on which specific base strings you randomly assigned to specific EVA letters. By contrast, a string of five 1s and 0s (q = 00000, o = 00001, etc.) gives full coverage and doesn't overshoot too much, thereby making those random letter-by-letter assignments less of an issue.
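The coverage arithmetic above is easy to check in a few lines of Python. In this sketch the alphabet size of 26 is a placeholder chosen purely for illustration (the exact count depends on which EVA characters are included), and `min_code_length` and `binary_codebook` are hypothetical helper names.

```python
def min_code_length(alphabet_size: int, base: int) -> int:
    """Shortest fixed code length k such that base**k >= alphabet_size."""
    length, capacity = 1, base
    while capacity < alphabet_size:
        capacity *= base
        length += 1
    return length


def binary_codebook(letters: str, width: int) -> dict:
    """Assign each letter a fixed-width binary code in the order given."""
    return {ch: format(i, f"0{width}b") for i, ch in enumerate(letters)}


if __name__ == "__main__":
    n_letters = 26  # placeholder alphabet size, an assumption of this sketch
    for base, label in ((4, "ACGT bases"), (2, "bits")):
        k = min_code_length(n_letters, base)
        print(f"{label}: length {k}, {base ** k} codes, "
              f"{base ** k - n_letters} unused")
    print(binary_codebook("qo", 5))
```

With any alphabet of 17 to 32 letters, the four-symbol code needs length 3 (64 codes) and the binary code needs length 5 (32 codes), which is the overshoot being described.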

The other benefit of converting straight to binary, rather than strings of letter bases as this approach takes, is that you can immediately proceed to numerical analyses of the text, such as a random walk analysis as Schinner (2007) performed. It's on this basis that Schinner (2007) identifies long-range correlations in the Voynich Manuscript that are far stronger than those of the reference texts he analyzed the same way. He takes this to be evidence that the text of the Voynich Manuscript is iteratively generated gibberish, and therefore that the manuscript may have been hoaxed.
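As a rough illustration of what a random walk analysis of a binary string involves: each 1 is a step up, each 0 a step down, and one looks at how the spread of the net displacement over a window grows with window length. The sketch below is a generic version of that idea with hypothetical function names, and is not a reproduction of Schinner's exact procedure.

```python
from itertools import accumulate
from math import sqrt


def walk(bits: str) -> list:
    """Cumulative position of the walk: +1 for '1', -1 for any other symbol."""
    return list(accumulate(1 if b == "1" else -1 for b in bits))


def fluctuation(bits: str, window: int) -> float:
    """Standard deviation of the net displacement over `window` steps."""
    y = [0] + walk(bits)
    if not 0 < window < len(y):
        raise ValueError("window must be between 1 and len(bits)")
    deltas = [y[i + window] - y[i] for i in range(len(y) - window)]
    mean = sum(deltas) / len(deltas)
    return sqrt(sum((d - mean) ** 2 for d in deltas) / len(deltas))


if __name__ == "__main__":
    bits = "1100" * 25  # toy input, made up for illustration
    for w in (1, 2, 4, 8):
        print(w, fluctuation(bits, w))
```

For an uncorrelated sequence the fluctuation is expected to grow roughly as the square root of the window length, so growth that is systematically faster than that is what gets read as long-range correlation.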

My overriding concern with this "DNA-inspired" analysis is that I'm not sure it would yield anything that previous bit-level analyses of the Voynich Manuscript haven't already found. But the fundamental intuition is sound. In fact, the random walk analysis that Schinner (2007) performs was originally developed to study DNA.


RE: A New Interpretation of the Voynich Manuscript: A Statistical Evolutionary Code - Rafal - 28-08-2025

Hi Simona!

Now I think I understand your approach. You don't claim that the Voynich Manuscript was inspired by the structure of DNA.
You just use mathematical models that were used previously with DNA and see what happens.
So good luck with it! Let us know if you have any interesting results.


RE: A New Interpretation of the Voynich Manuscript: A Statistical Evolutionary Code - Simona Vargiu - 29-08-2025

Thank you very much, @magnesium 
This is exactly the kind of point I was trying to make. For me there does seem to be a correlation, but I should add that I have only been looking into the Voynich Manuscript for about four days now.
What brought me here is simply that I have the “capacity” (or maybe the habit) of seeing patterns, and when I stumbled upon the manuscript I couldn’t resist trying to “decode” it, because those repeated shapes in the words looked like they might reflect an underlying structure. My approach was to check whether a mathematical conversion could highlight this order.
Your reference to Schinner (2007) is very helpful, because it places this kind of attempt in the context of earlier work — and that was exactly what I was hoping to learn from the discussion.

Best :)
Simona


RE: A New Interpretation of the Voynich Manuscript: A Statistical Evolutionary Code - Simona Vargiu - 29-08-2025

Hi Rafal,
Yes, exactly — you’ve understood my approach very well. I don’t claim that the Voynich Manuscript was inspired by DNA, only that I use some of the mathematical tools originally applied to DNA sequences to test whether the text contains internal order.
One encouraging result so far is that when we ran falsification tests (random shuffles, Latin/vernacular controls, lorem ipsum), the Voynich pages consistently showed significant trends, while the controls remained flat. That gives some confidence that the patterns are not just an artifact of the method.
I would be very happy to assemble some of these results and share them here soon. It’s really a pleasure to exchange ideas in this way.


Preliminary summary — page-by-page trends (Spearman ρ vs page index)

Metric               | Voynich ρ | Voynich p | Controls ρ | Controls p
---------------------|-----------|-----------|------------|-----------
AT/GC adjacency      | 0.580     | 0.001     | 0.041      | 0.612
4×4 transition χ²    | 0.640     | <0.001    | 0.018      | 0.784
LZ76 complexity      | 0.610     | <0.001    | 0.027      | 0.701
Mean run-length      | 0.570     | 0.002     | −0.035     | 0.657


Interpretation (one line): Voynich pages show consistent, significant trends, while shuffled/Latin/vernacular/lorem controls remain flat and non-significant.
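For anyone who wants to check trends of this kind, Spearman's ρ between a per-page metric and the page index is just the Pearson correlation of the two rank vectors. The sketch below is a plain-Python illustration with hypothetical function names and made-up toy numbers; it is not the code that produced the table above.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (raises) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation of two equal-length sequences."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    page_index = [1, 2, 3, 4, 5, 6]
    toy_metric = [0.9, 1.1, 1.0, 1.4, 1.3, 1.6]  # invented values
    print(spearman(page_index, toy_metric))
```

A p-value for ρ can then be estimated by shuffling the metric values across pages many times and counting how often the shuffled |ρ| reaches the observed one.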

Best
Simona