The Voynich Ninja - How to decipher the MS?

Pages: 1 2

(15-08-2025, 10:13 PM)oshfdk Wrote: If you already know the mechanics of the cipher (for example, the Naibbe cipher) and have some hypothesis about the plaintext language, then you can perform some form of statistical analysis (for example, simulated annealing) using the known bigram and trigram statistics of the plaintext language. This is not particularly complicated and setting it up won't take much time, the latest versions of ChatGPT/Claude/DeepSeek/Gemini should be able to create the code and you can run it with some test encoding scheme to ensure it works.

I think this approach is sensible. My impression is that, at the moment, we are not in a position to apply this to the Voynich manuscript, since we have no realistic candidate for the mechanics of the cipher. To have any hope at all, the candidate should explain all the properties of Voynichese.

A second (much lesser) problem is that decipherment could be computationally hard. If the search space is large (and it is, for something like a 15th-century diplomatic cipher), simulated annealing can either take ages or get trapped in local minima. LLMs can help build code quickly, but efficient processing requires heuristics and ad hoc methods that are more difficult to define and implement. But while humans search for candidate ciphers, AI coding capabilities will probably keep improving.
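
For readers who want to try the approach quoted above on a test cipher, here is a minimal sketch of simulated annealing against bigram statistics. It assumes the simplest possible mechanics (a one-to-one letter substitution over a 26-letter alphabet), which is far simpler than anything proposed for the Voynich MS. The names `bigram_model`, `decode` and `anneal`, the temperature schedule and the step count are all illustrative choices, not anything specified in this thread.

```python
import math
import random
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def bigram_model(corpus, floor=1e-6):
    """Build a scorer: sum of log-probabilities of letter bigrams."""
    letters = [c for c in corpus.lower() if c in ALPHABET]
    counts = Counter(zip(letters, letters[1:]))
    total = sum(counts.values())
    logp = {bg: math.log(n / total) for bg, n in counts.items()}
    log_floor = math.log(floor)  # penalty for bigrams never seen in the corpus

    def score(text):
        chars = [c for c in text if c in ALPHABET]
        return sum(logp.get(bg, log_floor) for bg in zip(chars, chars[1:]))
    return score

def decode(ciphertext, key):
    """Apply a substitution key: cipher letter ALPHABET[i] -> key[i]."""
    return ciphertext.translate(str.maketrans(ALPHABET, key))

def anneal(ciphertext, score, steps=20000, t_start=10.0, t_end=0.1, seed=0):
    """Search for a high-scoring key by swapping two key letters per step."""
    rng = random.Random(seed)
    key = list(ALPHABET)
    rng.shuffle(key)
    current = score(decode(ciphertext, "".join(key)))
    best_key, best = key[:], current
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(key)), 2)
        key[i], key[j] = key[j], key[i]
        candidate = score(decode(ciphertext, "".join(key)))
        # Always accept improvements; accept worse keys with a probability
        # that shrinks as the temperature drops.
        if candidate >= current or rng.random() < math.exp((candidate - current) / temp):
            current = candidate
            if current > best:
                best_key, best = key[:], current
        else:
            key[i], key[j] = key[j], key[i]  # undo the swap
    return "".join(best_key), best
```

Running `anneal` several times with different `seed` values and keeping the best result is a cheap way to reduce the local-minimum problem mentioned above, though it does nothing about the size of the search space for more complex schemes.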

(17-08-2025, 02:58 PM)MarcoP Wrote: I think this approach is sensible. My impression is that, at the moment, we are not in a position to apply this to the Voynich manuscript, since we have no realistic candidate for the mechanics of the cipher. To have any hope at all, the candidate should explain all the properties of Voynichese.

I think "explain all the properties" is a bit restrictive. It should be compatible with these properties, but we have no idea which of the properties emerge from the mechanics of the cipher, which ones appear because of the features of the plaintext and which ones are just stylistic preferences of the scribe.

Also, as I found out in [link], it could be possible to decipher a message without fully understanding the mechanics of the cipher. Just uncovering some random 10% of the plaintext using the most frequent combinations can be enough to prove that the text is meaningful and open the way to reverse-engineering the rest of the cipher.

For example, imagine that the Naibbe cipher was used to encode the Voynich MS, but we didn't know that. So, we just isolated the most common substrings and ran the decoding matching just these elements to Latin bigrams/trigrams, then with some probability we'll get most of the first table of the Naibbe scheme, that corresponds to 5/13 of the plaintext. If we uncover 5/13 of the plaintext, it would be immediately obvious that there are a lot of coherent grammatically correct snippets, too many for this to be coincidental. With this we can get enough context to deduce the rest of the scheme. Only after that we will be able to identify the way the statistics of the Voynichese emerge from the Naibbe cipher.
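
To make the "uncover a fraction of the plaintext" idea concrete, here is a toy sketch that measures how much of a text the most frequent ciphertext tokens cover, and what share gets decoded by a partial mapping. The tokens `t1`...`t7` and the mapping are hypothetical placeholders, chosen only so that the most frequent token happens to cover 5/13 of the toy text; they are not Naibbe or Voynich data, and `top_token_coverage` and `partial_decode` are names invented for this example.

```python
from collections import Counter

def top_token_coverage(tokens, n_top):
    """Return the n_top most frequent tokens and the share of the text they cover."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    top = Counter(tokens).most_common(n_top)
    covered = sum(count for _, count in top)
    return [tok for tok, _ in top], covered / len(tokens)

def partial_decode(tokens, mapping, unknown="?"):
    """Decode only the tokens present in `mapping`; mark the rest as unknown."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    decoded = [mapping.get(tok, unknown) for tok in tokens]
    known = sum(1 for tok in tokens if tok in mapping)
    return decoded, known / len(tokens)

# Hypothetical ciphertext tokens (13 in total) and a hypothetical partial mapping.
tokens = "t1 t2 t1 t3 t1 t2 t4 t5 t1 t2 t6 t7 t1".split()
top, share = top_token_coverage(tokens, 1)
decoded, fraction = partial_decode(tokens, {"t1": "e"})
```

In a real attempt the mapping would come from a search against Latin bigram/trigram statistics rather than being typed in by hand, and the partially decoded output would still have to be judged for coherence.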

(17-08-2025, 03:27 PM)oshfdk Wrote: imagine that the Naibbe cipher was used to encode the Voynich MS, but we didn't know that. So, we just isolated the most common substrings and ran the decoding matching just these elements to Latin bigrams/trigrams, then with some probability we'll get most of the first table of the Naibbe scheme, that corresponds to 5/13 of the plaintext. If we uncover 5/13 of the plaintext, it would be immediately obvious that there are a lot of coherent grammatically correct snippets, too many for this to be coincidental. With this we can get enough context to deduce the rest of the scheme. Only after that we will be able to identify the way the statistics of the Voynichese emerge from the Naibbe cipher.

Imagine we have correctly guessed how the cipher system works, then we can decode the manuscript. Yes, I agree.

(17-08-2025, 04:45 PM)MarcoP Wrote: Imagine we have correctly guessed how the cipher system works, then we can decode the manuscript. Yes, I agree.

I think my example wasn't clear. What I meant is that it could be possible to decode some ciphers without understanding how they work. Using some kind of pattern matching.

(17-08-2025, 04:56 PM)oshfdk Wrote:
(17-08-2025, 04:45 PM)MarcoP Wrote: Imagine we have correctly guessed how the cipher system works, then we can decode the manuscript. Yes, I agree.

I think my example wasn't clear. What I meant is that it could be possible to decode some ciphers without understanding how they work. Using some kind of pattern matching.

I think I get your point. I agree that the guess might not have to be perfect, but I think it should be quite close to give solid results, and a good deal of understanding is necessary in my opinion.
Possibly the decipherment of Zodiac 340 could serve as an example of what you describe: if I remember correctly, they made a first guess that led to extracting some fragments, and then they managed to write code to explore a wide range of more complex transpositions.
Voynichese has a bunch of patterns that typically are not language-like, so they could all result from the cipher / writing-system / author-preference rather than from the underlying text. As we know, a direct matching of those patterns to a natural language does not work. For instance, thanks to the low entropy, I think it wouldn’t be difficult to map 5/13 of the text to statistically sound English, e.g. with a verbose substitution; this is not very different from what “solvers” do (but they are happy to ignore that the remaining 8/13 of the text is off). As with the solvers, the result would not be a coincidence, but a consequence of how the mappings are chosen to match English. The real work starts when one tackles the remaining 8/13. For the Zodiac, things were different, since the cipher did not show obvious patterns.
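
The "low entropy" mentioned here is usually measured as conditional character entropy: how many bits of uncertainty remain about the next character once the current one is known. A minimal sketch of that estimate follows, assuming the text is given as a plain string of transliterated characters; `entropy` and `conditional_entropy` are names made up for this example, and the sample strings are toy inputs, not Voynich data.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def conditional_entropy(text):
    """Estimate H(next char | current char) in bits from bigram counts."""
    if len(text) < 2:
        raise ValueError("need at least two characters")
    bigrams = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])  # every character that starts a bigram
    # Chain rule: H(next | current) = H(current, next) - H(current)
    return entropy(bigrams) - entropy(firsts)

predictable = conditional_entropy("ab" * 10)    # next char fully determined
less_regular = conditional_entropy("aabb" * 5)  # roughly one bit of choice
```

The lower this value, the more predictable each character is from the previous one, and the easier it becomes to find some mapping that makes part of the text look statistically plausible in a target language, which is the risk described above.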

As far as I know, the Zodiac 340 cipher was known to be written in English. With the Voynich, we have no clue which language it was originally written in, if any. So the chances of deciphering it, if it is a cipher, are very low...

(17-08-2025, 08:12 PM)quimqu Wrote: As far as I know, the Zodiac 340 cipher was known to be written in English. With the Voynich, we have no clue which language it was originally written in, if any. So the chances of deciphering it, if it is a cipher, are very low...

I agree that the chances of decipherment are low, but I don't think the issue is the unknown language. If it's a cipher, the candidate languages are obvious and few. One can follow the approach that oshfdk described in post #2. The computational cost "only" grows linearly with the number of candidate languages. The effort of guessing the correct cipher system doesn't change significantly, in my opinion.

(17-08-2025, 03:27 PM)oshfdk Wrote: For example, imagine that the Naibbe cipher was used to encode the Voynich MS, but we didn't know that. So, we just isolated the most common substrings and ran the decoding matching just these elements to Latin bigrams/trigrams, then with some probability we'll get most of the first table of the Naibbe scheme, that corresponds to 5/13 of the plaintext. If we uncover 5/13 of the plaintext, it would be immediately obvious that there are a lot of coherent grammatically correct snippets, too many for this to be coincidental. With this we can get enough context to deduce the rest of the scheme. Only after that we will be able to identify the way the statistics of the Voynichese emerge from the Naibbe cipher.

One of the reasons why I created the Naibbe cipher is to provide a sandbox in which to test these kinds of ideas. The Naibbe cipher is a fully functional cipher that by design does a good job of replicating many, but not all, word-level VMS properties. What happens when we analyze Naibbe ciphertexts as if we don't know how the cipher worked, and then we peek under the hood to see why we obtained the results we obtained? I'd encourage you to attempt this very analysis on the 20 reference Naibbe ciphertexts I have provided.

(18-08-2025, 12:37 PM)magnesium Wrote: One of the reasons why I created the Naibbe cipher is to provide a sandbox in which to test these kinds of ideas. The Naibbe cipher is a fully functional cipher that by design does a good job of replicating many, but not all, word-level VMS properties. What happens when we analyze Naibbe ciphertexts as if we don't know how the cipher worked, and then we peek under the hood to see why we obtained the results we obtained? I'd encourage you to attempt this very analysis on the 20 reference Naibbe ciphertexts I have provided.

I'm not sure this experiment would teach me anything. The idea is probabilistic in nature: it's not guaranteed that any particular cipher can be decoded without fully identifying its internal workings, and it's obvious that some ciphers can be decoded without fully identifying their mechanics. The fact that some particular cipher can or cannot be cracked this way won't mean anything for the scheme used in the Voynich Manuscript, as far as I understand.
