The Voynich Ninja

Full Version: The Naibbe cipher
Pages: 1 2 3 4 5 6 7 8
(04-08-2025, 01:55 PM)oshfdk Wrote:
(04-08-2025, 01:31 PM)pfeaster Wrote: Not as it stands, but maybe a variant on it could.  The playing-card mechanism is designed to impose a frequency ratio among choices from the different tables, and it's good at accomplishing that -- but, applied strictly as described, it would also create flat ciphertexts with none of the regional variation we know and love from our holidays on Tavie's island.

There could be arbitrary rules such as (1) when encoding the first line of a paragraph, draw from this table; (2) when encoding the first vord of a line, draw from that table; (3) when encoding the last vord of a line, draw from that other table.  But that doesn't strike me as a very satisfying solution, since it doesn't offer any real explanation for such a practice.

I think there is a problem with adding more and more rules. But first I have to say that what follows is in no way an attempt to devalue the work on the Naibbe cipher; it is just my perspective.

The general methodological problem I sense in the whole approach: if one sets out to replicate particular features, one will likely end up replicating exactly those features, nothing less, nothing more. To give a simple analogy: if I get an F1 car and set myself on a mission to replicate its appearance as closely as possible, using modeling clay and scrap metal, then if I'm careful and accurate, I will end up with a very good replica, quite suitable for photo shoots, but I won't expect to learn a lot about what makes the F1 car a racetrack marvel.

It would be, for me personally, a much more interesting finding if the features of Voynichese emerged from some internal logic and the simple constraints of an efficient encoding system. magnesium's cipher is a very good approximation, and excellent work at that, but it comes at the cost of quite high verbosity and a still rather complicated encoding/decoding process. Totally achievable with the tools available in the XV century, yes, but what would be the motivation to use this scheme?

So, yes, it's possible to add LAAFU rules, nulls, etc., and in the end it is quite possible to achieve a perfect replica of Voynichese. But as long as this is done by arbitrarily adding rules, I'm not sure one will learn much about the actual Voynichese.

No offense taken whatsoever. I'd encourage folks to interpret the Naibbe cipher as a reference. The model very explicitly averages across all of Voynich B, and there is considerable section-by-section variation within that portion of the text. I tried to get the cipher to a point, though, where it replicated enough that its modes of failure would be interesting. In addition, there's only so much I can do—or know how to do. So I decided to develop the cipher to the point where it did a lot of things—but not everything—very reliably and then pivoted to rigorously characterizing the properties of the resulting ciphertext. To riff on your F1 analogy: You can actually learn a lot about the aerodynamics of a car based on a clay model.

I'm beginning a follow-up analysis where I parse Voynich B as if it were literally a Naibbe ciphertext (it's not!), which yields the exact sequence of plaintext n-grams and table assignments that would have to be made to replicate the ~83% of Voynich B tokens the cipher can create. How do the properties of the implied re-spacing sequence and the table sequence vary line by line, page by page, or section by section? It's still very early days, but there are meaningful section-by-section differences in the percentage of inferred unigrams, and Scribe 2 seems to prefer writing strings of multiple inferred unigrams in a row. I'd love help with this sort of thing and would welcome people taking the cipher and running with it.
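The parse-as-if-Naibbe idea can be sketched with a toy inverse lookup. The tables and tokens below are invented placeholders for illustration only; they are not the actual Naibbe tables:

```python
# Toy sketch of parsing text as if it were a Naibbe-style ciphertext:
# invert the (table, plaintext n-gram) -> token mapping, then look up each token.
# These tables are invented for illustration; they are NOT the real Naibbe tables.
TOY_TABLES = {
    ("T1", "a"): "ok",
    ("T2", "a"): "ot",
    ("T1", "de"): "chedy",
    ("T2", "de"): "shedy",
}

# Build the inverse: ciphertext token -> all (table, n-gram) explanations.
INVERSE = {}
for (table, ngram), token in TOY_TABLES.items():
    INVERSE.setdefault(token, []).append((table, ngram))

def parse(tokens):
    """For each token, list every (table, n-gram) assignment that could produce it.
    An empty list marks a token this toy cipher cannot create."""
    return [INVERSE.get(tok, []) for tok in tokens]

print(parse(["chedy", "ot", "qokeey"]))
# -> [[('T1', 'de')], [('T2', 'a')], []]
```

The per-token assignment lists are exactly the raw material for the line-by-line and section-by-section comparisons described above, with unexplainable tokens (the empty lists) falling outside the cipher's coverage.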
(04-08-2025, 02:02 PM)Yavernoxia Wrote: IMHO, the process isn't that much more complicated compared to other ciphers available in the first half of the 1400s, but I agree that adding nulls and specific rules just to prove a point (e.g. the top line 'gallows' appears because of this specific type of encoding) would not be useful.

Maybe I'm misunderstanding something, but to me the process looks much more complicated than, say, the diplomatic ciphers from Tranchedino's list. There we have one table of mappings that occupies half a page, and then a small dictionary of special mappings for common or sensitive terms. The process of encoding is extremely simple: just replace each character with a ciphertext glyph of your choice and add a null now and then; special glyphs can be used for common or potentially incriminating words (the King, the Pope, the Church, etc.).
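A Tranchedino-style scheme of this kind can be sketched in a few lines. The homophone sets, the nomenclator entry, and the null rate below are all invented for illustration, not taken from any historical table:

```python
import random

# Toy sketch of a diplomatic cipher in the Tranchedino style (invented mappings):
# each plaintext letter maps to a small set of homophones, a nomenclator gives
# single glyphs for sensitive words, and nulls are sprinkled in at random.
HOMOPHONES = {"a": ["7", "%"], "b": ["3"], "e": ["9", "#", "&"], "o": ["2", "@"], "p": ["5"]}
NOMENCLATOR = {"pope": "\u2020"}   # one special glyph for a whole sensitive word
NULLS = ["*", "+"]

def encrypt(word, rng):
    if word in NOMENCLATOR:                      # sensitive term -> single glyph
        return NOMENCLATOR[word]
    out = []
    for ch in word:
        out.append(rng.choice(HOMOPHONES[ch]))   # any homophone of your choice
        if rng.random() < 0.2:                   # add a null now and then
            out.append(rng.choice(NULLS))
    return "".join(out)

rng = random.Random(0)
print(encrypt("pope", rng))   # the nomenclator glyph
print(encrypt("babe", rng))   # homophones plus occasional nulls
```

The whole apparatus fits on half a page, which is the contrast being drawn with the multistage process described next.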

For Naibbe there is a multistage process with a random source and six tables of three mappings for each character, achieving essentially the same result: one-to-many substitution. Why add complexity with no obvious advantage? Encoding 240 pages is not a simple task with any cipher.

(04-08-2025, 02:02 PM)Yavernoxia Wrote: What I find most interesting about this whole paper is the division of the plaintext into bigrams and unigrams, as well as the use of six different tables of glyphs. This makes total statistical sense and explains many of the properties of the distribution of vords.

This is exactly the problem I'm referring to: I'm not sure that looking for things that make statistical sense makes practical sense. To me it's the other way around: there should be things that make practical sense leading to things that make statistical sense.
(04-08-2025, 02:30 PM)magnesium Wrote: To riff on your F1 analogy: You can actually learn a lot about the aerodynamics of a car based on a clay model.

My point is exactly this: one will learn very little about aerodynamics this way. Without actually building many simple working prototypes and adjusting designs to work better, one can only learn the shape. When little kids make F1 models they replicate the shape quite well, and they add the front/rear downforce wings, etc. However, these elements are usually attached in a way that wouldn't work on a real car, because without knowing the function of a wing it's hard to imagine that it actually experiences a lot of downward force (comparable to the weight of the actual car) and that its job is to transfer this force to the chassis, pushing the car into the ground.

Likewise with the cipher, unless we know what specific practical purpose certain process is designed to achieve, simply trying to mimic it may give us little in terms of practical knowledge.
(04-08-2025, 02:57 PM)oshfdk Wrote: My point is exactly this: one will learn very little about aerodynamics this way. [...]

We don't know how the VMS cipher worked or even if the VMS is a ciphertext. What is clear is that, if the VMS is a ciphertext, the cipher probably doesn't work exactly the way that well-characterized 15th-century ciphers are known to have worked. Otherwise, more progress would have been made by now on deciphering the VMS. In the absence of solid practical leads, then, we are left with the statistical properties of the VMS.

On the note of cipher simplicity, I'd love to see a fuller statistical analysis of René Zandbergen's "NOT the solution" cipher, which I learned about from René after I had devised the Naibbe cipher. René explored some similar ideas to the ones I did with verbose substitution, where there are often 2-3 glyphs mapped to a single plaintext letter.
Another approach that I find interesting as a thought experiment is a nomenclator (a large lost book that mapped each Voynich word to a plain text word). A problem with this is the high frequency of uncertain spaces. I think this could be a problem for the Naibbe too?
Good job magnesium.
Thanks for your work and especially the paper. So very nice to chew on something substantial; it was really good, thorough and professional. Quality research!
(04-08-2025, 04:28 PM)MarcoP Wrote: Another approach that I find interesting as a thought experiment is a nomenclator (a large lost book that mapped each Voynich word to a plain text word). A problem with this is the high frequency of uncertain spaces. I think this could be a problem for the Naibbe too?

I thought about pursuing a nomenclator cipher, but I decided not to for a few reasons:

1. Historical plausibility. Robust nomenclator ciphers of the complexity needed to come close to the VMS, such as the Great Cipher (France) and the Duke of Manchester's diplomatic cipher (England), didn't hit their stride until the mid-late 17th century. Even then, these ciphers were not always used to encrypt the entirety of a given document.

2. Frequency-rank distribution of word types. As illustrated in Figure 7 of Bowern and Lindemann (2021), the proportional frequency-rank distribution of word types within Voynich B is weirdly flat relative to many natural languages. If there were truly a 1:1 relationship between VMS word types and plaintext word types, ignoring entirely the internal structure of VMS word types, then you wouldn't expect to see this anomalous behavior.

3. Cumbersomeness. If people have issues with the complexity of the Naibbe cipher, a nomenclator cipher would throw them for a loop. 
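The frequency-rank comparison in point 2 is straightforward to compute: the proportional frequency-rank curve is just the sorted type frequencies divided by the total token count. The sample below is a made-up toy; the real comparison in Bowern and Lindemann (2021) is over Voynich B and reference corpora:

```python
from collections import Counter

def freq_rank(tokens):
    """Proportional frequency of each word type, sorted by rank (rank 1 = most common)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return [c / total for _, c in counts.most_common()]

# Toy sample: a flatter curve means lower-rank types stay close in frequency
# to rank 1, the anomaly reported for Voynich B relative to natural languages.
sample = "daiin ol chedy daiin shedy ol daiin qokeey".split()
print(freq_rank(sample))
# -> [0.375, 0.25, 0.125, 0.125, 0.125]
```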

Uncertain spacing is not necessarily as much of a problem for the Naibbe cipher as you might expect. While it's true that spaces mean something important in Naibbe ciphertexts, word types are also very rigorously defined by the word grammar. So if a series of tokens are all scrunched together, you can use the word grammar as a backup to determine where tokens most likely start and end. There might be some ambiguity introduced in this scenario, however, so multiple readings would need to be tested.
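The grammar-as-backup idea can be sketched with a regex standing in for the word grammar. The pattern below is purely illustrative (a few Voynichese-looking pieces), not the actual Naibbe word grammar, and the greedy strategy shown finds only one reading where several might exist:

```python
import re

# Toy "word grammar" as a regex (illustrative only, NOT the Naibbe grammar):
# an optional prefix, an optional gallows-like letter, some e's, and a suffix.
WORD = re.compile(r"(qo|ch|sh)?(k|t)?e{0,3}(dy|in|y)")

def segment(scrunched):
    """Greedily split a run of text with lost spaces into grammar-valid tokens.
    Returns None if no valid reading exists from some position."""
    tokens, i = [], 0
    while i < len(scrunched):
        m = WORD.match(scrunched, i)
        if not m or m.end() == i:
            return None
        tokens.append(m.group())
        i = m.end()
    return tokens

print(segment("chedyqokeedyshy"))
# -> ['chedy', 'qokeedy', 'shy']
```

A real analysis would enumerate all valid segmentations rather than the first greedy one, which is the ambiguity mentioned above.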
(04-08-2025, 06:33 PM)magnesium Wrote: 3. Cumbersomeness. If people have issues with the complexity of the Naibbe cipher, a nomenclator cipher would throw them for a loop.

It would. Moreover, given the string repeat stats and no obvious context dependence of labels, I'd say a simple nomenclator (with a single index for each plaintext word) is hardly possible. If I have one strong conviction about the Voynich Manuscript as a cipher, it's that there are multiple ways of encoding the same plaintext string. It's either that or gibberish.
(04-08-2025, 06:50 PM)oshfdk Wrote: It would. Moreover, given the string repeat stats and no obvious context dependence of labels, I'd say a simple nomenclator (with a single index for each plaintext word) is hardly possible. [...]

Completely agreed.
Some thoughts on the Naibbe cipher.
The Naibbe cipher with 6 tables has 18 different ways to encrypt every plaintext letter.
If I have understood correctly how the random choice of encoding works, then in the case of simplified spacing it will split the sample for every letter into 18 encrypted parts of equal size.
For every plaintext letter: are there 18 samples of equal size for the unigrams, prefixes and suffixes needed for the encryption?
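The equal-sample-size point is easy to check in simulation: choosing uniformly among 6 tables × 3 variants, the 18 encodings of a letter do occur in near-equal numbers. This is a toy model of the selection step only, not of the cipher itself:

```python
import random
from collections import Counter

# Simulate 18,000 uniform draws of (table, variant): 6 tables x 3 mappings each.
rng = random.Random(42)
draws = [(rng.randrange(6), rng.randrange(3)) for _ in range(18_000)]
counts = Counter(draws)

# Each of the 18 (table, variant) pairs is expected ~1000 times.
assert len(counts) == 18
print(min(counts.values()), max(counts.values()))  # both close to 1000
```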

Page 10:
The Naibbe cipher's ability to disguise a given plaintext bigram in one of 36 (6×6) ways through letter-by-letter table selection is by far its greatest strength.

The problem is again the random encryption: the samples of the 36 ways will have the same size, a weakness. A non-random choice would be better; the scribes' preferences for some ways over others would split the 36 ways into samples of different sizes, making them more difficult to decrypt.
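The contrast drawn above, uniform table choice giving equal-size samples versus scribe preference giving uneven ones, can be quantified with a toy simulation. The 36 "ways" here are abstract indices and the preference weights are invented, not actual Naibbe bigram encodings:

```python
import random
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy (bits) of a sample given its raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

rng = random.Random(1)
n = 36_000

# Uniform choice over 36 bigram encodings: samples of roughly equal size.
uniform = Counter(rng.randrange(36) for _ in range(n))
# An invented skewed "scribe preference" over the same 36 encodings.
biased = Counter(rng.choices(range(36), weights=range(1, 37), k=n))

# Uniform choice sits near the maximum log2(36) ~ 5.17 bits; the skewed
# scheme is measurably lower, i.e. its 36 samples have unequal sizes.
print(round(entropy(uniform.values()), 2), round(entropy(biased.values()), 2))
```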