(03-08-2025, 10:24 PM)oshfdk Wrote: (03-08-2025, 06:45 PM)magnesium Wrote: The whole journey of deriving the cipher began with a desire to reliably replicate the VMS's observed token and type length distributions while also obeying the text's word grammar and entropy. The structure of the cipher emerged from there.
Can this approach explain line-as-a-functional-unit properties, such as the tendency of certain characters and combinations to appear near/at the beginning or end of lines?
The short answer: Big-picture, the cipher can theoretically accommodate the VMS's line, paragraph, and page properties, but it currently lacks mechanisms that reliably produce them. This is a known limitation of the current version of the Naibbe cipher (see Section 4.3 of the paper), and I see it as an important area for future investigation.
The long answer: the Naibbe cipher can generate ~5/6 of the tokens in Voynich B, so it can generate stretches of text that exactly replicate what's seen in the VMS. But the Naibbe cipher is, first and foremost, meant to replicate the word-level properties of the VMS. Within the Naibbe cipher, a given ciphertext token stands for 1 or 2 plaintext letters, achieved by re-spacing a Latin or Italian plaintext roughly 50-50 into unigrams and bigrams and then encrypting those n-grams by selecting substitution options from 6 different tables on a letter-by-letter basis.
As a result, the structure of a given line of Naibbe ciphertext depends on three things simultaneously:
1. The exact content of the plaintext
2. How exactly the plaintext is re-spaced into unigrams and bigrams
3. The exact sequence of tables used to encrypt the text on a letter-by-letter basis
As to some of the specific line-as-a-functional-entity features of the VMS, I think some judiciously placed nulls could go a long way. The Naibbe cipher as I described it in the presentation and paper does not use any nulls, but we could certainly extend the cipher to include some. For example, we could arbitrarily designate that a gallows glyph or prefix (e.g., pch) beginning a "paragraph" is a null that's just meant to set off an apparent paragraph. Similarly, we could designate that line-ending tokens that end with -m are treated as nulls that simply serve to pad out the line length.
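As a rough illustration of how a decoder might handle those two hypothetical null rules, here is a small Python sketch. Everything in it is an assumption made for the example: the function name `strip_nulls`, the choice to strip only the prefix (rather than the whole token) at the start of a paragraph, and the sample tokens, which are made-up EVA-style strings rather than a transcription of any actual VMS passage.

```python
def strip_nulls(paragraph, gallows_prefixes=("pch",), pad_suffix="m"):
    """Remove hypothetical nulls from a paragraph given as a list of token lines."""
    cleaned = []
    for line_no, line in enumerate(paragraph):
        tokens = list(line)
        # Hypothetical rule 1: a gallows prefix opening the paragraph is a null,
        # so strip it from the first token of the first line.
        if line_no == 0 and tokens:
            for prefix in gallows_prefixes:
                if tokens[0].startswith(prefix):
                    tokens[0] = tokens[0][len(prefix):]
                    break
            if not tokens[0]:
                tokens = tokens[1:]  # the token was nothing but the prefix
        # Hypothetical rule 2: a line-final token ending in -m is padding.
        if tokens and tokens[-1].endswith(pad_suffix):
            tokens = tokens[:-1]
        cleaned.append(tokens)
    return cleaned

# Made-up EVA-style tokens, for illustration only.
paragraph = [
    ["pchedy", "qokeedy", "shedy", "dam"],
    ["chedy", "qokain", "otam"],
    ["daiin", "shey"],
]
result = strip_nulls(paragraph)
print(result)
```

Whether rules like these are plausible would depend on whether the remaining tokens still decrypt sensibly, which this sketch does not test.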
There are also other modifications that could be made. In the current version of the Naibbe cipher, plaintext re-spacing occurs completely randomly (i.e., not by a systematic rule such as putting a space before/after every vowel), as does table selection. But in principle, neither of these has to be fully random, which could potentially accommodate some VMS properties, such as the word length correlation. In addition, one could imagine that the first line of a paragraph is encrypted slightly differently than the rest of a paragraph, such as by happening to favor a table with a higher incidence of p than other tables have. I should also re-emphasize that the structure of the Naibbe cipher is based on average stats across all of Voynich B, which definitely smooths over some section-by-section differences and mushes together what seem to be the distinct preferences of Scribes 2 and 3.
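The idea of a first line favoring one table can be sketched as position-dependent weights on table selection. This is only an illustration under stated assumptions: the weights are invented, "table 0" standing in for a p-rich table is hypothetical, and the uniform baseline is a simplification rather than the cipher's actual selection procedure.

```python
import random

NUM_TABLES = 6

# Assumed baseline: every table equally likely.
UNIFORM = [1.0] * NUM_TABLES
# Hypothetical bias: pretend table 0 is the one with more p, and favor it 3:1.
FIRST_LINE = [3.0, 1.0, 1.0, 1.0, 1.0, 1.0]

def pick_tables(n_letters, is_first_line, rng):
    """Choose one table index per letter, with weights that depend on line position."""
    weights = FIRST_LINE if is_first_line else UNIFORM
    return rng.choices(range(NUM_TABLES), weights=weights, k=n_letters)

rng = random.Random(1)
first = pick_tables(6000, True, rng)
later = pick_tables(6000, False, rng)
print(first.count(0) / len(first), later.count(0) / len(later))
```

Any real test of this idea would need to compare the resulting first-line statistics against those of Voynich B, ideally section by section and scribe by scribe.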
The Naibbe cipher isn't perfect, but it's a place to start. I'd love to collaborate with folks and further investigate whether and how the Naibbe cipher can be extended/modified to accommodate the VMS's line-level properties. Part of this work, I suspect, will involve screening for plaintext properties that make those line-level statistics more or less likely.