Have we ruled out simple substitution unwisely?

RE: Have we ruled out simple substitution unwisely? - eggyk - 10-02-2026

(10-02-2026, 02:17 AM)ReneZ Wrote: Just to remind people that the anomalously low entropy was discovered in 1976, and Eva was first introduced in 1999.

For the impact of the transliteration language on entropy, see Figure 12 (with surrounding text) on [link not preserved].

To be clear, I don't mean to criticise any transliteration alphabet, or to blame EVA specifically. I think that the work done there has been incredibly useful and valuable, and I appreciate all of it.

But hypothetically speaking, what if each of those transliteration alphabets shares some underlying aspect that causes the issue? These results would then be due to a systematic error instead of the VMS itself. It would be good to rule this out as far as possible.

(10-02-2026, 07:51 AM)Koen G Wrote: For those who think character entropy issues are not inherent to Voynichese, I can only encourage you to experiment with this yourself. That's the best way to get a feel for the problem.

It's quite possibly (or probably) the case that there is an inherent entropy problem in the text. However, if there are changes to our transliterations that bring things closer to natural languages as a starting point, that could have implications for VMS research.

Honestly, even work that concludes with "I tried lots of new changes to transliteration and the entropy problem still didn't go away" is useful science, I think.

RE: Have we ruled out simple substitution unwisely? - ReneZ - 10-02-2026

(10-02-2026, 08:39 AM)eggyk Wrote: But hypothetically speaking, what if each of those transliteration alphabets shares some underlying aspect that causes the issue? These results would then be due to a systematic error instead of the VMS itself. It would be good to rule this out as far as possible.

The core question is: "what is a single character".
We don't know the answer, but the Eva and Currier alphabets are on quite opposite sides of the spectrum.

RE: Have we ruled out simple substitution unwisely? - eggyk - 10-02-2026

(10-02-2026, 10:45 AM)ReneZ Wrote: The core question is: "what is a single character".
We don't know the answer, but the Eva and Currier alphabets are on quite opposite sides of the spectrum.

I don't agree that that is the core question. In both cases a single interpretation for clusters of unclear strokes has been chosen.

If I understand correctly, where Currier would use "N" and "M", EVA uses "in" and "iin".

My (not fleshed out) idea acknowledges that in a case where plaintext Latin script "i, n, m" are simply a function of how many i strokes are present, allocating a single value may not be appropriate.

To demonstrate what I mean, we can say:

Plaintext "i" = i (or n at word end)
Plaintext "n" = ii (or in at word end)
Plaintext "m" = iii (or iin at word end)

Both Currier and EVA would treat iin as the same thing every single time. But under the rules set out just above, we can't possibly say exactly what the plaintext was.

In reality, it was one of:
"iii" = i + i + n
"in" = i + in
"ni" = ii + n
"m" = iin

The words "minim" (iiiiiiiiin), "immin" (iiiiiiiiin) and "nimnii" (iiiiiiiiin) all appear the same.
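A minimal sketch of this ambiguity, assuming only the three toy rules listed above (i = one stroke, n = two, m = three, with the last stroke of a word drawn as n). The function names `encode` and `decodings` are made up for illustration, and nothing here is a claim about how the VMS was actually written.

```python
# Toy stroke rules from the post: each plaintext letter is a run of strokes.
STROKES = {"i": 1, "n": 2, "m": 3}

def encode(word):
    """Write a plaintext word as i-strokes, with the final stroke drawn as 'n'."""
    total = sum(STROKES[ch] for ch in word)
    return "i" * (total - 1) + "n"

def decodings(cluster):
    """List every plaintext over {i, n, m} that produces the same cluster."""
    total = len(cluster)  # every glyph in the cluster counts as one stroke

    def split(remaining):
        if remaining == 0:
            return [""]
        results = []
        for letter, width in STROKES.items():
            if width <= remaining:
                results += [letter + rest for rest in split(remaining - width)]
        return results

    return split(total)

for word in ("minim", "immin", "nimnii"):
    print(word, "->", encode(word))

print(sorted(decodings("iin")))              # the four readings listed above
print(len(decodings(encode("minim"))))       # readings of the ten-stroke cluster
```

Under these toy rules the ten-stroke cluster has 274 possible readings, which is the point: the written form alone does not determine the plaintext.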

For the purposes of communicating about Voynich, there isn't any issue. However, using the transliteration alphabets directly to calculate entropy ignores issues like this.

RE: Have we ruled out simple substitution unwisely? - Koen G - 10-02-2026

Again, I encourage you to play around with this to get a feel for what works and what doesn't. Whatever you do at a given location in a word (in this case [in]-clusters at the end), will still result in severe positional constraints. You'd also have to randomly pick how to parse any given [in]-cluster. This can be simulated easily by sectioning your corpus, processing each part in a different way, then putting them back together.
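One way to set up that kind of experiment, as a rough sketch rather than anyone's actual method. The corpus below is a handful of EVA-like placeholder words, not a real transliteration sample. The three parsing rules are arbitrary examples, "M" and "N" are just stand-in single-glyph symbols, and `conditional_entropy`, `PARSERS` and `mixed_parse` are names invented here. The statistic is the conditional entropy of a character given the previous one.

```python
import math
import random
from collections import Counter

def conditional_entropy(text):
    """H(next char | previous char) in bits, estimated from bigram counts."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    total = sum(pairs.values())
    return -sum(
        count / total * math.log2(count / firsts[a])
        for (a, _b), count in pairs.items()
    )

# Hypothetical alternative readings of word-final [in]-clusters.
PARSERS = [
    lambda w: w,                                          # keep the word as is
    lambda w: w[:-3] + "M" if w.endswith("iin") else w,   # final iin -> one glyph
    lambda w: w[:-2] + "N" if w.endswith("in") else w,    # final in -> one glyph
]

def mixed_parse(words, sections, rng):
    """Cut the word list into sections and parse each with a randomly chosen rule."""
    size = max(1, len(words) // sections)
    out = []
    for start in range(0, len(words), size):
        rule = rng.choice(PARSERS)
        out += [rule(w) for w in words[start:start + size]]
    return out

# Placeholder corpus: EVA-like words for illustration, NOT real VMS data.
corpus = "daiin chedy qokaiin shedy dain okain chol daiin qokedy ain".split() * 20

rng = random.Random(0)  # fixed seed so the run is repeatable
reparsed = mixed_parse(corpus, sections=8, rng=rng)
print("original:", round(conditional_entropy(" ".join(corpus)), 3))
print("reparsed:", round(conditional_entropy(" ".join(reparsed)), 3))
```

To try this for real, swap the placeholder corpus for an actual transliteration file and replace `PARSERS` with whatever readings you want to test. The numbers from the toy corpus mean nothing in themselves.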

RE: Have we ruled out simple substitution unwisely? - oshfdk - 10-02-2026

(10-02-2026, 12:56 PM)eggyk Wrote: For the purposes of communicating about Voynich, there isn't any issue. However, using the transliteration alphabets directly to calculate entropy ignores issues like this.

Aye to that. I don't think transliteration-based entropy computations can be used as definitive proof of anything. Still, this doesn't mean simple substitution looks feasible.

EDIT: just to clarify, I'm not saying transliterations are useless, they are extremely useful. But a lot of decisions (like whether this is a or o, r or s, whether there are 1 or 2 types of l, etc) are made when interpreting a sequence of glyphs in terms of EVA or another scheme, and it's quite possible that these decisions distort entropy in all kinds of ways.

RE: Have we ruled out simple substitution unwisely? - eggyk - 10-02-2026

(10-02-2026, 01:19 PM)Koen G Wrote: Again, I encourage you to play around with this to get a feel for what works and what doesn't. Whatever you do at a given location in a word (in this case [in]-clusters at the end), will still result in severe positional constraints. You'd also have to randomly pick how to parse any given [in]-cluster. This can be simulated easily by sectioning your corpus, processing each part in a different way, then putting them back together.

Yes, I've been thinking about different experiments to do this: random chance, chance based on potential plaintexts, etc.

Unfortunately I've ended up trying to defend the examples I gave instead of talking about the class of examples itself. The example I've been explaining does do something different from the existing methods, namely acknowledging that groups of strokes within a transliteration alphabet may correspond to multiple values.

Those types of ideas are what I wanted to explore here.

RE: Have we ruled out simple substitution unwisely? - MarcoP - 10-02-2026

(10-02-2026, 12:56 PM)eggyk Wrote: Plaintext "m" = iii (or iin at word end)

~100% of 'm' occur after a, and/or are word-final. Natural languages are not like that
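A small sketch of how such a positional claim could be checked against a word list. It takes 'm' to stand for the [iin] cluster of the quoted rule, which is an assumption about the intended reading. Both word lists are invented placeholders rather than corpus data, and `position_profile` is a hypothetical helper name.

```python
def position_profile(words, cluster, before):
    """Share of `cluster` occurrences that follow `before`, and that end a word."""
    total = after = final = 0
    for word in words:
        start = word.find(cluster)
        while start != -1:
            total += 1
            after += word[:start].endswith(before)
            final += start + len(cluster) == len(word)
            start = word.find(cluster, start + 1)
    if total == 0:
        return None
    return {"count": total, "after": after / total, "final": final / total}

# Placeholder word lists, invented for illustration (not corpus data).
voynich_like = ["daiin", "qokaiin", "okaiin", "chedy", "aiin", "chol"]
english = ["moment", "am", "human", "same", "time", "them"]

print(position_profile(voynich_like, "iin", before="a"))
print(position_profile(english, "m", before="a"))
```

On real data, the interesting comparison is how close the "after" and "final" shares get to 1.0 for each glyph or cluster, across the whole alphabet rather than for one hand-picked case.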

RE: Have we ruled out simple substitution unwisely? - Jorge_Stolfi - 10-02-2026

(10-02-2026, 02:09 PM)MarcoP Wrote: ~100% of 'm' occur after a, and/or are word-final. Natural languages are not like that

In Mandarin 100% of "ng" are word-final. In Spanish 100% of "ü" come after "g". In Greek 100% of "ς" are word-final. In Portuguese "h" comes only after "c", "n", or "l", or is word-initial. In English 100% of "tc" occur after "e" and are word-final...

All the best, --stolfi

RE: Have we ruled out simple substitution unwisely? - MarcoP - 10-02-2026

Yes, one can cherry-pick a few similar things for certain languages, but in Voynichese they are systematic (hence the low entropy). Once a certain Jorge Stolfi even wrote a simple grammar that defines the constrained nature of most Voynichese words. I don't know about Mandarin, but you cannot do that for European languages.

Also, you had to cherry-pick bigrams instead of characters... this tells a lot.

RE: Have we ruled out simple substitution unwisely? - Grove - 10-02-2026

(10-02-2026, 02:09 PM)MarcoP Wrote:
(10-02-2026, 12:56 PM)eggyk Wrote: Plaintext "m" = iii (or iin at word end)

~100% of 'm' occur after a, and/or are word-final. Natural languages are not like that

This assumes the space equates to a word boundary. Theoretically, couldn’t the spacing and character relationship be a feature of an encryption scheme?