The Voynich Ninja
[split] Verbose cipher? - Printable Version

Thread: [split] Verbose cipher? (/thread-3356.html)



RE: [split] Verbose cipher? - Koen G - 23-09-2020

Luckily the rigorous constraints of science are there to help combat some of these biases. Double blind studies, repeatability, peer review... Unfortunately these aren't always applied as required.


RE: [split] Verbose cipher? - RobGea - 19-05-2022

Hi folks,
There are some very informative posts in this thread (that I don't really understand), and
nickpelling previously suggested an outline of a program.

I have written some code along those lines: just a brute-forcer using h1-h2 as a score.
It produces some peculiar results, but I'm not sure how to proceed.
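For readers who want to follow along, here is a minimal sketch of the kind of 'entropy calculator' being discussed. It is not the actual program from this thread. It assumes that h0 is log2 of the alphabet size, h1 is the single-character Shannon entropy, and h2 is the conditional entropy of a character given the previous one; the function names are made up for illustration.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a frequency table (a Counter)."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def h0_h1_h2(text):
    """Return (h0, h1, h2) for a non-empty string.

    Every character, spaces included, is treated as one symbol.
    """
    unigrams = Counter(text)
    bigrams = Counter(zip(text, text[1:]))
    h0 = math.log2(len(unigrams))
    h1 = entropy(unigrams)
    # Conditional entropy: H(pair) minus H(first character of the pair).
    h2 = entropy(bigrams) - entropy(Counter(text[:-1]))
    return h0, h1, h2

if __name__ == "__main__":
    sample = "pchedal shdy ytechypchy otey qoteey qotal shedy"
    h0, h1, h2 = h0_h1_h2(sample)
    print(f"h0: {h0}  h1: {h1}  h2: {h2}  h1-h2: {h1 - h2}")
```

On a sample this short the numbers mean little; the statistics only settle down on a whole section of text.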

How do we get from a sort of 'entropy calculator' to a verbose cipher ?


RE: [split] Verbose cipher? - MarcoP - 19-05-2022

Could you please share an example of your results? A possible approach: sequences that, replaced with a single character, reduce entropy may potentially represent a single plain text character.


RE: [split] Verbose cipher? - Koen G - 19-05-2022

Isn't what Nick describes kind of the same as what I've been doing (Entropy Hunting posts)? I think we are kind of collectively moving on to the next problem now, which is spaces: if VM spaces are spaces and VM words are words, then these words are not generated by a verbose cipher.

Or, to put it differently, if VM words are words, they are too short to contain enough information.

Therefore, the most pressing question in the "verbose cipher" approach should now probably be: how to deal with spaces?



Regarding h1 and h2, I still think the best approach is to plot both on a scatter plot. There is agreement that h1 and h2 should both be tracked, but no indication whether subtracting or dividing them is really the best way to compare them. Essentially, we are looking to approach Voynichese's statistics, not their difference or proportion.
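To make the point concrete, here is a small sketch that scores a candidate by its distance to a target point in the (h1, h2) plane instead of by a difference or ratio. The numbers in the usage example are invented purely for illustration and are not measurements of any text.

```python
import math

def distance_to_target(h1, h2, target_h1, target_h2):
    """Euclidean distance between a candidate's (h1, h2) and the target's."""
    return math.hypot(h1 - target_h1, h2 - target_h2)

if __name__ == "__main__":
    # Hypothetical values: both candidates have the same h1-h2 (1.5),
    # yet only the second one actually sits on the target point.
    target = (3.5, 2.0)
    for h1, h2 in [(4.0, 2.5), (3.5, 2.0)]:
        print(h1 - h2, distance_to_target(h1, h2, *target))
```

Two candidates with identical h1-h2 can sit far apart on the scatter plot, which is why the difference alone can mislead.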


RE: [split] Verbose cipher? - RobGea - 19-05-2022

(19-05-2022, 09:50 AM)MarcoP Wrote: Could you please share an example of your results?
Sure, no problem,
The program is simply a modified version of the entropy calculator. It's nothing very special; it was created to give me some base code that I could improve and evolve.

Here, using only digrams (with spaces allowed in the digrams), it gets to about 11 replacements, then
starts using spaces, and then the word length starts to increase tremendously. At 20 replacements
it produces words (if they can be called that anymore) that are 70+ chars long.

Using the Recipes / Stars section ZL.2a ( no words with '?', uncertain spaces as spaces)
11 replacements
Code:
Alphabet:    '+0123456789acdefghijklmnopqrstuxy  uses 35 characters
How many n-grams: 486
best_replacement: 'am'

h0: 5.169925001442312    h1: 3.9032675101280243    h2: 2.486713645582145    h1-h2: 1.4165538645458793

p0ed+ 3dy yte0yp0y otey 5teey 5t+ 3edy y3d+ d6 ok+ d+dy d6 3ek 0cphhdy d+oky op0edy pe3ol 0ep 7 ot0y s+ lkeey s7 6 0edy y3d6 3eek 0eoty eeok+ 0edy 0ckhy or orol ok4 ee+ ot k7 ot7 0+ y0edy 8edy okedy 8

12 replacements
Code:
Alphabet:    '+0123456789@acdefghijklmnopqrstuxy  uses 36 characters
How many n-grams: 502
best_replacement: 'y '

h0: 5.20945336562895    h1: 4.069233983222757    h2: 2.669898154770486    h1-h2: 1.399335828452271

p0ed+ 3dAyte0yp0AoteA5teeA5t+ 3edAy3d+ d6 ok+ d+dAd6 3ek 0cphhdAd+okAop0edApe3ol 0ep 7 ot0As+ lkeeAs7 6 0edAy3d6 3eek 0eotAeeok+ 0edA0ckhAor orol ok4 ee+ ot k7 ot7 0+ y0edA8edAokedA8eeAokeA0d7 ol lotA

Quote:Isn't what Nick describes kind of the same as what I've been doing (Entropy Hunting posts)?
From what i can tell, yes.

The way i figure it is, we use these entropy measures to identify Voynich glyphs that can be replaced.
Then somewhere down the line we replace those glyphs with some n-grams from our preferred language and Bingo !
That's my current understanding of the verbose-cipher attack; quite how wrong I am, I don't know.

I used h1-h2 as a metric because it seems to help keep h1 in check, at least a little, compared to just using h2
and it was simple, so allowed me to put some code together.
As nickpelling says, it needs some kind of entropy combo metric to get some worthwhile results.
But i don't even know what a worthwhile result would look like.
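One step of such a brute-forcer could be sketched as below. This is not the actual program. It assumes that the best digram is the one giving the lowest h1-h2 after replacement (the direction of the score is a guess), that h2 is conditional entropy, and that the replacement symbol does not already occur in the text. The function names are hypothetical.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a frequency table."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def h1_minus_h2(text):
    """Score used here: character entropy minus conditional entropy."""
    h1 = entropy(Counter(text))
    h2 = entropy(Counter(zip(text, text[1:]))) - entropy(Counter(text[:-1]))
    return h1 - h2

def best_replacement(text, new_symbol):
    """Try every digram found in the text (spaces allowed).

    Returns (score, digram, new_text) for the digram whose replacement
    by new_symbol gives the lowest h1-h2, or None if the text is too short.
    """
    best = None
    for digram in sorted({text[i:i + 2] for i in range(len(text) - 1)}):
        candidate = text.replace(digram, new_symbol)
        score = h1_minus_h2(candidate)
        if best is None or score < best[0]:
            best = (score, digram, candidate)
    return best

if __name__ == "__main__":
    sample = "pchedal shdy ytechypchy otey qoteey qotal shedy yshdal dain"
    score, digram, new_text = best_replacement(sample, "@")
    print(repr(digram), score)
    print(new_text)
```

Repeating this step with a fresh symbol each time gives a greedy chain of replacements. Because digrams containing a space are allowed, words start to merge as soon as one of them wins.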


RE: [split] Verbose cipher? - RobGea - 19-05-2022

For reference, this is the original text from which the above results were derived:
Code:
pchedal shdy ytechypchy otey qoteey qotal shedy yshdal dain okal daldy dain shek chcphhdy daloky opchedy peshol chep ar otchy sal lkeey sar ain chedy yshdain sheek cheoty eeokal chedy chckhy or orol okaiin eeal ot kar otar chal ychedy qokedy okedy qo



RE: [split] Verbose cipher? - Koen G - 19-05-2022

The question of h2 vs h1 is very complex. The two don't necessarily move together in a "linear" way. What I mean is this:

You can think of what I've been doing as a branching path. I replace something first, based on desired h2 and h1 behavior. Then I replace the next thing based on what performs best next, and so on.

But maybe the best path is one that first sacrifices a bunch of h1. And then, after x transformations, *poof*, your values suddenly look like Voynichese.

Basically with a genetic algorithm approach, you lock yourself into certain initial conditions. Maybe if you just brute-force all possible combinations and pick the one which has h1 and h2 values closest to Voynichese, you might get better results. Maybe not, but it's possible.
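A rough sketch of the brute-force alternative: try every ordered sequence of replacements up to a fixed depth and keep the one whose (h1, h2) lands closest to a target. It assumes h2 is conditional entropy, that the replacement symbols do not occur in the text, and that the depth does not exceed the number of symbols. The names and the candidate list are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations

def entropy(counts):
    """Shannon entropy in bits of a frequency table."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def h1_h2(text):
    """Character entropy and conditional entropy of a string."""
    h1 = entropy(Counter(text))
    h2 = entropy(Counter(zip(text, text[1:]))) - entropy(Counter(text[:-1]))
    return h1, h2

def apply_all(text, digrams, symbols):
    """Replace each digram, in order, with the symbol at the same position."""
    for digram, symbol in zip(digrams, symbols):
        text = text.replace(digram, symbol)
    return text

def exhaustive_search(text, candidates, depth, target, symbols="@#$%&"):
    """Return (distance, sequence) for the ordered sequence of `depth`
    digrams whose result lies closest to target = (h1, h2)."""
    best = None
    for sequence in permutations(candidates, depth):
        h1, h2 = h1_h2(apply_all(text, sequence, symbols))
        dist = math.hypot(h1 - target[0], h2 - target[1])
        if best is None or dist < best[0]:
            best = (dist, sequence)
    return best
```

The number of sequences is n!/(n-depth)! for n candidates, so this is only feasible for short candidate lists and small depths. That cost is the price of not locking in the first choice.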

The problem of word length and spaces remains the same though.


RE: [split] Verbose cipher? - MarcoP - 19-05-2022

(19-05-2022, 03:20 PM)RobGea Wrote:
Code:
Alphabet:    '+0123456789acdefghijklmnopqrstuxy  uses 35 characters

Thank you for the examples!

This is a suggestion for a specific experiment inspired by your results. Of course, there is a huge (infinite?) number of different scenarios. I may be misunderstanding something, so take what I write with a big grain of salt.

I think that treating space as a character is anachronistic. Instead, one could simply remove all spaces and start from there.

Since the alphabets you get have many symbols, one could tackle this as a verbose homophonic cipher: some plain-text characters are represented by more than one cipher sequence ("sequence" because the cipher is also verbose).
As a first experiment, I would exclude the idea that a single cipher symbol corresponds to something longer than a single plain-text character (BTW, this would probably result in lower entropy again).

For instance, the alphabet above could be interpreted as (top cipher symbols, bottom plain-text - dots only added for alignment):

Code:
'+0 1 2 3 456 7 8 9 acd e f g h i jkl m n o p q r s t u x y
.A. B C D .E. F G H .I. J K L M N .O. P Q R S T U V W X Y Z

Meaning that plain text 'O' can be encoded as any of j, k, l. Of course, a problem is that there is a huge number of possible mappings like this. I don't think that a homophonic cipher can be addressed by brute force; I guess there is literature about algorithms to tackle the problem.
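As an illustration, the example mapping above can be turned into a decoder in a few lines. This is only a sketch of that one arbitrary mapping; spaces are dropped, and any symbol that is not in the table is shown as '?' (that convention is an assumption of this sketch).

```python
# Cipher groups and plain-text letters, copied from the table above.
CIPHER_GROUPS = "'+0 1 2 3 456 7 8 9 acd e f g h i jkl m n o p q r s t u x y".split()
PLAIN_LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Every symbol in a group decodes to the same plain-text letter (homophones).
DECODE = {
    symbol: plain
    for group, plain in zip(CIPHER_GROUPS, PLAIN_LETTERS)
    for symbol in group
}

def decode(cipher_text):
    """Decode symbol by symbol, ignoring spaces."""
    return "".join(DECODE.get(ch, "?") for ch in cipher_text if ch != " ")

if __name__ == "__main__":
    print(decode("p0ed+ 3dy yte0yp0y otey"))
```

Trying a different mapping only means changing how the symbols are grouped, which is exactly where the number of possibilities explodes.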

For each verboseAlphabet+homophonicMapping combination, you decode the VMS and compare with candidate plain texts (where spaces were also removed). Rather than comparing entropies, you could directly compare bigram or trigram frequencies. I am afraid that, if alphabet sizes are different, comparing entropies does not really make sense.
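A minimal sketch of such a direct comparison: compute relative bigram frequencies for both texts with spaces removed, then sum the absolute differences. The choice of distance (a plain L1 distance) and the function names are assumptions; other measures would work too, and both texts must of course use the same alphabet.

```python
from collections import Counter

def bigram_freqs(text):
    """Relative frequencies of character bigrams, with spaces removed."""
    text = text.replace(" ", "")
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts.values())
    return {bigram: count / total for bigram, count in counts.items()}

def bigram_distance(decoded, reference):
    """Sum of absolute frequency differences: 0 means identical bigram
    profiles, 2 means the two texts share no bigrams at all."""
    fa, fb = bigram_freqs(decoded), bigram_freqs(reference)
    return sum(abs(fa.get(bg, 0) - fb.get(bg, 0)) for bg in set(fa) | set(fb))
```

The same idea extends to trigrams by changing the slice length from 2 to 3.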

Things would be somewhat simpler if the verbose step produced an alphabet with the same size as the plain-text alphabet: in that case, entropies could be more significant. But Rene recently pointed out that similar entropies can result from very different character-frequency distributions: similar entropy is necessary but not at all sufficient for a decent decoding candidate.
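The simplest case of that last point is a relabeling: entropy depends only on the set of frequencies, not on which character carries which frequency. A toy sketch with two made-up strings:

```python
import math
from collections import Counter

def h1(text):
    """Single-character Shannon entropy in bits."""
    total = len(text)
    return -sum(c / total * math.log2(c / total) for c in Counter(text).values())

text_a = "eeeettaa"  # 'e' is the most frequent character
text_b = "zzzzqqxx"  # same frequency profile carried by different characters

if __name__ == "__main__":
    print(h1(text_a), h1(text_b))
```

Both strings give the same h1 although they share no characters, so a matching entropy says nothing about whether the right symbols ended up with the right frequencies.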


RE: [split] Verbose cipher? - Koen G - 20-05-2022

(19-05-2022, 06:31 PM)MarcoP Wrote: I think that treating space as a character is anachronistic.

What do you mean by this, Marco? Does it also mean that they were unlikely to introduce extra spaces?

It feels to me like keeping space as a character in entropy calculations is a way to include word boundaries. E.g. the fact that "y" is often at the end of the word is a useful statistic that can be included. 

That said, I agree that removing spaces altogether is the best alternative. But I don't know if removing spaces and then applying transformations (across previous word boundaries) is a valid approach. I haven't tried this yet.


RE: [split] Verbose cipher? - Searcher - 20-05-2022

(20-05-2022, 07:24 AM)Koen G Wrote: That said, I agree that removing spaces altogether is the best alternative. But I don't know if removing spaces and then applying transformations (across previous word boundaries) is a valid approach. I haven't tried this yet.
But what about applying the transformations first and only then removing the spaces?
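The two orders are not equivalent, as a small sketch shows. It assumes a transformation is a plain digram replacement and uses a short piece of the sample text quoted earlier in the thread; the function names are hypothetical.

```python
def remove_then_transform(text, digram, symbol):
    """Strip spaces first, so the digram can match across old word boundaries."""
    return text.replace(" ", "").replace(digram, symbol)

def transform_then_remove(text, digram, symbol):
    """Replace inside words only, then strip the spaces."""
    return text.replace(digram, symbol).replace(" ", "")

if __name__ == "__main__":
    sample = "daldy dain"
    print(remove_then_transform(sample, "yd", "@"))  # dald@ain
    print(transform_then_remove(sample, "yd", "@"))  # daldydain
```

With spaces removed first, 'yd' is matched across the former word boundary; with the transformation applied first, it is never matched at all. So the order changes which sequences can ever be candidates.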