The Voynich Ninja

Full Version: Voynich text generation
(19-03-2019, 02:17 PM)geoffreycaveney Wrote: In case people are curious about how I identified the plaintext for Don's ciphertext, it was a little bit of cryptography and a lot of philology. By philology I mean "relentless online research". Since I had mentioned 14th and 15th centuries and then Don clarified "within a century of the ms parchment dating", that pretty much gave away that it had to be an early 16th century text. So I found a very convenient list of them here:
[link]
Eventually of course I arrived at Machiavelli and The Prince. Now the English Wikipedia only has links to English translations:
[link]
But the Italian Wikipedia page for this work is more useful:
[link]
And at the bottom of the page, it says, "Wikisource contiene il testo completo de Il Principe" ("Wikisource contains the complete text of The Prince").
And from here it is a few more clicks to the text I was seeking:
[link]
[link]
[link]
Whereupon one finds the text of the beginning of Il Principe.
And by connecting the repetition of "tutti" with the repetition of phrases at the beginning of Don's ciphertext, it is immediately clear that this has to be the plaintext.

I analysed Italian using precisely that text, here: [link], and I got the idea of two letters per Voynichese word from [link].
(19-03-2019, 09:41 PM)DonaldFisk Wrote: I analysed Italian using precisely that text, here: [link], and I got the idea of two letters per Voynichese word from [link].

Great minds think alike :D
Earlier, examples of mechanisms (concentric wheels) for generating meaningless text were already given. I offer one more: plain cubic dice with 6 faces.
If we cut off the vertices of each cube, we get a polyhedron with 14 faces (6 octagonal and 8 triangular). On each face, write a character (or a combination of characters).
There can be 3, 4, 5 or more dice.
The order of the characters in the word can be determined by a rule (for example, by the color of the dice).
Combinations of characters in a word (including forbidden combinations), as well as spaces (and even half-spaces), can be controlled by what is written on the faces, including blank faces; it all depends on how you assign them.
Different frequencies (probabilities) of symbols (or combinations of symbols) come from the difference in area between the triangular faces (red in the figure) and the octagonal faces, and from repeating the same symbols (or combinations) on different dice.
[attachment=2730]
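To make the dice idea concrete, here is a rough Python sketch. Everything specific in it is an assumption for illustration: the glyph groups written on the faces are invented, the die order stands in for the "color" rule, and the chance of a face coming up is simply taken to be proportional to its area (real dice need not behave that way). The default areas are those of a uniformly truncated cube with edge length 1.

```python
import math
import random

# Hypothetical face labels, invented for illustration only.
# Each die has 6 octagonal faces and 8 triangular faces; "" is a blank face.
DICE = [
    {"octagonal": ["qo", "o", "ch", "sh", "d", ""],
     "triangular": ["t", "k", "p", "f", "", "", "y", "s"]},
    {"octagonal": ["ke", "e", "ee", "a", "o", ""],
     "triangular": ["t", "k", "l", "r", "", "", "ch", "sh"]},
    {"octagonal": ["dy", "y", "in", "iin", "l", "r"],
     "triangular": ["m", "s", "", "", "n", "g", "ol", "or"]},
]

# Face areas of a uniformly truncated cube with edge length 1.
OCTAGON_AREA = 2 * (1 + math.sqrt(2))
TRIANGLE_AREA = math.sqrt(3) / 4


def roll_die(die, rng, octagon_area=OCTAGON_AREA, triangle_area=TRIANGLE_AREA):
    """Pick one face label, weighting each face by its (assumed) area."""
    faces = die["octagonal"] + die["triangular"]
    weights = ([octagon_area] * len(die["octagonal"])
               + [triangle_area] * len(die["triangular"]))
    return rng.choices(faces, weights=weights, k=1)[0]


def roll_word(dice, rng, **areas):
    """Roll every die once and join the labels in die order (the 'color' rule)."""
    return "".join(roll_die(die, rng, **areas) for die in dice)


def generate_text(dice, n_words, seed=0, **areas):
    """Roll n_words times; rolls that come up all blank are dropped."""
    rng = random.Random(seed)
    words = (roll_word(dice, rng, **areas) for _ in range(n_words))
    return " ".join(word for word in words if word)


if __name__ == "__main__":
    print(generate_text(DICE, 12, seed=1))
```

Changing `octagon_area` and `triangle_area` shifts how often the rare (triangular) labels appear, and putting the same label on several dice raises its overall frequency, which is the tuning described above.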
I have been playing with a very flexible and humanly usable cipher (not a verbose homophonic substitution) that can produce many different outputs for the same plaintext (here: "errare humanum est") such as these three:

olShedy Shedy lor otar otchdy ol Shey tchy lr op eey chy dy
sair al Shedy dar otal ytchdy ol Shey tchy dor chkchary chy dy
dair al Sholy or otar ytchy dor Shey kchdy ol chkchy or chy dy


"perseuerare diabolicum" can be:

cThy dair cheeeody oldy Shedy dar otar qoky okchody keor chey ted daiin Sheol

(I am trying to optimize the keys to match Voynichese better but convergence is slow.)

An entire Rosetta stone would be needed to figure out what this means (famous motto, Latin):

oain kchaly kedy Sholy chkchy cKhy or keotal ol olSheykeees soraiin
All right, here is another one. This was also quickly generated, so it's not perfect, but it does illustrate a few patterns of interest.

It is natural language from a classical text, enciphered with something very close to one-to-one substitution, so this is not a verbose cipher. (One-to-one substitution is difficult in Voynichese, which makes it apparent that the sample is not genuine, but it has some of the same properties.) It is not Latin, Italian, or German:

[attachment=2852]


I had the same problem as I had with the first sample. Even though they are created in quite different ways and are different languages, one specific thing about Voynichese was particularly hard to achieve in both the samples and I notice that René's sample shows the same divergence from Voynichese as these, so that caught my attention.

Something else I should mention. When I tried to get this as close to one-to-one substitution as possible, I almost ran out of VMS characters.
To be honest I have no idea how you guys transformed the text. Nablator, you seem to use lots of nulls to evoke Voynichese structure? But I don't understand the rest.

JKP, it's clear that your system attempts to stay closer to the source text, given the many "illegal" sequences. Still, you seem to have done something to promote the desired position of o, y, gallows and i-clusters. Was this possible because of the source language or because of deviation from one-to-one substitution?
(14-04-2019, 01:26 PM)Koen G Wrote: To be honest I have no idea how you guys transformed the text. Nablator, you seem to use lots of nulls to evoke Voynichese structure? But I don't understand the rest.

I don't exactly have any nulls, only a lot of ways to encipher the same plaintext, some of them longer than others. The variability has a cost: the choice of one possibility among many is information stored in the ciphertext that could be extracted from it. Theoretically it could be used to store more meaningful information, but I don't expect a human could do it, so it is wasted: one possibility is selected randomly. The ciphertext:plaintext ratio can be improved with a better key, bringing it closer to 2.5. It can also be improved by using less common combinations of glyphs more often, but then the output does not match Voynichese very well.
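The point about "wasted" information can be illustrated with a toy example. This is not the cipher described above (its details are not given in this thread); it is a deliberately simple stand-in with an invented key, where each plaintext letter has several alternative encodings. If a letter has k equally likely encodings, the random choice among them carries log2(k) bits that a decoder could in principle recover. The expansion ratio here is counted in transliteration characters per plaintext letter, which is also an assumption.

```python
import math
import random

# Hypothetical key, invented for illustration: every plaintext letter maps
# to one or more alternative ciphertext tokens (all tokens are distinct).
TOY_KEY = {
    "e": ["ol", "chy", "Shey", "otar"],  # 4 options -> 2 bits per use
    "s": ["dy", "or"],                   # 2 options -> 1 bit per use
    "t": ["daiin"],                      # 1 option  -> 0 bits per use
}


def encipher(plaintext, key, rng):
    """Replace each letter by one of its tokens, chosen at random."""
    return [rng.choice(key[letter]) for letter in plaintext]


def decipher(tokens, key):
    """Invert the key; works because no token is shared between letters."""
    inverse = {token: letter for letter, options in key.items() for token in options}
    return "".join(inverse[token] for token in tokens)


def hidden_bits(plaintext, key):
    """Information carried by the random choices, assuming uniform selection."""
    return sum(math.log2(len(key[letter])) for letter in plaintext)


def expansion_ratio(plaintext, tokens):
    """Ciphertext characters per plaintext letter (transliteration characters)."""
    return sum(len(token) for token in tokens) / len(plaintext)


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = encipher("est", TOY_KEY, rng)
    print(" ".join(tokens), decipher(tokens, TOY_KEY))
    print(hidden_bits("est", TOY_KEY), expansion_ratio("est", tokens))
```

In this toy, favouring the shortest token for each letter lowers the expansion ratio but also reduces the variety of the output, which mirrors the trade-off mentioned above.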

It was one of my first attempts at optimization, using only VMS glyph trigram counts to score the result of enciphering a short sample of Latin text. I was pleasantly surprised to see some properties of the VMS naturally emerge in this experiment, without a dictionary of Voynichese: word lengths, apparent prefixes and agglutination all look good. :)
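For readers unfamiliar with trigram scoring, here is a minimal sketch of one common way to do it. It is an assumption about the general technique, not the actual scoring code used for the experiment above: the reference string is a tiny stand-in (a real run would use counts from a full transcription), the `^` and `$` word-boundary markers and the add-one smoothing are my own choices, and the function names are hypothetical.

```python
import math
from collections import Counter


def trigram_counts(text):
    """Count character trigrams inside each word, with ^ and $ as word edges."""
    counts = Counter()
    for word in text.split():
        padded = f"^{word}$"
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    return counts


def trigram_score(candidate, reference_counts, alpha=1.0):
    """Average log-probability of the candidate's trigrams under the reference.

    Unseen trigrams get a small smoothed probability (add-alpha smoothing),
    so a higher (less negative) score means 'more like the reference'.
    """
    total = sum(reference_counts.values())
    vocab = len(reference_counts) + 1  # one extra bucket for unseen trigrams
    candidate_counts = trigram_counts(candidate)
    n = sum(candidate_counts.values())
    if n == 0:
        return float("-inf")
    log_likelihood = sum(
        count * math.log((reference_counts.get(tri, 0) + alpha) / (total + alpha * vocab))
        for tri, count in candidate_counts.items()
    )
    return log_likelihood / n


if __name__ == "__main__":
    # Stand-in reference text; NOT real transcription statistics.
    reference = trigram_counts("daiin chedy qokedy shedy ol chey dar daiin ol")
    print(trigram_score("daiin ol chedy", reference))
    print(trigram_score("errare humanum est", reference))
```

An optimizer would then adjust the cipher key, re-encipher the sample, and keep changes that raise this score.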
(02-04-2019, 02:12 PM)nablator Wrote: I have been playing with a very flexible and humanly usable cipher (not a verbose homophonic substitution) that can produce many different outputs for the same plaintext [...]
Can you explain how it works exactly?
(14-04-2019, 08:14 PM)bi3mw Wrote: Can you explain how it works exactly?
I can, but it's a work in progress. There is a lot to try and improve before I can present results that I feel confident about.
(14-04-2019, 01:26 PM)Koen G Wrote: To be honest I have no idea how you guys transformed the text. Nablator, you seem to use lots of nulls to evoke Voynichese structure? But I don't understand the rest.

JKP, it's clear that your system attempts to stay closer to the source text, given the many "illegal" sequences. Still, you seem to have done something to promote the desired position of o, y, gallows and i-clusters. Was this possible because of the source language or because of deviation from one-to-one substitution?

It's hard to get anything past you, Koen. That is a very perceptive question.

I'll give you an honest answer. There is a natural language that inherently has some of the properties of the VMS text. I wasn't even going to attempt to try one-to-one substitution (or as close to it as I could get) unless I could find a language that was somewhat similar.

I'll also divulge that I did use one null character.