The Voynich Ninja
An explanation of the Voynich Manuscript text - Printable Version


An explanation of the Voynich Manuscript text - DonaldFisk - 10-04-2017

Over the past few months I've been analysing the text, using standard statistical techniques, and now think I have a good idea how it was written. I have posted [link] describing my analysis, along with some attached files, on my "blog".

As there's a lot to read there, and you're no doubt ol daiin to know my [link], I'm afraid they're going to disappoint many of you. I have concluded that the text is almost certainly meaningless. I have also worked out, in detail, the general method by which the text must have been generated. Then, using this method, I have generated a [link]. In [link], I verified that this has all the important statistical properties of the original text. The method leaves very little scope for hiding any meaning.

In brief, the text appears to have been generated using state transition tables. At each state, a glyph is written. The transition to the following state is then a weighted random choice, possibly decided by drawing a card or two from a shuffled pack, though I'm open-minded about the exact mechanism. This might be a slow and tedious process, but it fits the data. The state transition tables I have used are capable of generating 90% of the original text, but there's no reason that couldn't be improved upon.
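
To illustrate the mechanism described above, here is a minimal Python sketch. The states, glyphs and weights in TABLE are invented for illustration and are not the tables from the analysis; the START and END markers and the generate_word name are likewise assumptions. Each state writes one glyph, and the next state is picked by a weighted random choice.

```python
import random

# Hypothetical toy transition table: state -> list of (next_state, weight).
# Every state except START writes its own name as a glyph; END closes the word.
# These states and weights are made up and are NOT the reverse-engineered tables.
TABLE = {
    "START": [("qo", 3), ("o", 2), ("d", 2), ("ch", 1)],
    "qo": [("k", 3), ("t", 1)],
    "o": [("k", 2), ("l", 1)],
    "ch": [("e", 1), ("o", 1)],
    "k": [("e", 2), ("a", 1)],
    "t": [("e", 1)],
    "e": [("e", 1), ("d", 2)],
    "d": [("y", 2), ("a", 1)],
    "a": [("iin", 2), ("l", 1)],
    "l": [("END", 1)],
    "y": [("END", 1)],
    "iin": [("END", 1)],
}


def generate_word(rng):
    """Walk the table from START to END, writing one glyph per state visited."""
    state = "START"
    glyphs = []
    while True:
        options = TABLE[state]
        next_states = [s for s, _ in options]
        weights = [w for _, w in options]
        # Weighted random choice of the following state.
        state = rng.choices(next_states, weights=weights)[0]
        if state == "END":
            return "".join(glyphs)
        glyphs.append(state)


if __name__ == "__main__":
    rng = random.Random(0)
    print(" ".join(generate_word(rng) for _ in range(10)))
```

In this toy table a word can only end after l, y or iin, so even a dozen states produce words with a recognisable shape. A card draw, as suggested above, would simply be another way of making the weighted choice.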

There are a few loose ends. My method focuses on word generation. Deciding paragraph breaks is still somewhat ad hoc, and I didn't generate labels, though I think I have good reasons to believe they wouldn't present any problems.


RE: An explanation of the Voynich Manuscript text - Davidsch - 10-04-2017

The generated text looks very good: "Figure 4: Word length distribution for generated Voynich Manuscript".

Did you also publish the source code somewhere?


RE: An explanation of the Voynich Manuscript text - Anton - 10-04-2017

Hi Donald, and welcome to the forum!

What about the state transition probabilities? Do you have any idea on how they are pre-determined by the person who generates the text?


RE: An explanation of the Voynich Manuscript text - DonaldFisk - 10-04-2017

(10-04-2017, 04:04 PM)Anton Wrote: Hi Donald, and welcome to the forum!

What about the state transition probabilities? Do you have any idea on how they are pre-determined by the person who generates the text?

No, that's a historical question. I arrived at them through reverse engineering. The only clue I have is that they vary from one part of the manuscript to another, so after generating the first table (possibly by trial and error?), subsequent tables could be generated by modifying the transition probabilities in the previous table. The [link] shows a distribution from herbal through astronomical and text to biological, so I presume the text was generated in that order (or possibly reverse order).


RE: An explanation of the Voynich Manuscript text - nickpelling - 10-04-2017

(10-04-2017, 04:25 PM)DonaldFisk Wrote:
(10-04-2017, 04:04 PM)Anton Wrote: Hi Donald, and welcome to the forum!

What about the state transition probabilities? Do you have any idea on how they are pre-determined by the person who generates the text?

No, that's a historical question. I arrived at them through reverse engineering. The only clue I have is that they vary from one part of the manuscript to another, so after generating the first table (possibly by trial and error?), subsequent tables could be generated by modifying the transition probabilities in the previous table. The [link] shows a distribution from herbal through astronomical and text to biological, so I presume the text was generated in that order (or possibly reverse order).

One of the main criticisms I had of Gordon Rugg's table-and-grille explanation was that it presupposed that a very CompSci mindset was behind it all, which always made what he put forward seem like a modern back projection, i.e. that's how he would have done it had it been him, but 500+ years earlier.

Which is, of course, nonsensical.

More recently, we've had Don collecting the stats for Voynichese into a relatively fat dictionary and saying that this alone is sufficient to explain Voynichese.

Which, of course, it isn't.

Are you really sure that being able to amass state table and PCA statistics is enough to definitively prove that Voynichese is meaningless as opposed to, say, shorthand and/or enciphered and/or visually confounded in some way?

I'd agree that what you've done would be a good support for a proof that Voynichese isn't trivially a straightforward language: but you're knocking at an open door there for me.

I suppose what's missing to my eyes is the structure of a workable proof that what we're looking at isn't just confounded in some cunning (but perhaps not actually very complex) way.

There's a difference between the material to form a proof from and an actual proof, and I don't yet see how you have made the transition from the former to the latter.


RE: An explanation of the Voynich Manuscript text - ReneZ - 10-04-2017

There are many interesting aspects to what you have done, and this can be used for some additional interesting experiments. It will be worth coming back to that later.

On the other hand...
While it may not seem obvious at first sight, I see that Nick caught on to it as well, and your approach is conceptually the same as Gordon Rugg's.
Your result is better than Gordon's in one respect, and worse in another.

You generate text that really looks very much like the Voynich text, much more so than that of Gordon.
On the other hand, Gordon presents a simple means by which it could be generated (in theory), whereas you don't.

Both are methods that try to reverse engineer the Voynich text "as we know it", but the result is not an exact match.
The method (both yours and his) would require further targeted tweaking in order to match the missing bits.
These include:
- to make sure that m appears predominantly at line ends (which is not yet the case)
- to make sure that f and p appear predominantly at first lines of paragraphs (which they don't yet).
- the special properties of the line-initial words ("line as a functional unit")

Having said that, I have a couple of questions.
1) Do I see correctly that you have different state transition probabilities for the different sections of the MS?
2) Is each word started 'from scratch', or do the probabilities continue over word spaces?

The most interesting thing I find is that the Zipf law is followed so well just from the word generation based on state transition probabilities.

There's much more, but it will have to be later.....


RE: An explanation of the Voynich Manuscript text - Anton - 10-04-2017

(10-04-2017, 04:25 PM)DonaldFisk Wrote:
(10-04-2017, 04:04 PM)Anton Wrote: Hi Donald, and welcome to the forum!

What about the state transition probabilities? Do you have any idea on how they are pre-determined by the person who generates the text?

No, that's a historical question. I arrived at them through reverse engineering. The only clue I have is that they vary from one part of the manuscript to another, so after generating the first table (possibly by trial and error?), subsequent tables could be generated by modifying the transition probabilities in the previous table. The [link] shows a distribution from herbal through astronomical and text to biological, so I presume the text was generated in that order (or possibly reverse order).


That, I think, is the key question, because by specifying transition probabilities a priori, one would be able not only to generate a "Voynichese"-like text, but also to mimic a natural language (e.g. English), and any outside observer (say, some extraterrestrial linguist) would have some difficulty deciding whether it's meaningful or meaningless. Bennett in his book gives some examples based on 3rd order letter correlation.
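
The point about mimicking a natural language from letter statistics alone can be sketched in a few lines of Python. This is an illustration only, not Bennett's procedure: it assumes that "3rd order" means each letter is chosen according to the two letters before it, and the function names and the sample sentence are made up.

```python
import random
from collections import defaultdict


def build_model(text, context=2):
    """Map each run of `context` characters to the characters seen after it."""
    model = defaultdict(list)
    for i in range(len(text) - context):
        model[text[i:i + context]].append(text[i + context])
    return model


def generate(model, seed, length, rng):
    """Extend `seed` one character at a time, sampling from observed followers."""
    out = seed
    context = len(seed)
    while len(out) < length:
        followers = model.get(out[-context:])
        if not followers:
            break  # this context was never followed by anything in the sample
        # Followers are stored with repetition, so the choice is frequency-weighted.
        out += rng.choice(followers)
    return out


if __name__ == "__main__":
    # Made-up sample text; a real experiment would use a large corpus.
    sample = "the cat sat on the mat and the rat sat on the hat "
    model = build_model(sample)
    print(generate(model, "th", 60, random.Random(1)))
```

Every three-letter sequence in the output occurs somewhere in the sample, which is why such text looks locally like the source language while carrying no meaning.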

If I'm not mistaken, you imply that the Voynich author (or, rather, in this case, Voynich text-generator) effectively knew (not necessarily literally, but maybe through some "historical mechanism") the statistical characteristics of the text-to-be in advance. Moreover, you simplify his task by dealing not with individual glyphs, but with glyph groups.


RE: An explanation of the Voynich Manuscript text - DonaldFisk - 10-04-2017

(10-04-2017, 05:09 PM)nickpelling Wrote: One of the main criticisms I had of Gordon Rugg's table-and-grille explanation was that it presupposed that a very CompSci mindset was behind it all, which always made what he put forward seem like a modern back projection, i.e. that's how he would have done it had it been him, but 500+ years earlier.

Which is, of course, nonsensical.

More recently, we've had Don collecting the stats for Voynichese into a relatively fat dictionary and saying that this alone is sufficient to explain Voynichese.

Which, of course, it isn't.

Are you really sure that being able to amass state table and PCA statistics is enough to definitively prove that Voynichese is meaningless as opposed to, say, shorthand and/or enciphered and/or visually confounded in some way?

I'd agree that what you've done would be a good support for a proof that Voynichese isn't trivially a straightforward language: but you're knocking at an open door there for me.

I suppose what's missing to my eyes is the structure of a workable proof that what we're looking at isn't just confounded in some cunning (but perhaps not actually very complex) way.

There's a difference between the material to form a proof from and an actual proof, and I don't yet see how you have made the transition from the former to the latter.

If we knew a priori that there was no meaningful information encoded in the text, and the mystery was how the text was generated, how would my solution look then?   You'd then have a variety of theories: Gordon Rugg's, Torsten Timm's, mine, a guy just getting drunk and writing down random words in a specific font, etc.   And you could test them to see which worked best, or at all.

Of course a priori we don't know that.   But in a century, no one has found any convincing evidence of meaningful information encoded in the text.   You might have good reasons to suspect it's there, based on what's known about medieval herbals or alchemical texts or whatever, but that's different from being sure it's there. 

My method can generate at least 90% of the words in the original text, in approximately the same relative frequencies found in the original text, accounts for differences seen in different parts of the manuscript, generates text which breaks Zipf's law in exactly the same way as the original text, and has a similar word length distribution. Consider it at the very least as a null hypothesis. If and when someone finds a better explanation - either one which extracts some overall meaning, or a simpler way of generating meaningless text with the same properties as the original - I'll accept that instead.
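
A coverage figure like the 90% above can be measured in more than one way, and the post does not say which was used. The Python sketch below computes two plausible readings, coverage of distinct words and coverage of running words; the function name and the toy word lists are assumptions for illustration, not data from the analysis.

```python
from collections import Counter


def coverage(original_words, generatable):
    """Return (type coverage, token coverage) of the original by a generator.

    original_words: sequence of words as they occur in the original text.
    generatable: set of words the tables are able to produce.
    """
    counts = Counter(original_words)
    # Share of distinct words that can be generated.
    type_cov = sum(1 for w in counts if w in generatable) / len(counts)
    # Share of running words (occurrences) that can be generated.
    token_cov = (
        sum(c for w, c in counts.items() if w in generatable)
        / sum(counts.values())
    )
    return type_cov, token_cov


if __name__ == "__main__":
    # Toy data only; not taken from the manuscript transcription.
    original = "daiin ol daiin qokeedy chedy daiin".split()
    can_generate = {"daiin", "ol", "qokeedy"}
    print(coverage(original, can_generate))
```

Because frequent words dominate the running text, token coverage is usually the higher of the two numbers, so it matters which one a claim refers to.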


RE: An explanation of the Voynich Manuscript text - DonaldFisk - 10-04-2017

(10-04-2017, 03:16 PM)Davidsch Wrote: The generated text looks very good: "Figure 4: Word length distribution for generated Voynich Manuscript".

Did you also publish the source code somewhere?

I'm happy with publishing my source code, so I'll do that at some point in the near future.   I should warn you that it's written in Emblem, my own dialect of Lisp.   I plan on releasing Emblem for non-commercial use along with Full Metal Jacket (my visual dataflow language), but not yet.


RE: An explanation of the Voynich Manuscript text - DonaldFisk - 10-04-2017

(10-04-2017, 06:40 PM)Anton Wrote: If I'm not mistaken, you imply that the Voynich author (or, rather, in this case, Voynich text-generator) effectively knew (not necessarily literally, but maybe through some "historical mechanism") the statistical characteristics of the text-to-be in advance. Moreover, you simplify his task by dealing not with individual glyphs, but with glyph groups.

Not quite. The author would just need to know the sort of words they wanted to generate, e.g. ykaiin, qokedy, qoteedy, okal, etc. but wouldn't have to care much about their weights, except to damp feedback. Then they would draw up state tables, either directly or through the intermediate stage of graphs, as I did (see figures 1 and 2 in [link]). Beyond the core vocabulary, graphs soon become unwieldy and they'd have to use tables. The word frequencies are just what they ended up with; it could have been otherwise and I'd have reverse-engineered different state transition tables.

I do deal with individual glyphs, though to some extent that depends how you define them.   For example, I tokenize "qokeedy daiin" as "qo+k+e+e+d+y d+a+iin".
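
The tokenization in the example above can be reproduced with a longest-match-first split. The Python sketch below is an illustration, not the code used in the analysis: the list of multi-character glyphs is an assumption (only "qo" and "iin" are implied by the example; "ch" and "sh" are guesses), and anything not in the list is treated as a single-character glyph.

```python
# Assumed multi-character glyphs. Only "qo" and "iin" come from the example
# in the post; "ch" and "sh" are guesses added for illustration.
MULTI_GLYPHS = sorted(["iin", "qo", "ch", "sh"], key=len, reverse=True)


def tokenize(word):
    """Split a transliterated word into glyphs, preferring the longest match."""
    tokens = []
    i = 0
    while i < len(word):
        for glyph in MULTI_GLYPHS:
            if word.startswith(glyph, i):
                tokens.append(glyph)
                i += len(glyph)
                break
        else:
            # No multi-character glyph starts here: take a single character.
            tokens.append(word[i])
            i += 1
    return tokens


def tokenize_text(text):
    """Tokenize each space-separated word and join its glyphs with '+'."""
    return " ".join("+".join(tokenize(w)) for w in text.split())


if __name__ == "__main__":
    print(tokenize_text("qokeedy daiin"))  # prints qo+k+e+e+d+y d+a+iin
```

Which sequences count as one glyph changes the state table, so a different glyph list would give different states and transitions for the same text.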