The Voynich Ninja
An explanation of the Voynich Manuscript text - Printable Version



RE: An explanation of the Voynich Manuscript text - DonaldFisk - 10-04-2017

(10-04-2017, 06:33 PM)ReneZ Wrote: There are many interesting aspects to what you have done, and this can be used for some additional interesting experiments. It will be worth coming back to that later.

On the other hand...
While it may not seem obvious at first sight, I see that Nick caught on to it as well, and your approach is conceptually the same as Gordon Rugg's.
Your result is better than Gordon's in one respect, and worse in another.

You generate text that really looks very much like the Voynich text, much more so than that of Gordon.
On the other hand, Gordon presents a simple means by which it could be generated (in theory), whereas you don't.

Both are methods that try to reverse engineer the Voynich text "as we know it", but the result is not an exact match.
The method (both yours and his) would require further targeted tweaking in order to match the missing bits.
These include:
- to make sure that m appears predominantly at line ends (which is not yet the case)
- to make sure that f and p appear predominantly at first lines of paragraphs (which they don't yet).
- the special properties of the line-initial words ("line as a functional unit")

Having said that, I have a couple of questions.
1) Do I see correctly that you have different state transition probabilities for the different sections of the MS?
2) Is each word started 'from scratch' or do the probabilities continue over word spaces?

The most interesting thing I find is that the Zipf law is followed so well just from the word generation based on state transition probabilities.

There's much more, but it will have to be later.....

Although I knew of Gordon Rugg's theory, I began this work assuming that there was meaningful content in the text, and this was reinforced by the PCA of the pages, which matches the illustrations quite well. I suspect Gordon Rugg's theory, with words as prefix+root+suffix instead of states based on individual glyphs, is equivalent to a special case of my theory.

I am concerned that my method for generating text would be significantly slower, but there might be a mathematically equivalent way of doing it faster. I haven't thought of one, though.

I paid very little attention to the level above words, but I did add checks for word-initial p and f, and when they occur my text generator starts a new line. I should implement this properly, as well as the other known line and paragraph properties.

1) Yes, after running a cluster analysis, I divided the manuscript into 12 separate page clusters (4 herbal, 3 text, 3 bio, 1 pharma, and 1 astro) and each has its own state transition diagram.
2) Yes, each word starts from scratch. If this disagrees with a known property of the text, it would be worth seeing whether several space states work better than one.
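
As a rough illustration of the idea described above (one state per glyph, each word started from scratch), here is a minimal Python sketch. The transition table, its glyphs and its probabilities are invented for the example and are not taken from the papers or from any of the 12 page clusters; "^" and "$" are hypothetical markers for the word-start and space states.

```python
import random

# Hypothetical toy table: state -> list of (next_state, probability).
# "^" is the word-start state, "$" the space state that ends a word.
# All glyphs and probabilities here are invented for illustration only.
TRANSITIONS = {
    "^": [("q", 0.3), ("o", 0.3), ("c", 0.2), ("d", 0.2)],
    "q": [("o", 1.0)],
    "o": [("k", 0.5), ("l", 0.3), ("$", 0.2)],
    "c": [("h", 1.0)],
    "h": [("e", 0.7), ("o", 0.3)],
    "e": [("d", 0.5), ("e", 0.2), ("y", 0.3)],
    "d": [("y", 0.6), ("a", 0.4)],
    "a": [("i", 0.6), ("l", 0.4)],
    "i": [("i", 0.4), ("n", 0.6)],
    "k": [("e", 0.6), ("a", 0.4)],
    "l": [("$", 1.0)],
    "n": [("$", 1.0)],
    "y": [("$", 1.0)],
}


def generate_word(rng):
    """Walk the table from the start state until the space state is reached."""
    state, glyphs = "^", []
    while True:
        next_states, weights = zip(*TRANSITIONS[state])
        state = rng.choices(next_states, weights=weights)[0]
        if state == "$":
            return "".join(glyphs)
        glyphs.append(state)


def generate_text(n_words, seed=0):
    """Generate n_words independent words; no state carries over a space."""
    rng = random.Random(seed)
    return [generate_word(rng) for _ in range(n_words)]


print(" ".join(generate_text(10)))
```

Using a separate table per page cluster would amount to swapping `TRANSITIONS` for another dictionary; replacing the single "$" state with several space states is one way the "from scratch" assumption could be relaxed.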


RE: An explanation of the Voynich Manuscript text - -JKP- - 10-04-2017

You used LISP (or a flavor of it). Cool! I don't have to learn a new language to comprehend it.

I'll take a look at your papers this evening.


RE: An explanation of the Voynich Manuscript text - Anton - 10-04-2017

I forgot to mention that, of course, gallows coverage should be fitted into any "text-generation" proposal, yet it has been fitted into none of them. I assume that's because this phenomenon's statistical properties remain uninvestigated, which is largely because gallows coverage is not represented in any of the available transcriptions.


RE: An explanation of the Voynich Manuscript text - ReneZ - 11-04-2017

Let me add that I don't reject the possibility that the text is meaningless. I consider it possible, and there are some arguments in favour of it.

However, the simulations shown here do not demonstrate this. Let me try to explain why, by using a thought experiment. (This could actually be done in practice).

Take some known text, why not 'Don Quixote'. This is certainly a meaningful text.

Sort all words in descending frequency of occurrence. This produces a list.

Put next to this a list of all words in the Voynich MS, equally sorted in order of descending frequency.


Using the resulting table of word pairs as a translation table, translate Don Quixote word for word into Voynichese. This results in a text using 100% real Voynichese words, but that is meaningful.

Now shuffle all the words in Don Quixote around. (This is easily done with a computer. With paper and a pair of scissors it would take a considerable amount of time.)
The resulting text is certainly meaningless.

Again translate this into Voynichese using the same table as before.

The two texts in Voynichese that we obtained look extremely similar. Nobody would be able to tell whether one or the other is meaningful or not.
Both have exactly the same word length distribution and their 'Zipf graphs' are identical.

As a side note: neither of them would have the properties of Eva-f, p and m mentioned in my earlier post.

Anyway, it is clear that 'meaning' is defined by the order of words, not by the 'method' by which words are created.

An interesting test would be to check for repeating word sequences. This can be predicted with statistics, in principle, but also checked using software that some people have already available.
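
The thought experiment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two word lists are tiny invented stand-ins (a real run would use the full text of Don Quixote and a transcription of the MS), the function names are hypothetical, and the sketch assumes the Voynichese list has at least as many distinct words as the source text.

```python
import random
from collections import Counter


def rank_table(source_words, target_words):
    """Pair words of equal frequency rank (most common with most common, ...)."""
    src = [w for w, _ in Counter(source_words).most_common()]
    tgt = [w for w, _ in Counter(target_words).most_common()]
    return dict(zip(src, tgt))


def translate(words, table):
    """Word-for-word substitution using the rank table."""
    return [table[w] for w in words]


def rank_frequencies(words):
    """The 'Zipf graph' as a list of counts in descending order."""
    return sorted(Counter(words).values(), reverse=True)


# Toy stand-ins, invented for the example.
plain = "the knight saw the giant and the squire saw the windmill".split()
voynich_sample = ("daiin chedy daiin qokeey ol daiin shedy chedy "
                  "qokain daiin ol chol").split()

table = rank_table(plain, voynich_sample)
meaningful = translate(plain, table)

shuffled_plain = plain[:]
random.Random(0).shuffle(shuffled_plain)
meaningless = translate(shuffled_plain, table)

# Same words, same counts, same lengths; only the order differs.
print(rank_frequencies(meaningful) == rank_frequencies(meaningless))
print(sorted(map(len, meaningful)) == sorted(map(len, meaningless)))
```

Because the substitution is one-to-one, any statistic that ignores word order is necessarily identical for the two outputs, which is the point of the thought experiment.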


RE: An explanation of the Voynich Manuscript text - -JKP- - 11-04-2017

(10-04-2017, 09:16 PM)DonaldFisk Wrote: In a 240-page manuscript of meaningless text, you're bound to find those sorts of co-occurrences somewhere. Also, the plants are poorly drawn and few have been positively identified.

...


No, they are not. The root on [link] is the most accurate drawing of a rhizome I have seen in any manuscript dating before 1500. It includes the leaf scars in the correct proportion and spiral around the shoot. It includes dots to show the different texture on each scar in a way that is accurate. It includes the leading edge of the rhizome (the part that creates the next year's shoot) and has made it correctly scaly in contrast to the old part of the rhizome with the leaf scars.

The parts the illustrator cared about, or which are significant for that specific plant, are accurate. It's also the most accurate drawing of Cuscuta for its time that I've encountered so far. Most illustrators drew a bunch of scraggly lines. They didn't indicate the parasitic nature of the plant in any meaningful way (the way it intrudes into the roots), nor did they show the individual flowers as was done in the VMS.

Tragopogon is also very accurate and completely recognizable. If people don't recognize it, it's because they don't know the plant.

The reason some can't be identified is that they lack a detail or two that is needed to distinguish between one species and another, but it wouldn't matter to the person using it because they would know beforehand which one it was. Take Viola as an example. There are three very similar species that could be represented by this drawing, so there's not enough detail to know which one, but it is Viola and it's quite well drawn.

The reason others can't be identified is because there are often two or three species with the same characteristics (like a basal whorl and parallel veins) but again, if the animal nibbling on the leaf is mnemonic and the person using the manuscript knew what it meant (or created the mnemonic in the first place), then they will have no trouble recognizing the plant.


If you understand plants, together with the VMS way of doing things, you gain respect for the plant drawings.



I'm pretty confident that the plant drawings are meaningful. That doesn't guarantee the text is meaningful because it may have been added by someone else for a different purpose, or at a later time, but one can't use the "crudeness" of the plant drawings (they are not crude, they're pretty good) as an argument for anything about the text.


RE: An explanation of the Voynich Manuscript text - Torsten - 19-04-2017

(11-04-2017, 06:03 AM)ReneZ Wrote: Let me add that I don't reject the possibility that the text is meaningless. I consider it possible, and there are some arguments in favour of it.

However, the simulations shown here do not demonstrate this. Let me try to explain why, by using a thought experiment. (This could actually be done in practice).

Take some known text, why not 'Don Quixote'. This is certainly a meaningful text.

Sort all words in descending frequency of occurrence. This produces a list.

Put next to this a list of all words in the Voynich MS, equally sorted in order of descending frequency.


Using the resulting table of word pairs as a translation table, translate Don Quixote word for word into Voynichese. This results in a text using 100% real Voynichese words, but that is meaningful.

Now shuffle all the words in Don Quixote around. (This is easily done with a computer. With paper and a pair of scissors it would take a considerable amount of time.)
The resulting text is certainly meaningless.

Again translate this into Voynichese using the same table as before.

The two texts in Voynichese that we obtained look extremely similar. Nobody would be able to tell whether one or the other is meaningful or not.
Both have exactly the same word length distribution and their 'Zipf graphs' are identical.

In the first case the word order of the original text is unchanged. Since repeated word sequences are typical for any language it is possible to find repeated sequences here. In the English version of Don Quixote (see [link]) it is for instance possible to find phrases like "if I had" (24 times), "if we had" (7 times), "it may be" (61 times), "what you say" (7 times) or "the same thing" (14 times). Replacing each word with a different one leaves the word order unchanged. Since it is very unlikely to find shuffled variants like "thing the same" for phrases in a natural language, the same would be true for the text using replaced words.

For the shuffled text repeated phrases only occur coincidentally. Therefore repeated phrases would rarely occur. Moreover, every ordering of a given set of words is equally likely. Therefore it is expected that repeated words will also occur in different word orders. The existence of repeated phrases with a fixed word order is therefore a feature that allows us to distinguish between both types of texts. That the order of words matters for the first text indicates that the words transport some meaning.

In the case of the VMS repeated phrases are rare [see link]. Words used together multiple times also occur in different orders. In this way the text of the VMS is similar to the text with shuffled words. But the chance that two words occur together multiple times is increased if the words are the same or similar to each other. See for instance phrases like [chedy qokeey qokeey], [qokeey qokeey chedy] and [qokeey chedy qokeey] [see link]. Therefore the order of words in the VMS is not purely random. Another observation is that the word after a word ending with [y] has a higher chance of starting with [q]. It is also possible to predict to some extent the position within a paragraph or within a line of words starting or ending with glyphs typical for that position. Even if a word does not depend on the previous words, as would be expected for a natural language, the words in the VMS depend to some extent on their context and on their position within a page or line. Therefore the text of the VMS differs from both types of text mentioned by Rene. There is no word order as expected for a text using natural language, but the position of words is also not random.
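
The distinction drawn above, between phrases repeated in a fixed order and the same words recurring in different orders, could be checked along these lines. This is a hedged Python sketch with hypothetical function names; the two word sequences are toy inputs built for the example (the second simply strings together the three phrases quoted above with a filler word) and are not manuscript statistics.

```python
from collections import Counter


def repeated_ngrams(words, n=3):
    """N-grams that occur more than once in exactly the same word order."""
    counts = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return {gram: c for gram, c in counts.items() if c > 1}


def order_variants(words, n=3):
    """Sets of words that occur as an n-gram in more than one distinct order."""
    groups = {}
    for i in range(len(words) - n + 1):
        gram = tuple(words[i:i + n])
        groups.setdefault(tuple(sorted(gram)), set()).add(gram)
    return {key: grams for key, grams in groups.items() if len(grams) > 1}


# Toy natural-language input: a phrase repeated in a fixed order.
english = "it may be so and it may be not".split()

# Toy Voynichese-like input, assembled from the three phrases quoted above.
voynich_like = ("chedy qokeey qokeey ol qokeey qokeey chedy "
                "ol qokeey chedy qokeey").split()

print(repeated_ngrams(english))       # fixed-order repeats
print(repeated_ngrams(voynich_like))  # none in this toy sequence
print(order_variants(voynich_like))   # same words, several orders
```

On a full transcription one would compare both measures against a shuffled copy of the same text, since some order variants are expected by chance alone.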


RE: An explanation of the Voynich Manuscript text - -JKP- - 19-04-2017

Repeated sequences can be explained in a number of ways. Some of the more obvious ones have been mentioned on the forum and on blogs (poetry, emphasis, exaggeration, etc.).

But repeated sequences can also be explained by manipulation of spaces, and I haven't seen this discussed at any length... most of the statistical and linguistic studies of the VMS retain the spaces, so consider this:

     Many a man manipulates a management assessment.
     Man ya man man ipula tesa man age men tass ess ment.


First, note that similarity to Voynichese is increased in the second example, without shuffling a single letter.

Then note that in the second example, not only does "man" occur three times in a short space, but tokens like "ment" will likely occur frequently in any manuscript because they are common suffixes. Also note that the consonant-vowel balance changes, with twice as many vowels at the end if words are broken this way.
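
The two lines of the example can be compared mechanically. The following Python sketch (function names are hypothetical, and treating "y" as a consonant is an assumption of the sketch) checks that the letter sequence is untouched and counts how the token statistics shift when only the spaces move.

```python
import re

original = "Many a man manipulates a management assessment."
respaced = "Man ya man man ipula tesa man age men tass ess ment."


def tokens(text):
    """Lower-case word tokens, ignoring punctuation."""
    return re.findall(r"[a-z]+", text.lower())


def letters(text):
    """The letter sequence with all spaces and punctuation removed."""
    return "".join(tokens(text))


def count_token(text, token):
    """How often a given token occurs as a whole word."""
    return tokens(text).count(token)


def vowel_final(text):
    """Number of tokens ending in a vowel (y is treated as a consonant here)."""
    return sum(t[-1] in "aeiou" for t in tokens(text))


print(letters(original) == letters(respaced))  # no letter was moved
print(count_token(original, "man"), count_token(respaced, "man"))
print(vowel_final(original), vowel_final(respaced))
```

With "y" counted as a consonant, the number of vowel-final tokens goes from 2 to 4, which matches the "twice as many" remark above.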


This may be an extreme example (although poetry often has these properties), but it should be enough to illustrate that there may be a rule set that breaks words at specific points to put certain glyphs at the beginning and, to balance it (since it's hard to manipulate both the beginnings and ends of short words at the same time), nulls at the ends (the ends of Vwords are repetitious in the extreme with an unusually high number of EVA-y).


I'm not arguing that Voynichese is natural language split into pieces. It's possible, but the text is too rule-based and constrained for it to seem likely. I do think, however, that statistical modeling and breakdowns will have to consider that spaces may be selectively chosen (or be a result of the glyph selection process), thus resulting in certain patterns that seem repetitious for reasons other than their actual method of construction.


RE: An explanation of the Voynich Manuscript text - Torsten - 21-04-2017

(10-04-2017, 02:50 PM)DonaldFisk Wrote: I have also worked out, in detail, the general method by which the text must have been generated.
In brief, the text appears to have been generated using state transition tables.

Letters are highly predictable within words for the VMS. Therefore the idea of describing words using transition tables or a grammar is not far-fetched (see [link]). But that you can describe the words this way doesn't mean that the author of the VMS must have generated the words this way.
In the case of the VMS the exceptions are important. For every rule you can define you will also find some exceptions. An example is the glyph [q]. In most cases [q] is the first letter of a word and in most cases [q] is followed by the letter [o]. But this is not always the case. There are 22 words starting with [oq], 66 words using [qe] instead of [qo], and 6 words using [qa] instead of [qo]. So besides the word [qokar] (157 times) you can also find the words [qekar] (2 times) and [qakar] (1 time). The problem with these exceptions is that they occur in a systematic way: there is an increased chance of finding two or more exceptions of the same kind within a short distance on the same page. There are for instance only 22 words with [oq] in the whole manuscript, but in line f114r.P2.41 there is the sequence [oqotoiiin oqoeeosain]. There are also only three instances of the word [qeey], but two of these three occur on page f112r, only two lines apart.

Your test about word order (see [link]) only demonstrates that a word doesn't depend on the previous word. Unfortunately this observation alone is not enough to allow the conclusion "that words are output randomly". It is also possible that a word depends on its position within a line, that a word depends on a word in the previous line, or that a part of a word depends on a part of the previous word. For the VMS it is possible to give examples of each: Words ending with [m] are most likely found as the last word of a line. There is an increased chance of finding the same word in the same position in the next line (see the word [daiin] on [link]). Last but not least, there is also an increased chance that a word ending with [y] is followed by a word starting with [q].
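
Two of the dependencies listed above (a [y]-final word followed by a [q]-initial word, and [m]-final words at line ends) can be measured with a few lines of Python. This is a sketch only: the function names are hypothetical, and the three sample lines are invented to show the calculation, so the numbers they produce say nothing about the manuscript. A real test would feed in the lines of a transcription.

```python
def y_q_rates(lines):
    """Return (P(next starts with q | word ends with y), P(next starts with q)).

    Only word pairs within the same line are considered.
    """
    pairs = [(a, b) for line in lines for a, b in zip(line, line[1:])]
    after_y = [b for a, b in pairs if a.endswith("y")]
    if not pairs or not after_y:
        return 0.0, 0.0
    conditional = sum(b.startswith("q") for b in after_y) / len(after_y)
    baseline = sum(b.startswith("q") for _, b in pairs) / len(pairs)
    return conditional, baseline


def final_m_share(lines):
    """Fraction of m-final words that are the last word of their line."""
    at_end = [i == len(line) - 1
              for line in lines
              for i, word in enumerate(line)
              if word.endswith("m")]
    return sum(at_end) / len(at_end) if at_end else 0.0


# Invented sample lines, for illustration only (not transcribed text).
sample_lines = [
    "chedy qokeey daiin shedy qokain dam".split(),
    "qokeedy qokal ol chedy qotedy otam".split(),
    "daiin chey qokar dar aly dam".split(),
]

conditional, baseline = y_q_rates(sample_lines)
print(conditional, baseline, final_m_share(sample_lines))
```

If the conditional rate is clearly above the baseline on real data, the word after a [y]-final word is not independent of its predecessor, even if whole-word order looks random.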


RE: An explanation of the Voynich Manuscript text - Davidsch - 21-04-2017

In any language letters are predictable. In a sentence words are predictable if you look at the SVO or VSO etc. word order and grammar.
In Voynichese it is exactly the same: letters are predictable and word sequences as well.

There are some differences.

Firstly, the letters which are predictable must follow predictable paths for all letters, just a few, or none.
There is no language where that happens; in Voynichese, however, there are many letters that behave unpredictably, or as you wrote "have exceptions".

Also the word sequences are predictable. However, this is only valid for some words. There is still a huge number of words that are unpredictable.
That happens when for example it's a poem, a song or a scientific text on a specific subject in normal languages.

Lastly, the composition of words is monotonic (or monotone) and that is also very strange.
Even if it is a ciphered text, the words still have to follow cipher rules, which would still make the composition of words vary more than it does now.
The only explanations I have for that are that a) the cipher works in such a way that the result is a similar composition of words, b) the cipher was wrongly composed and that is the reason it cannot be deciphered, c) those words are mainly full of nulls and the repeated letters are just nulls, or d) the letters are highly abbreviated, for example o=oleo, i=item etc., but then the spaces make no sense.


RE: An explanation of the Voynich Manuscript text - Torsten - 21-04-2017

(21-04-2017, 12:35 PM)Davidsch Wrote: In any language letters are predictable. In a sentence words are predictable if you look at the SVO or VSO etc. word order and grammar.
In Voynichese it is exactly the same: letters are predictable and word sequences as well.


For letters you are arguing against something never said. There was simply no thesis that letters are not predictable.

Your statement about the word order is obviously wrong. Sequences of repeated words that use the same word order every time are missing from the VMS. Since they are missing, there is no way to argue that the order of words is predictable.