The Voynich Ninja
An explanation of the Voynich Manuscript text - Printable Version



RE: An explanation of the Voynich Manuscript text - Emma May Smith - 23-04-2017

(23-04-2017, 09:49 PM)DonaldFisk Wrote:
(23-04-2017, 09:26 PM)Emma May Smith Wrote: You haven't provided evidence, even you don't believe that. You've simply provided a way to generate a very similar text. Just because you can make a text in that way does not mean the author did. This very same discussion was had with Rugg's theory. He beat you to the punch by over a decade, and I don't see what you've added to his theory.

No, I have provided evidence.   One of the problems I have is that most people don't know what random looks like, and attach meaning to coincidences.

My theory is completely different from Gordon Rugg's.   Did you read and understand my blog articles?

Yes, I read them. You modeled the Voynich text. You didn't explain it.


RE: An explanation of the Voynich Manuscript text - nickpelling - 23-04-2017

Looking more closely at your pages, does it not concern you that your clustering may be broken?

Many previous clustering studies have found a strong clustering effect between recto and verso sides of the same folio (with the exception of the nine rosette foldout folio, as Rene Z reminded me recently), yet this is apparently not as noticeable in your clustering, as far as I can see.

Clustering via k-means or any of the multitude of funky variants it spawned is only as good as the dimensions of data that go in, which suggests to me that you may not have got this step quite right. Just sayin'.


RE: An explanation of the Voynich Manuscript text - DonaldFisk - 23-04-2017

(23-04-2017, 10:58 PM)nickpelling Wrote: Looking more closely at your pages, does it not concern you that your clustering may be broken?

Many previous clustering studies have found a strong clustering effect between recto and verso sides of the same folio (with the exception of the nine rosette foldout folio, as Rene Z reminded me recently), yet this is apparently not as noticeable in your clustering, as far as I can see.

Clustering via k-means or any of the multitude of funky variants it spawned is only as good as the dimensions of data that go in, which suggests to me that you may not have got this step quite right. Just sayin'.

Are you referring to (link)? It didn't use PCA. It looks like it used an ad-hoc approach.

I think my algorithms are correct. I used three standard ones: PCA, k-means and minimum spanning tree. The PCA plot in (link) agrees with the one on Sarah Goslee's (link). I then considered separately the herbal, biological, and text pages, and modified the herbal clusters as I explained in (link). For clustering, I used the first two principal components.

In any case, clustering is very much a black art.   It might be worth checking whether the folio mismatches are with nearby points assigned to different clusters (the so-called Copenhagen-Malmo problem), in which case there would be a good case for reassigning some points to different clusters.
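
One way to run the check suggested above is to list the recto/verso pairs that fall in different clusters and sort them by how far apart they are in the plane used for clustering. The sketch below is only an illustration: the function name `split_pairs`, the folio labels, the coordinates and the cluster labels are all made up, and nothing here reproduces the actual analysis.

```python
import math

def split_pairs(coords, labels, pairs):
    """List page pairs that fall in different clusters, closest pairs first."""
    found = []
    for a, b in pairs:
        if labels[a] != labels[b]:
            found.append((a, b, math.dist(coords[a], coords[b])))
    return sorted(found, key=lambda item: item[2])

# Hypothetical values: positions on the first two principal components
# and k-means cluster labels. None of these numbers come from the manuscript.
coords = {"f1r": (0.0, 0.0), "f1v": (0.3, 0.1), "f2r": (5.0, 5.0), "f2v": (0.1, 0.2)}
labels = {"f1r": 0, "f1v": 1, "f2r": 1, "f2v": 0}
pairs = [("f1r", "f1v"), ("f2r", "f2v")]

result = split_pairs(coords, labels, pairs)
for a, b, distance in result:
    print(a, b, round(distance, 3))
```

Pairs at the top of the list are close neighbours that a cluster boundary happens to separate, so they are candidates for reassignment. Pairs at the bottom are genuinely far apart, and moving a boundary would not bring them together.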


RE: An explanation of the Voynich Manuscript text - Torsten - 24-04-2017

(23-04-2017, 04:38 PM)DonaldFisk Wrote: It's based on the evidence that it's random.   That of itself doesn't prove it's meaningless.   Any good cipher nowadays is random.   However, I'm highly skeptical of it having meaning as I can't imagine anyone back then encrypting it to make it look random, then applying a different process to make it look like an unknown language.

I think we are talking about different features. It is indeed not possible to say which words will follow each other (see link). In this way words in the VMS are not used like words in a natural language. On the other hand we have effects like the one described by Currier: "Words ending in the [y] sort of symbol, which is very frequent, are followed about four times as often by words beginning with [qo]" (link). We also have effects such as words starting or ending with a certain glyph being preferred at the start or end of a line. These features would fit with a meaningless text, since it is possible to describe patterns for the usage of glyphs but not for the usage of words.
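
An effect like the y-qo one can be measured by comparing how often the combination occurs at word boundaries with how often it would occur if word endings and word beginnings were independent. The following is a minimal sketch under that assumption; the function name `boundary_ratio` and the three toy lines are invented for illustration, and Currier's own baseline for "four times as often" may have been defined differently.

```python
def boundary_ratio(lines, end, start):
    """Observed / expected count of 'word ending in `end`' followed by
    'word beginning with `start`', counting only pairs within a line."""
    pairs = [(a, b) for line in lines for a, b in zip(line, line[1:])]
    if not pairs:
        return float("nan")
    n_end = sum(a.endswith(end) for a, _ in pairs)
    n_start = sum(b.startswith(start) for _, b in pairs)
    n_both = sum(a.endswith(end) and b.startswith(start) for a, b in pairs)
    # Expected count if endings and beginnings were independent.
    expected = n_end * n_start / len(pairs)
    return n_both / expected if expected else float("nan")

# Toy lines made up for the example, not a transcription of any page.
toy_lines = [
    ["chedy", "qokeedy", "daiin"],
    ["shedy", "qokain", "chol"],
    ["daiin", "chol", "chor"],
]
ratio = boundary_ratio(toy_lines, "y", "qo")
print(ratio)
```

A ratio near 1 would mean no preference. The same function can be run for any other final/initial glyph combination.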


Quote:I don't think there's any conflict between our views that the Voynich Manuscript changes over time, as the manuscript was written, from Currier A to Currier B, or that my evidence in any way conflicts with that found by Montemurro and Zanette.


You assumed different transition tables for each part of the manuscript. How can your transition table or tables result in a steady change over time? How do you simulate the effect that similar words do co-occur on the same pages? Did you suggest that a different transition table was used for each page?


Quote:I don't have access to the full text, but Schinner's abstract states "The results significantly tighten the boundaries for possible interpretations; they suggest that the text has been generated by a stochastic process rather than by encoding or encryption of language."   This is what I've been saying.


You say more than that the text has been generated by a stochastic process rather than by encoding or encryption of language. You say that the text has been "generated randomly using state transition diagrams" (see link). If this is the case, why does the position of a word have some effect on the glyphs used for that word?


RE: An explanation of the Voynich Manuscript text - DonaldFisk - 24-04-2017

(24-04-2017, 12:23 AM)Torsten Wrote: I think we are talking about different features. It is indeed not possible to say which words will follow each other (see link). In this way words in the VMS are not used like words in a natural language. On the other hand we have effects like the one described by Currier: "Words ending in the [y] sort of symbol, which is very frequent, are followed about four times as often by words beginning with [qo]" (link). We also have effects such as words starting or ending with a certain glyph being preferred at the start or end of a line. These features would fit with a meaningless text, since it is possible to describe patterns for the usage of glyphs but not for the usage of words.

Yes, you're right about y-qo occurring more often than you'd expect by chance, so I've updated my page on (link). That was drowned out in the noise.   I think it could still be handled by a state machine, with transitions from the final glyph state to the initial glyph state.


Quote:You assumed different transition tables for each part of the manuscript. How can your transition table or tables result in a steady change over time? How do you simulate the effect that similar words do co-occur on the same pages? Did you suggest that a different transition table was used for each page?

The probabilities in the tables change by small amounts from one page cluster to the nearest one.
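
As a rough illustration of tables that differ by small amounts, the sketch below blends two glyph transition tables linearly and generates words from the result. Everything in it is assumed for the example: the tables `TABLE_A` and `TABLE_B`, their probabilities, the start and end markers `^` and `$`, and the use of linear interpolation, which the post does not specify. It also assumes both tables list exactly the same states and successors.

```python
import random

def blend(table_a, table_b, t):
    """Interpolate two transition tables; t=0 gives table_a, t=1 gives table_b.
    Assumes both tables have identical states and successors."""
    return {
        state: {nxt: (1 - t) * table_a[state][nxt] + t * table_b[state][nxt]
                for nxt in table_a[state]}
        for state in table_a
    }

def generate_word(table, rng, max_len=10):
    """Walk from the start state '^' until the end state '$' is drawn."""
    state, glyphs = "^", []
    while len(glyphs) < max_len:
        successors = list(table[state])
        weights = [table[state][s] for s in successors]
        state = rng.choices(successors, weights=weights)[0]
        if state == "$":
            break
        glyphs.append(state)
    return "".join(glyphs)

# Invented toy tables; the probabilities are not estimated from the manuscript.
TABLE_A = {"^": {"o": 0.7, "k": 0.3}, "o": {"k": 0.8, "y": 0.2},
           "k": {"o": 0.3, "y": 0.5, "$": 0.2}, "y": {"$": 1.0}}
TABLE_B = {"^": {"o": 0.5, "k": 0.5}, "o": {"k": 0.6, "y": 0.4},
           "k": {"o": 0.1, "y": 0.6, "$": 0.3}, "y": {"$": 1.0}}

mid = blend(TABLE_A, TABLE_B, 0.5)
rng = random.Random(1)
words = [generate_word(mid, rng) for _ in range(5)]
print(words)
```

Stepping `t` from 0 to 1 across neighbouring clusters gives a gradual drift in word statistics. It does not by itself produce page-level effects, because every word is still drawn independently of its position.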


Quote:You say more than that the text has been generated by a stochastic process rather than by encoding or encryption of language. You say that the text has been "generated randomly using state transition diagrams" (see link). If this is the case, why does the position of a word have some effect on the glyphs used for that word?

I suggested solving this by a preprocessing step.


RE: An explanation of the Voynich Manuscript text - Torsten - 24-04-2017

(24-04-2017, 01:32 AM)DonaldFisk Wrote: Yes, you're right about y-qo occurring more often than you'd expect by chance, so I've updated my page on (link). That was drowned out in the noise.   I think it could still be handled by a state machine, with transitions from the final glyph state to the initial glyph state.

The y-qo pattern is only an example. In his paper from 2012 (link), Vogt has for instance demonstrated three effects: 1. The first word of a line is longer than average, 2. The second word is shorter, 3. Over the course of the line, the average word length drops. It is possible to name numerous other patterns for the VMS.
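
The three line effects can be checked with a simple tabulation of mean word length by position in the line. This is a minimal sketch, not the method of the 2012 paper: the function name `mean_length_by_position` and the two toy lines are invented, and word length is counted in transcription characters, which depends on the transcription alphabet used.

```python
from collections import defaultdict

def mean_length_by_position(lines):
    """Mean word length for each word position in a line (1 = first word)."""
    totals = defaultdict(lambda: [0, 0])  # position -> [sum of lengths, count]
    for line in lines:
        for pos, word in enumerate(line, start=1):
            totals[pos][0] += len(word)
            totals[pos][1] += 1
    return {pos: total / count for pos, (total, count) in sorted(totals.items())}

# Toy lines made up for the example.
toy_lines = [["qokeedy", "ol", "chedy"], ["sheedy", "or", "dal"]]
means = mean_length_by_position(toy_lines)
print(means)
```

A generator that draws every word from the same table regardless of position would be expected to give a flat profile here, apart from noise.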


Quote:The probabilities in the tables change by small amounts from one page cluster to the nearest one.

Sorry, but small changes between page clusters don't explain the differences between individual pages. It is even possible to describe differences between the front and back side of a single sheet. Words common on one page can be rare on the next page. See for instance the differences I have described for the first three quires (link).


Quote:I suggested solving this by a preprocessing step.


Why should someone add position-based rules to randomly generated words? Such a step would mean additional work has to be done. I don't see any advantage in this effort if the text has no meaning anyway.

You can simulate the words of the VMS with a state transition diagram. This is no surprise, since it is always possible to build a state transition diagram for a given set of words. But this doesn't mean that a state transition diagram is an effective method to generate a text if no computer is available. As far as I can see, the process you describe is a more detailed variant of Gordon Rugg's approach. The problem is that much more effort is needed to make your method work. You have to roll the dice for every glyph with probabilities depending on the previous glyphs. Moreover, the probabilities must change over time. Gordon Rugg has simulated this with different tables; you simulate it with different transition diagrams. But this is not enough, since you both have to add additional steps to your methods to explain that the text corresponds to its container. In my eyes your method is not efficient enough to generate a huge amount of text manually.


RE: An explanation of the Voynich Manuscript text - DonaldFisk - 24-04-2017

(24-04-2017, 10:02 AM)Torsten Wrote: The y-qo pattern is only an example.

There might well be other final-initial glyph frequencies significantly above or below chance.   This is worth checking.   I can accommodate it by having a transition from final state to initial state.   This might make the Gamma distribution I observed for word pairs moot.

Quote:In his paper from 2012 (link), Vogt has for instance demonstrated three effects: 1. The first word of a line is longer than average, 2. The second word is shorter, 3. Over the course of the line, the average word length drops. It is possible to name numerous other patterns for the VMS.

I'll have to examine the Vogt result in more depth.   It's harder for me to explain, and I don't have any ideas at present.   If there's anything else you think I'm unaware of, please let me know.

Quote:Sorry, but small changes between page clusters don't explain the differences between individual pages. It is even possible to describe differences between the front and back side of a single sheet. Words common on one page can be rare on the next page. See for instance the differences I have described for the first three quires (link).

Different sides of the same folio often lie in different clusters, so different word frequencies aren't a problem.

Quote:
Quote:I suggested solving this by a preprocessing step.

I meant postprocessing step.

Quote:You can simulate the words of the VMS with a state transition diagram. This is no surprise, since it is always possible to build a state transition diagram for a given set of words. But this doesn't mean that a state transition diagram is an effective method to generate a text if no computer is available. As far as I can see, the process you describe is a more detailed variant of Gordon Rugg's approach. The problem is that much more effort is needed to make your method work. You have to roll the dice for every glyph with probabilities depending on the previous glyphs. Moreover, the probabilities must change over time. Gordon Rugg has simulated this with different tables; you simulate it with different transition diagrams. But this is not enough, since you both have to add additional steps to your methods to explain that the text corresponds to its container. In my eyes your method is not efficient enough to generate a huge amount of text manually.

Generating text by hand using state transition tables in the obvious way would be time-consuming, even slower than Gordon Rugg's method.   But I haven't put much effort into finding a fast way of doing it, and my words are more realistic.

As far as I know, Gordon Rugg was the first to suggest a well-defined method by which meaningless text could be generated which resembles the Voynich text.   His approach is probably a subset of mine, but they are not equivalent.   Mine is more general, and I arrived at it from a different starting point.

So there's still a lot of work to be done to make a watertight theory.


RE: An explanation of the Voynich Manuscript text - Torsten - 24-04-2017

(24-04-2017, 07:01 PM)DonaldFisk Wrote: If there's anything else you think I'm unaware of, please let me know.

(link) describes vertical patterns for the VMS. Emma May Smith has described an example of this kind (link). The word [otaiin] is used on page (link) six times, each time as either the third or second to last word in the line. Another example of this vertical pattern is the usage of [chor] on page (link).
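
A pattern of this kind can be tabulated by recording, for every occurrence of a word on a page, its position counted from the end of the line. The sketch below is only an illustration: the function name `positions_from_end` and the toy page are invented and do not reproduce the page discussed above.

```python
def positions_from_end(lines, word):
    """Position of each occurrence of `word`, counted from the line end
    (1 = last word in the line, 2 = second to last, and so on)."""
    return [len(line) - i
            for line in lines
            for i, w in enumerate(line)
            if w == word]

# Toy page made up for the example.
toy_page = [
    ["daiin", "otaiin", "chol"],
    ["chor", "shol", "otaiin", "dal", "dy"],
]
positions = positions_from_end(toy_page, "otaiin")
print(positions)
```

Whether a run of identical positions is remarkable depends on line lengths and on how many words and pages were searched before the pattern was found.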

Quote:Different sides of the same folio often lie in different clusters, so different word frequencies aren't a problem.

On every page something new is happening. All you have to do is to search for rarely used glyph combinations. You will always find hard-to-explain patterns for them. See for instance the usage of [ll] on page (link).

The glyph sequence [on] only exists 5 times within the VMS. But three of these words occur on page (link).

Normally the [m]-glyph is used as the last glyph in a line. But not on page (link). On this page even two paragraph-initial words end with [m].


Quote:Generating text by hand using state transition tables in the obvious way would be time-consuming, even slower than Gordon Rugg's method.   But I haven't put much effort into finding a fast way of doing it, and my words are more realistic.

As far as I know, Gordon Rugg was the first to suggest a well-defined method by which meaningless text could be generated which resembles the Voynich text.   His approach is probably a subset of mine, but they are not equivalent.   Mine is more general, and I arrived at it from a different starting point.

So there's still a lot of work to be done to make a watertight theory.

That it is possible to describe the words in the VMS using your state transition table is indeed an interesting observation. But by describing the words in the VMS you haven't described the manuscript. Anyway, it is one thing to describe something and another to give an efficient method to reproduce it.


RE: An explanation of the Voynich Manuscript text - DonaldFisk - 25-04-2017

(24-04-2017, 11:34 PM)Torsten Wrote:
(24-04-2017, 07:01 PM)DonaldFisk Wrote: If there's anything else you think I'm unaware of, please let me know.

(link) describes vertical patterns for the VMS. Emma May Smith has described an example of this kind (link). The word [otaiin] is used on page (link) six times, each time as either the third or second to last word in the line. Another example of this vertical pattern is the usage of [chor] on page (link).

Quote:Different sides of the same folio often lie in different clusters, so different word frequencies aren't a problem.

On every page something new is happening. All you have to do is to search for rarely used glyph combinations. You will always find hard-to-explain patterns for them. See for instance the usage of [ll] on page (link).

The glyph sequence [on] only exists 5 times within the VMS. But three of these words occur on page (link).

Normally the [m]-glyph is used as the last glyph in a line. But not on page (link). On this page even two paragraph-initial words end with [m].

These look like coincidences to me.   You have about 240 pages of text.   Suppose for the sake of argument the words are random.   Then you'll find places where you get the same word repeated several times, or in the same line position, or similar words close together.   But (since it's random) claiming significance in this is like seeing faces in clouds, or hearing voices in static.    People are predisposed to spotting patterns.
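
Whether a concentration such as "three of five occurrences on one page" is a coincidence can be estimated with a permutation test: shuffle all word tokens across the pages many times and count how often the shuffled text shows a concentration at least as strong. The sketch below assumes that setup; the function names, the filler vocabulary and the rare word `xon` are invented, and the page sizes are arbitrary.

```python
import random

def max_page_count(pages, word):
    """Largest number of occurrences of `word` on any single page."""
    return max(page.count(word) for page in pages)

def permutation_p_value(pages, word, trials=2000, seed=0):
    """Share of random reshufflings whose maximum per-page count of `word`
    is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = max_page_count(pages, word)
    tokens = [w for page in pages for w in page]
    sizes = [len(page) for page in pages]
    hits = 0
    for _ in range(trials):
        rng.shuffle(tokens)
        shuffled, start = [], 0
        for size in sizes:  # rebuild pages with their original sizes
            shuffled.append(tokens[start:start + size])
            start += size
        if max_page_count(shuffled, word) >= observed:
            hits += 1
    return hits / trials

# Toy corpus: 20 pages of 50 filler words, with a rare word placed
# three times on one page and once on two others.
pages = [["w%d" % (i % 7) for i in range(50)] for _ in range(20)]
pages[0][:3] = ["xon"] * 3
pages[5][0] = "xon"
pages[9][0] = "xon"

p = permutation_p_value(pages, "xon")
print(p)
```

A small value for one word is not decisive on its own. If many rare words are searched and only the striking ones reported, some low values are expected by chance, so the number of words examined has to be taken into account.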

There are indeed some real patterns, which I didn't know about, and haven't accounted for.   -y qo- suggests that there might be glyph pairs which co-occur more or less often than chance.   My state transition model can handle that.   The Vogt results are more problematic.

Quote:That it is possible to describe the words in the VMS using your state transition table is indeed an interesting observation. But by describing the words in the VMS you haven't described the manuscript. Anyway, it is one thing to describe something and another to give an efficient method to reproduce it.

So, not entirely negative.

What I hope is that my analysis is useful, even for those who don't accept my conclusions, which may still change when I see evidence that they should.

Having worked on this for about three months, I'm going to take a break.   I'll return to it later though, as there's a lot of work still to be done.   During the work on the Voynich Manuscript, I've been writing and testing various statistical and text-processing algorithms, which are generally useful.   I have other things to do now, such as to resume work on my programming language/IDE (link).


RE: An explanation of the Voynich Manuscript text - ReneZ - 25-04-2017

(23-04-2017, 10:08 PM)DonaldFisk Wrote: I carried out this test, and confirmed that words in the manuscript are independent of the previous word (see link).   The mean frequency of word pairs was very close to the expected probability (the product of the probabilities of the two words considered separately), but the frequency's variance was very close to the square of the mean (i.e. a Gamma distribution; in my generated text it was a Poisson distribution), suggesting that the mechanism I initially suggested for deciding on transition paths, and only that, was wrong.   As far as I know, no one had spotted this before.
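
The comparison described in the quote, observed word-pair counts against the product of the individual word probabilities, can be sketched as follows. This is an assumed reconstruction for illustration only: the function name `pair_independence` and the toy word sequence are invented, and the quoted test may have been set up differently.

```python
from collections import Counter

def pair_independence(words):
    """For each adjacent word pair that occurs, return
    (first, second, observed count, count expected under independence)."""
    n_pairs = len(words) - 1
    first = Counter(words[:-1])   # how often each word appears as the first of a pair
    second = Counter(words[1:])   # how often each word appears as the second of a pair
    observed = Counter(zip(words, words[1:]))
    return [(a, b, obs, first[a] * second[b] / n_pairs)
            for (a, b), obs in observed.items()]

# Toy sequence made up for the example.
toy_words = "daiin chol daiin chol daiin chor".split()
rows = pair_independence(toy_words)
for a, b, obs, exp in rows:
    print(a, b, obs, round(exp, 2))
```

Only pairs that actually occur are listed, so pairs with an observed count of zero are left out. For counts that follow a Poisson distribution the variance equals the mean, while a variance close to the square of the mean corresponds to a Gamma distribution with shape parameter 1.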

I'm sorry but this is completely inconclusive.

Your figure 1 shows a strong deviation from randomness.
Still, also that is not conclusive in the other direction.

What is completely lacking is evidence (metrics) for:
- what should be the statistical behaviour of a meaningful text
- how much variation there would be in this

The analysis is based on the assumption that every word type in the Voynich MS should consistently represent the same word in some plain text. This is not at all certain.
If the Voynich MS text includes null characters, all word combination statistics are completely thrown off.

There was a thread here on voynich.ninja where there was a strong indication that the repeating sequence statistics in the MS are not that different from those of a known plain text.
(Of course that is also not conclusive, for the same reason: "not that different" is not sufficiently defined.)

My summary is:
- there are indications that the text could be meaningful
- there are indications that the text could be meaningless
Both are still inconclusive.
Some papers concentrate more on one or the other, but other papers do state both.