The Voynich Ninja

Full Version: A key to understand the VMS
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
(05-01-2017, 12:39 PM)Sam G Wrote: What do you make of the fact that words beginning in q are almost never found in labels? That suggests that words beginning in qok- and words beginning in qot- are in fact related in some way. But then the different ok:qok and ot:qot frequency ratios suggest an important difference.

@Sam G

This is a very important statement by Sam, as I believe most labels would be nouns. The fact that "q" rarely occurs in labels could therefore indicate a pronoun or adjective, and that would be why most labels don't use "q".
Hi Torsten, I just looked at your 2016 paper. What interests me are the tables showing the likelihood that the same or similar words occur nearby. You have sampled four different languages for the tables you give in the paper: Arabic, Latin, English, and German. You also assert that the patterns seen in the Voynich text are not language-like.

However, it seems to me that the four languages aren't alike enough to give a clear indication of what a natural language should look like in these tables. Although Latin and English are similar, Arabic and German have quite different patterns.

So, you've sampled four languages and found three distinct patterns, which is very shaky ground for insisting that the Voynich patterns are not language-like. Indeed, German is so flat in these likelihood tables that it seems the most un-language-like of the lot!
Quote:Have you produced any text that has this property?  I don't see why auto-copying should produce it.

I have written an app for doing so [see link]. The source code for this app is available via GitHub [see link].

The entropy values for the text generated with this app are comparable with entropy values for the VMS:
      Currier A   Currier B   App
H0    4.46        4.46        4.39
H1    3.82        3.88        3.81
H2    2.11        2.01        2.21
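The post does not say how H0, H1 and H2 were computed. As a minimal sketch, assuming the convention often used in Voynich statistics (H0 = log2 of the number of distinct symbols, H1 = single-symbol entropy, H2 = conditional entropy of a symbol given the one before it), the three values could be obtained like this. The function names are illustrative and are not taken from the app.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def char_entropies(symbols):
    """Return (H0, H1, H2) for a non-empty sequence of symbols.

    H0: log2 of the alphabet size.
    H1: entropy of single symbols.
    H2: conditional entropy H(next | previous), computed as
        H(pairs) - H(first symbol of each pair).
    """
    symbols = list(symbols)
    h0 = math.log2(len(set(symbols)))
    h1 = entropy(Counter(symbols))
    pairs = Counter(zip(symbols, symbols[1:]))
    firsts = Counter(symbols[:-1])
    h2 = entropy(pairs) - entropy(firsts)
    return h0, h1, h2


if __name__ == "__main__":
    # Toy input only; real values depend on the transcription alphabet
    # and on whether spaces are counted as symbols.
    print(char_entropies("daiin chol daiin chor qokeedy"))
```

Results for the manuscript will vary with the transcription and with the treatment of spaces, so figures from such a sketch are only comparable when both texts are processed the same way.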

I only say that it is possible to explain the VMS with the auto-copy hypothesis.

Quote:1) Asserting, without any evidence, that it cannot be a property of a natural language text

First of all, that it is a text using natural language is your hypothesis. Therefore it is on you to demonstrate evidence in favor of your hypothesis. Sorry, but this is the way science works. It is a common mistake to assume a starting hypothesis while trying to interpret an undeciphered script. The danger this way is that every characteristic of the script will be interpreted with this starting hypothesis in mind. For this reason you should search for evidence for your starting hypothesis.

Secondly, the weak word order of the VMS alone is evidence against a natural language [see link].

"Only a few repetitive phrases can be found. There are only 35 word sequences which use at least three words and appear at least three times. Only for five of these sequences is the word order unchanged for the whole manuscript, whereas for 30 out of 35 phrases the word order does change." [link].

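The quoted passage does not spell out how repeated phrases and changes in word order were counted. One possible reading, sketched here under that assumption, is to collect all three-word sequences that occur at least three times and then check whether any reordering of the same three words also occurs somewhere in the text. The function names and thresholds are illustrative only.

```python
from collections import Counter
from itertools import permutations


def ngrams(words, n):
    """All consecutive n-word sequences as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def repeated_phrases(words, n=3, min_count=3):
    """Map each n-word sequence seen at least min_count times to its count."""
    counts = Counter(ngrams(words, n))
    return {g: c for g, c in counts.items() if c >= min_count}


def order_changes(words, n=3, min_count=3):
    """For each repeated phrase, report whether a reordering also occurs."""
    seen = set(ngrams(words, n))
    result = {}
    for phrase in repeated_phrases(words, n, min_count):
        reordered = set(permutations(phrase)) - {phrase}
        result[phrase] = any(p in seen for p in reordered)
    return result


if __name__ == "__main__":
    sample = "a b c x a b c y a b c z c b a".split()
    print(order_changes(sample))
```

A phrase flagged `True` here corresponds to a sequence whose word order is not fixed; whether this matches the paper's own definition is an open assumption.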
Quote:2) Asserting, without any evidence, that auto-copying would produce that property

"In 66 out of 72 cases (91%) at least one similar word was found within a maximum distance of three lines and a maximum edit distance of three. Furthermore, if near is defined as both glyph groups must be used one after another or in two consecutive lines one above the other, the result is still interesting. In 62 out of 140 cases (44%) and in 25 out of 72 cases (35%) a similar glyph group can be found for both samples. In other words, similar glyph groups can be found above each other twice as often as they can be found side by side." [link].

"The words in the VMS build a network of words similar to each other. Therefore it is no surprise that a larger number of similar words exist for each word in the VMS. However, they occur near to each other and they occur with comparable frequencies. ... A better explanation is that since similar words do co-occur throughout the text, the spelling variations for a frequently used word also occur more often. In other words, for the VMS, the observed word frequencies are a result of the fact that similar words do co-occur throughout the text" [link].

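The "similar word within three lines and an edit distance of three" test quoted above can be sketched as follows. This is a minimal illustration, assuming plain Levenshtein distance as the similarity measure and a text stored as a list of lines, each a list of words. It also counts an identical word at another position as similar, which may or may not match the paper's definition. All names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def has_similar_nearby(lines, line_no, word_no, max_lines=3, max_edits=3):
    """True if another word within max_lines lines is within max_edits edits."""
    target = lines[line_no][word_no]
    first = max(0, line_no - max_lines)
    last = min(len(lines), line_no + max_lines + 1)
    for i in range(first, last):
        for j, word in enumerate(lines[i]):
            if (i, j) == (line_no, word_no):
                continue  # skip the word itself
            if edit_distance(target, word) <= max_edits:
                return True
    return False


if __name__ == "__main__":
    page = [["daiin", "chol"], ["qokeedy"], ["dain"]]
    print(has_similar_nearby(page, 0, 0))               # looks up to 3 lines away
    print(has_similar_nearby(page, 0, 0, max_lines=1))  # looks 1 line away only
```

Running the same check on a natural-language text with identical parameters would give the comparison figure the argument needs.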
Moreover, it is possible to order the types by their similarities to build a multidimensional grid containing all word types which occur at least four times [link]. See the network graph for the VMS compared to that of the Arabic [link]:

[attachment=1117] [attachment=1118]

An enlarged part of the graph for the Quran reveals that multiple smaller networks of two, three, or up to 20 words are typical for the Arabic language:

[attachment=1119]
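A rough way to reproduce the contrast between one large network and many small ones is to link word types that differ by a single edit and then list the connected components. The sketch below assumes that "similar" means an edit distance of exactly one and keeps only types occurring at least four times. The similarity measure actually used for the attached graphs is not stated here, so treat both choices as assumptions.

```python
from collections import Counter


def within_one_edit(a, b):
    """True if distinct strings a and b differ by one insertion, deletion or substitution."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]          # one extra character in b at position i


def similarity_networks(words, min_count=4):
    """Connected components of word types linked by a single edit, largest first."""
    counts = Counter(words)
    types = sorted(t for t, c in counts.items() if c >= min_count)
    neighbours = {t: set() for t in types}
    for i, a in enumerate(types):
        for b in types[i + 1:]:
            if within_one_edit(a, b):
                neighbours[a].add(b)
                neighbours[b].add(a)
    seen, components = set(), []
    for t in types:
        if t in seen:
            continue
        stack, component = [t], set()
        while stack:
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(neighbours[u] - component)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)
```

If the claim in the post holds, a list of component sizes for the VMS would be dominated by one very large component, while a text such as the Quran would yield many components of size two to about 20. The pairwise comparison is quadratic in the number of types, which is acceptable for a vocabulary of a few thousand types.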
(04-02-2017, 09:25 PM)Torsten Wrote:
Quote:1) Asserting, without any evidence, that it cannot be a property of a natural language text

First of all, that it is a text using natural language is your hypothesis. Therefore it is on you to demonstrate evidence in favor of your hypothesis. Sorry, but this is the way science works. It is a common mistake to assume a starting hypothesis while trying to interpret an undeciphered script. The danger this way is that every characteristic of the script will be interpreted with this starting hypothesis in mind. For this reason you should search for evidence for your starting hypothesis.

Physician heal thyself!
Hello Emma,

Quote:So, you've sampled four languages and found three distinct patterns, which is very shaky ground for insisting that the Voynich patterns are not language-like. Indeed, German is so flat in these likelihood tables that it seems the most un-language-like of the lot!

I have compared the co-occurrence pattern found for the VMS with some texts in natural languages. Therefore you can't use this test to determine whether some of these texts are language-like or not. In fact, the Latin text was a poem and the Arabic text also uses verses. Therefore the data demonstrate that the German text was just an ordinary text and not a poem.
Well, great, so your assertion that the pattern is not language like is based on nothing?
Hello Emma,

Quote:Physician heal thyself!

My starting hypothesis is that the VMS was created by a human mind. Besides that, I don't care whether the text has meaning or not, or whether it contains language or a cipher or a [link].

In fact I have used the method I learned for the [link]. Therefore in some way I had the Phaistos disc in mind while analyzing the VMS. The Phaistos disc contains only 241 signs, 45 of them distinct. But even this short text contains repeated phrases and multiple corrections.

By the way, for the Phaistos disc I came to the conclusion that the disc contains a text in human language and that this language is comparable to [link]. The word length and the number of different signs on the Phaistos disc point to a syllabic script. My first test for the VMS was to check the word length and the number of different glyphs. The number of 20-30 different glyphs points to an alphabetic script, whereas the average word length of 5.5 points to a syllabic script. Moreover, the word lengths are symmetrically distributed around the arithmetic mean [see link]. Such a distribution is unusual because natural languages tend to make frequent use of short words. An asymmetric distribution is therefore typical for a natural language [see link, p. 2].
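The word-length argument can be checked with a simple histogram plus a symmetry measure. The sketch below uses the moment coefficient of skewness as that measure. This is a choice made here for illustration and is not something the post or the cited paper prescribes. A value near zero indicates a roughly symmetric distribution, while a long tail of longer words gives a positive value.

```python
from collections import Counter


def length_distribution(words):
    """Map word length -> number of words of that length, sorted by length."""
    return dict(sorted(Counter(len(w) for w in words).items()))


def skewness(values):
    """Moment coefficient of skewness (0.0 for a perfectly symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 else 0.0


if __name__ == "__main__":
    words = "the quick brown fox jumps over the lazy dog".split()
    print(length_distribution(words))
    print(skewness([len(w) for w in words]))
```

Whether lengths are counted over word tokens or word types changes the picture considerably, so the same choice has to be made for the VMS and for any comparison text.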
Hello Emma,

Quote:Well, great, so your assertion that the pattern is not language like is based on nothing?

With "Therefore you can't use this test to determine whether some of these texts are language-like or not" I mean that there is a difference between searching the VMS for a feature typical of languages and searching languages for a feature typical of the VMS. My test shows that the compared texts in Latin, English, German and Arabic are not like the text of the VMS. In this way I have compared a feature typical of the VMS. In my eyes this is more than nothing.

At least it is more than Montemurro and Zanette did. They used features typical for language: "We first apply methods from information theory that identify content-bearing words without any prior knowledge of the underlying language of the text under analysis. Then, we consider putative semantic relationships between the most informative words by analysing their patterns of co-occurrence along the text" [link]. This way they started with the assumption that the VMS contains language. While analyzing the "100 most frequently used words within the VMS" Montemurro and Zanette also came to the conclusion that "Words that are related by their semantic contents tend to co-occur along the text." [link]. But even for their 100 word types they found something like a network for the VMS:

 [Image: journal.pone.0066344.g002]

The shift from Currier A to Currier B can also be found in their paper:
[Image: journal.pone.0066344.g004]
Since they did not distinguish between herbal pages in Currier A and B there is an additional link from the astronomical section to the herbal section.

Since the goal of Montemurro and Zanette was to demonstrate that the VMS is language-like, they interpreted their result as confirmation of their starting hypothesis. They didn't check whether the co-occurrence pattern also holds for all words in the VMS, and they didn't check whether it is common for a language that more than the most frequently used words are related by their semantic contents: "The strongest links between the different sections as determined by the co-occurrence of the most informative words." [Montemurro 2013]. I would expect that the most frequently used words in a language are words like "and" or "the" and that these words are equally distributed within a text. In my eyes it is their starting hypothesis that prevents them from seeing that their data demonstrate that the VMS is different from language.
(04-02-2017, 10:15 PM)Torsten Wrote: Hello Emma,

Quote:Well, great, so your assertion that the pattern is not language like is based on nothing?

With "Therefore you can't use this test to determine whether some of these texts are language-like or not" I mean that there is a difference between searching the VMS for a feature typical of languages and searching languages for a feature typical of the VMS. My test shows that the compared texts in Latin, English, German and Arabic are not like the text of the VMS. In this way I have compared a feature typical of the VMS. In my eyes this is more than nothing.

Your tests also showed that Arabic is not like German, German not like Latin, and Latin not like Arabic. Yet all of those are natural languages. You have not shown that there is a language-like co-occurrence pattern which all natural languages fit and the Voynich text does not. Until you do this nobody can possibly judge whether such patterns are language-like or not.

All you can possibly say, at the very most, is that the Voynich manuscript is not written in Arabic, English, Latin, or German. At most.
Quote:All you can possibly say, at the very most, is that the Voynich manuscript is not written in Arabic, English, Latin, or German.

I used texts from different text classes, such as a poem and verses. If you know a text class which is more repetitive than a religious text, then you can indeed prove me wrong.