The Voynich Ninja

Full Version: A possible generating algorithm of the Voynich manuscript
(05-06-2019, 10:34 AM)ReneZ Wrote: For me this includes:
- some explanation what caused the word patterns (even if it is just a better pattern scheme than we have now).
- some explanation about the differences between the A and B languages

For what it's worth, with a process that is simple to program I made good-looking fake Voynichese just by picking glyph trigrams randomly from a transcription whenever the first two glyphs of the trigram matched the last two glyphs of the line being constructed, with special characters for <start of line> and <end of line>. By construction the generated text kept most of the word patterns intact and had the correct frequencies at start of line, start of word, across word breaks, end of word and end of line. Of course the middle of words sometimes looked odd, and I don't expect that a medieval scribe would use a list of trigrams with associated probabilities; there has to be a simpler explanation.
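The trigram process described above might be sketched roughly as follows. This is only an illustration under assumptions: the marker characters `^` and `$`, the double start marker, the function names and the toy input lines are all hypothetical, and a real run would read lines from an actual transcription file.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # assumed markers; they must not occur in the transcription


def build_trigram_table(lines):
    """Map every glyph pair to the list of glyphs that follow it in the corpus."""
    table = defaultdict(list)
    for line in lines:
        padded = START + START + line + END
        for i in range(len(padded) - 2):
            table[padded[i:i + 2]].append(padded[i + 2])
    return table


def generate_line(table, rng, max_len=200):
    """Extend a line glyph by glyph until the end marker is drawn."""
    out = START + START
    while len(out) < max_len:
        # Choosing from the list of followers is the same as picking a random
        # corpus trigram whose first two glyphs match the last two of the line.
        nxt = rng.choice(table[out[-2:]])
        if nxt == END:
            break
        out += nxt
    return out[2:]  # strip the two start markers


# Toy input only, not real transcription lines.
toy_lines = ["qokeedy daiin shol", "daiin chol shol", "okeey daiin"]
table = build_trigram_table(toy_lines)
print(generate_line(table, random.Random(0)))
```

Because followers are stored with repetition, frequent trigrams are drawn proportionally more often, which is why line-initial and line-final frequencies carry over by construction.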
The repeating words or characters can be explained in many different ways, as one can see when one takes a text other than that of the VMS.

a) One could take a similar manuscript text and compare those results with any results one has gathered on the VMS.
Of course you should apply the same logic and methods to both and compare the results.

Then when you are happy at one point, look at the total counts (total letters, bigrams etc.).
You will see that there is a big difference between the counts of unique words/bigrams/trigrams/f.a. etc. The only way to solve that is to remove letters until the count of uniques diminishes. However, doing so to a sufficient degree will finally lead to a text where almost every letter has been removed. The text that is left over has no sensible meaning, or cannot be deciphered into a meaningful text (yet).

b) A possible explanation of such behavior of the VMS text, as we see it, is the "random" insertion of characters.
Since that is the exact opposite of the method proposed in a), that also has no significant intrinsic meaning for the text as a whole.
This is what is under discussion here; it uses an approach that can only be validated by proving the opposite, as written in a).
From the perspective of a "codebreaker" (one that has knowledge of decipherment) there is little difference between the text in b) and the text in a), because we can see, count and compare the text specifics at any point with any language.

An example for those faint of heart: 
a) removal method: the lazy fox jumps -> remove vowels -> th lz fx jmps -> remove low freq. ltrs -> thlfmps
b) insertion method: thlfmps -> tah alaf amaps
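
The two directions of that example could be sketched like this. The removal steps reproduce the example exactly; the insertion step is only a guess at what "random" insertion might mean, and the filler set, probability and function names are assumptions made for illustration.

```python
import random

VOWELS = set("aeiouy")  # "y" is treated as a vowel, as in lazy -> lz
LOW_FREQ = set("zxj")   # assumed set of low-frequency letters


def remove_vowels(text):
    return "".join(ch for ch in text if ch not in VOWELS)


def remove_low_freq(text):
    # Spaces are dropped as well, matching the example.
    return "".join(ch for ch in text if ch not in LOW_FREQ and ch != " ")


def insert_random(skeleton, rng, fillers="a ", p=0.5):
    """After each letter, insert a random filler with probability p (hypothetical rule)."""
    out = []
    for ch in skeleton:
        out.append(ch)
        if rng.random() < p:
            out.append(rng.choice(fillers))
    return "".join(out)


step1 = remove_vowels("the lazy fox jumps")  # "th lz fx jmps"
step2 = remove_low_freq(step1)               # "thlfmps"
print(step1, "->", step2, "->", insert_random(step2, random.Random(0)))
```

Stripping the fillers from the inserted text gives back the skeleton, which is the sense in which b) is the inverse of a).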
(05-06-2019, 04:56 AM)ReneZ Wrote: I did not want ...

Please note that we use the algorithm to create a "facsimile" of the VMS "Recipes" section. We keep the algorithm as simple as possible to demonstrate that even with a simple implementation it is possible to reproduce the intriguing key properties of the original text, including the presence of long-range correlations, the "binomial-like" word length distribution, and both of Zipf’s laws ([link], p. 2).

You say that modifications are not at all arbitrary. Indeed, we describe rules such as "Replace one or more glyphs by similar ones", "Add or remove a prefix" and "Combine two source words to create a new word" to modify tokens ([link], p. 10).
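
As a rough illustration of the three quoted rules, and not the implementation from the paper, one could write something like the sketch below. The similarity pairs are the ones named in this thread (<y>/<o> and <d>/<s>); the prefix list, the function names and the uniform choice between rules are assumptions.

```python
import random

SIMILAR = {"y": "o", "o": "y", "d": "s", "s": "d"}  # pairs mentioned in the thread
PREFIXES = ["qo", "o", "y"]                         # hypothetical prefix list


def replace_similar(word, rng):
    """Replace one glyph by a similar one; 's' inside <sh> is left alone."""
    idx = [i for i, g in enumerate(word)
           if g in SIMILAR and word[i + 1:i + 2] != "h"]
    if not idx:
        return word
    i = rng.choice(idx)
    return word[:i] + SIMILAR[word[i]] + word[i + 1:]


def toggle_prefix(word, rng):
    """Remove a known prefix if present, otherwise add a random one."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) > len(p):
            return word[len(p):]
    return rng.choice(PREFIXES) + word


def combine(first, second):
    """Combine two source words into a new word."""
    return first + second


def modify(sources, rng):
    """Pick a source word and apply one of the three rules at random."""
    rule = rng.choice(["replace", "prefix", "combine"])
    word = rng.choice(sources)
    if rule == "replace":
        return replace_similar(word, rng)
    if rule == "prefix":
        return toggle_prefix(word, rng)
    return combine(word, rng.choice(sources))


print(modify(["daiin", "qokeedy", "chol"], random.Random(0)))
```

In a self-citation setting the `sources` list would hold words already written nearby on the page, so each new token stays close to its neighbours.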

We also explain the reason for the word patterns ([link], p. 10f). On page 10 two reasons are explained. One reason is that a glyph in the VMS is used because of its shape. "The shape of a glyph must be compatible with the shape of the previous one and is also influenced by its position within a word or a line" ([link], p. 10).

You also give some examples of repeated elements that are easy to spot. These examples confirm local repetition and that glyphs are used because of their shape. They are in fact evidence in favor of the self-citation hypothesis.

Every time the scribe was starting a new (empty) page it was necessary to initialize the self-citation method by generating an initial line. To generate the first token it would be possible to use another page as source. Moreover, a source token from a different context was also used for the initial token of a paragraph (see [link], p. 16). The scribe was also using additional gallow glyphs in order to generate unique tokens within the first line of a paragraph ([link], p. 26ff).

This is confirmed by two patterns:
- In 86 % of cases the start of a paragraph is highlighted by a gallow glyph. This confirms that a glyph is used because of its shape (see chapter 8 "The line as a functional entity" in [link], p. 18).
- That the paragraph-initial characters become repeated within the first line confirms local repetition and the rule that it is possible to replace a glyph by similar ones (see chapter 10 "The paragraph as a functional entity" in [link], p. 26ff).
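
A share like the one in the first point could be checked on any transcription with a few lines of code. This is a minimal sketch assuming EVA transliteration, where the gallows are <t>, <k>, <p> and <f>, and assuming the paragraphs are already available as strings; the function name and the toy input are hypothetical, and the toy input does not reproduce the 86 % figure.

```python
GALLOWS = set("tkpf")  # EVA gallows glyphs


def gallows_initial_share(paragraphs):
    """Fraction of non-empty paragraphs whose first glyph is a gallows glyph."""
    starts = [p.lstrip()[0] for p in paragraphs if p.strip()]
    if not starts:
        return 0.0
    return sum(g in GALLOWS for g in starts) / len(starts)


toy_paragraphs = ["tsheoarom daiin", "pcheoldom ol", "daiin shol", "kchol"]
print(gallows_initial_share(toy_paragraphs))
```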

"The first and the last word in each line are easy to spot, the most obvious way is to pick them as a source for the generation of a group at the beginning or at the end of a line" ([link], p. 19). It is therefore expected for the self-citation method that some line-initial and line-final repetition occurs.

- The first group in a line usually has a prefix like <y> and <o> or <d> and <s>. Since <y> is similar to <o> and <d> is similar to <s> this is another example of local repetition for similar glyphs (see chapter 8 "The line as a functional entity" in [link], p. 19ff).
- On pages like f1r/f1v, f2r/f2v, f4r/f4v, ... Eva-m is rarely used; on pages like f3r/f3v and f6r/f6v Eva-m is used word-finally, and only later does Eva-m become line-final (see [link]). On page [link] even paragraph-initial words like <tsheoarom> and <pcheoldom> contain Eva-m. This observation confirms local repetition and also that a glyph is used because of its shape (see chapter 8 "The line as a functional entity" in [link], p. 19).

You argue that not much arbitrariness is left. In fact we demonstrate that tokens depend on their position relative to similar tokens (see [link], p. 2ff). But words in a 'code' vocabulary, words in a synthetic language, or characters in a script do not depend on each other. They depend on something that shouldn't be visible. They should depend on the meaning they are carrying! The high regularity of the VMS text is therefore our key argument for a meaningless text ([link], p. 17).

For the gradual evolution of a single system from Currier A to Currier B see [link] (p. 7) and also [link] (p. 25). It is expected for the self-citation method that spelling variants invented while writing do not appear on pages already completed. This leads to the conclusion that the pages using Currier B were written after the pages in Currier A ([link], p. 25).
It does not have to be so complicated. A possible generation can also be achieved by starting with one of these letters [o c a y] ([link]) and then adding between 1 and 7 random other letters/groups. I can write a paper on that and defend it easily. But I see no point in that.
What is more important is finding a possible simple medieval method or system which is plausible as a whole.
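
Just to show how little such a generator needs, here is a minimal sketch. The starter letters come from the post; the pool of "other" letters, the function name and the uniform random choice are assumptions, and nothing here is claimed to be a plausible medieval method.

```python
import random

STARTERS = "ocay"     # the four starting letters named in the post
OTHER = "dehiklnrst"  # assumed pool of "other" EVA letters


def random_word(rng):
    """One starter letter followed by 1 to 7 randomly chosen other letters."""
    n = rng.randint(1, 7)
    return rng.choice(STARTERS) + "".join(rng.choice(OTHER) for _ in range(n))


rng = random.Random(0)
print(" ".join(random_word(rng) for _ in range(8)))
```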
Torsten, your analysis is a powerful argument against the hypothesis of a verbal language, but defending the hoax hypothesis is as absurd as defending the verbal language. As René says, there is more planning and system in the VM, with rules which are not identified.
  The problem is to think that there is a text in the VM. What there is is a code of symbols in which each glyph represents a real referent. It is an astronomical system and your theory fits well with it because it describes the movement of symbol strings. The celestial objects move constantly and change position and that is what describes your method of similar tokens in close vicinity. For me, your theory is a good representation, step by step, of the movement in the sky. The Curve-Line system helps to understand it. The © symbol is simply the Moon.
  Of course, the VM is meaningful but it is necessary to understand its inherent philosophy.
The current situation with two threads discussing this paper in parallel is confusing. Please refer to the discussion thread for further.. discussion:
[link]