The Voynich Ninja
Voynich text generator - Printable Version


RE: Voynich text generator - Emma May Smith - 01-03-2016

(01-03-2016, 10:39 PM) Torsten Wrote:
Quote:Well, there are still plenty of instances of <o> in the Bio section, just not <od>, so this explanation does not really make sense to me.


We can't know for sure what his motivation was. Maybe there was no motivation and it's just a switch.
...
Then there is no explanation at all, is there? If we are happy that the writer just switched as and when he wanted, we can be equally happy saying that he just wrote out the whole manuscript according to his whims. There's not even any need to say he copied and altered words from recent lines, is there? You can't explain why words are structured how they are, or why certain character combinations appear and change throughout the text, or really anything.

The "Whimsical Author" hypothesis: he wrote the Voynich Manuscript because he wanted to, invented a new script because he felt like it, structured the words just because he could, and altered them without any motivation. It explains nothing, but at least we've finally solved the mystery!


RE: Voynich text generator - Torsten - 01-03-2016

Quote:Then there is no explanation at all, is there?

It's clear that we can only guess about his motivation. We can't ask the scribe about his motivation for switching from words similar to 'chol' to words similar to 'chedy'.

Anyway, the motivation of the scribe only becomes important if you accept that the scribe was generating the text as described by the Auto Copying Hypothesis.

Quote:You can't explain why words are structured how they are, or why certain character combinations appear and change throughout the text, or really anything.

In the VMS, similarly spelled word types co-occur (see [link], p. 3-5, or the paper of Montemurro and Zanette [link]). They occur with similar frequency, whereas words that contain less frequent bigrams also occur less frequently (see the grid in [link], p. 66-82). For me the conclusion is inescapable: the words are structured as they are because they are copied from each other.


RE: Voynich text generator - Sam G - 02-03-2016

(01-03-2016, 10:39 PM) Torsten Wrote: We can't know for sure what his motivation was. Maybe there was no motivation and it's just a switch.

The question is whether or not your text copying method can account for the properties of the text without requiring the scribe to have been individually thinking about each property.  It doesn't seem like it can.

I think if you made a list of every property that the scribe would have had to have been specifically thinking about, it would be a very long list of very peculiar rules.


RE: Voynich text generator - Torsten - 02-03-2016

Quote:The question is whether or not your text copying method can account for the properties of the text without requiring the scribe to have been individually thinking about each property.


First, it is impossible to write without thinking. Second, it is obviously easier to copy words than to invent new words.

Quote:I think if you made a list of every property that the scribe would have had to have been specifically thinking about, it would be a very long list of very peculiar rules.


The main copying rules are rather simple.

The first rule is to replace a ligature or glyph with a similar one. I will demonstrate this using 'chedy' as an example. The word 'chedy' consists of a ligature 'ch', a glyph 'e' and a ligature 'dy'. Each element can be modified: changing 'ch' into 'sh' generates a new word 'shedy', changing 'e' into 'ee' generates 'cheedy', and changing 'dy' into 'd' generates 'ched'.

The second modifying rule is to add a prefix like 'l', 'o', 'd', 'ch' or 'q'. Some special rules apply when a prefix is added in front of another prefix: if, for instance, a prefix 'l' or 'o' is added in front of the glyphs 'ch' or 'd', these glyphs change into a gallows glyph in seven out of ten cases. This way, a prefix 'o' added to a source word 'chedy' would result in words like 'okedy', 'otedy' or 'ochedy'.

The third rule is to combine two short words for generating a new word. For instance 'chedy' could be combined with 'ol' to generate a word 'olchedy' or 'olkedy'. For short glyph sequences it is also possible to combine the sequence with itself. For instance it is possible to combine two 'ol' words in order to generate a self-similar word 'olol'. 

It is also possible to combine parts of two or even three other words. Even if a word is rare, this doesn't mean that it is a new invention. See for instance "tchodaiin" (3 times) and "daiiithy" (1 time) on page f51r: [link]
A possible explanation for "tchodaiin" is that "tcho" and "daiin" are connected to build a new word. Surprisingly, there is a "(y)kcho(l)" and a word "daiin" above "tchodaiin" (see [link]).
For "daiiithy" it seems that this word is a mix of "daiin" and "cthy". Again it is possible to explain this word from its context: there is a "(o)aiin" and a "ckh(ee)y" within the lines above "daiiithy" (see [link]).
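
The three main copying rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the generator itself: the function names are hypothetical, the GALLOWS tuple is limited to 'k' and 't' (the two gallows that appear in the examples), and "seven out of ten cases" is read here as a probability of 0.7.

```python
import random

GALLOWS = ("k", "t")  # assumption: only the gallows shown in the examples


def substitute(word, old, new):
    """Rule 1: replace one element with a similar one (first occurrence only)."""
    return word.replace(old, new, 1)


def add_prefix(word, prefix, rng, p_gallows=0.7):
    """Rule 2: add a prefix. After 'l' or 'o', a leading 'ch' or 'd' is
    replaced by a gallows glyph with probability p_gallows (assumed 0.7)."""
    if prefix in ("l", "o"):
        for head in ("ch", "d"):
            if word.startswith(head) and rng.random() < p_gallows:
                return prefix + rng.choice(GALLOWS) + word[len(head):]
    return prefix + word


def combine(first, second):
    """Rule 3: join two short words into a new one."""
    return first + second


print(substitute("chedy", "ch", "sh"))  # shedy
print(substitute("chedy", "e", "ee"))   # cheedy
print(substitute("chedy", "dy", "d"))   # ched
print(combine("ol", "chedy"))           # olchedy
print(combine("ol", "ol"))              # olol
```

A call such as add_prefix("chedy", "o", random.Random()) returns 'ochedy', 'okedy' or 'otedy'. Variants like 'olkedy' and the combination of word parts (as in "tchodaiin") are not modeled in this sketch.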


RE: Voynich text generator - -Job- - 02-03-2016

In my opinion the copying hypothesis is very plausible, even if it's difficult to prove conclusively.

I can relate to the approach. It's effective and accessible, it does not require planning, it's organic.

It accounts for some unique features of the text, such as word variability and absence of repeated sequences. It's not incompatible with Zipf's law.

I expect that, over time, this type of text-generation process would converge into a set of rules (e.g. common substitutions), carried out mechanically, resulting in the peculiar combination of structured yet irregular text that we see in the VM.

Unfortunately, it's not a particularly testable theory because it is compatible with many different texts. The key question is, what verifiable features does it predict that we don't yet know about?


RE: Voynich text generator - Torsten - 04-03-2016

Quote:The key question is, what verifiable features does it predict that we don't yet know about?

For me this key feature was that rare words co-occur with similar ones. You can check this yourself. Choose a rule for selecting some low-frequency types and check whether these words co-occur with similar ones.

For instance, glyphs other than 'i' and 'e' are rarely doubled. The bigram 'll' occurs 28 times and the bigram 'dd' occurs 23 times (see [link]). If you check these words you will probably find patterns like the three 'dd' words on page [link] (see [link]).

Even for very rare words it is easy to find this type of pattern. The bigram 'an' occurs 118 times and the bigram 'on' only five times. But this doesn't mean that 'on'-words must be errors:
[link]
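
One way to run this kind of check is sketched below. It assumes an EVA-style transcription in which every character is treated as one glyph, and it uses a deliberately narrow notion of "similar": the same word with the doubling removed. The function names are hypothetical; the sample is the f11v word sequence quoted elsewhere in this thread.

```python
import re
from collections import Counter

# Any character except 'i' or 'e', immediately repeated (e.g. 'dd', 'll').
DOUBLED = re.compile(r"([^ie])\1")


def doubled_bigrams(words):
    """Count doubled-glyph bigrams over all word tokens."""
    counts = Counter()
    for word in words:
        for match in DOUBLED.finditer(word):
            counts[match.group(0)] += 1
    return counts


def undoubled_neighbours(words):
    """For each word type with a doubled glyph, report whether the same
    word without the doubling also occurs in the given word list."""
    present = set(words)
    return {
        word: DOUBLED.sub(r"\1", word) in present
        for word in present
        if DOUBLED.search(word)
    }


# f11v as quoted in this thread: 'dchy' - 'dy dy' - 'dchy' - 'dy ddy'
f11v_sample = ["dchy", "dy", "dy", "dchy", "dy", "ddy"]
print(doubled_bigrams(f11v_sample))       # Counter({'dd': 1})
print(undoubled_neighbours(f11v_sample))  # {'ddy': True}
```

Running the same two functions page by page over a full transcription would show on how many pages a doubled-glyph word sits next to its undoubled counterpart.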


It is also possible to search, for each page, for the word with the highest number of similar words. In this case you would frequently find pairs like 'chol' & 'chor' or 'chedy' & 'chedy':
    f1r    : chol
    [link] : chol

    f2r    : chy
    [link] : chor

    [link] : chol
    [link] : chor

    [link] : chol
    [link] : sho

    ...

    f82r   : chedy
    [link] : chedy

    [link] : chedy
    [link] : shedy

    ...

    [link] : aiin
    [link] : oaiin

    [link] : chey
    f115v  : chedy
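
A per-page search of this kind could look roughly like the sketch below. It is an assumption that "similar" means an edit distance of at most one character; the actual similarity measure behind the list above is not stated here. The function names and the sample page are made up for illustration.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def most_connected_word(words, max_dist=1):
    """Return the word type with the most other tokens on the page that are
    within max_dist edits (repeats of the word itself count as similar)."""
    def score(candidate):
        near = sum(1 for w in words if edit_distance(candidate, w) <= max_dist)
        return near - 1  # do not count the candidate token itself

    # dict.fromkeys keeps first-appearance order, so ties go to the earlier word.
    return max(dict.fromkeys(words), key=score)


# Made-up page, not taken from a transcription.
toy_page = ["chol", "chor", "shol", "daiin", "chol", "cthy"]
print(most_connected_word(toy_page))  # chol
```

Applying most_connected_word to each page of a transcription in turn would produce a page-by-page list in the same format as the one above.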


Or you can check similarly spelled word types. Types which contain less frequent glyphs or bigrams in most cases occur less frequently:

 bigram pair                 most common words with this glyph sequence
 cho : cha = 2552 : 468      'chol' (396 times) : 'char' (72 times)
 sho : sha =  949 : 127      'shol' (186 times) : 'shar' (34 times)

 qo : qa   = 5186 : 8        ...
 ok : ak   = 5950 : 36
 ot : at   = 3767 : 11
 op : ap   =  560 : 6
 of : af   =  154 : 3

 ar : or   = 3151 : 2723
 al : ol   = 3002 : 5507
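
To make the comparison explicit, the ratios can be computed directly from the counts quoted above. The sketch below also includes a small helper for counting a glyph sequence in a word list; both function names are hypothetical, and the helper assumes a transcription given as a list of word tokens.

```python
def sequence_count(words, seq):
    """Count non-overlapping occurrences of a glyph sequence over all tokens."""
    return sum(word.count(seq) for word in words)


def ratio(common, rare):
    """How many times more frequent the common item is than the rare one."""
    return common / rare


# Counts as quoted in the table above.
pairs = {
    ("cho", "cha"): (2552, 468),
    ("chol", "char"): (396, 72),
    ("sho", "sha"): (949, 127),
    ("shol", "shar"): (186, 34),
}
for (common, rare), (n_common, n_rare) in pairs.items():
    print(f"{common}:{rare} = {ratio(n_common, n_rare):.2f}")
```

With these numbers, 'chol':'char' (5.50) is close to 'cho':'cha' (about 5.45), while 'shol':'shar' (about 5.47) is somewhat lower than 'sho':'sha' (about 7.47).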


RE: Voynich text generator - -Job- - 04-03-2016

(04-03-2016, 02:05 AM) Torsten Wrote:
Quote:The key question is, what verifiable features does it predict that we don't yet know about?

For me this key feature was that rare words co-occur with similar ones. You can check this yourself. Choose a rule for selecting some low-frequency types and check whether these words co-occur with similar ones.

For instance, glyphs other than 'i' and 'e' are rarely doubled. The bigram 'll' occurs 28 times and the bigram 'dd' occurs 23 times (see [link]). If you check these words you will probably find patterns like the three 'dd' words on page [link] (see [link]).

I half-agree. The occurrences of 'dd' and 'll' are suspicious in [link] and [link] respectively, but in other cases it's subject to interpretation.

The occurrence of non-terminal 'n' could be taken as another example:
[link]

In the text produced by your generator, what's the ratio between vocabulary size and text size?

The VM has a ratio of 0.21. The A folios have a ratio of 0.3 and the B folios have a ratio of 0.21, so there is some internal variation.

The following are the ratios for sample texts in known languages:
Pliny's Natural History (Latin): 0.27
Bible (Latin): 0.22
Bible (Hebrew): 0.22
Dante (Italian): 0.22
Moby Dick (English): 0.11
Short Stories (Pinyin): 0.21

I expect that you should be able to match a ratio of 0.21 by constraining the copy operation. The question is, how much?
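
For reference, the ratio used here is simply the number of distinct word types divided by the number of word tokens. A minimal sketch follows; the function names and the sample sequence are made up. Because this ratio generally falls as a text gets longer, a helper for equal-sized samples is included, on the assumption that texts should be compared at similar lengths.

```python
def vocabulary_ratio(tokens):
    """Vocabulary size (distinct word types) divided by text size (tokens)."""
    if not tokens:
        raise ValueError("empty text")
    return len(set(tokens)) / len(tokens)


def ratio_for_prefix(tokens, size):
    """Ratio over the first `size` tokens, so that texts of different
    length can be compared on equal-sized samples."""
    return vocabulary_ratio(tokens[:size])


# Made-up token sequence, only to show the calculation.
sample = "daiin chol daiin chedy chol daiin".split()
print(vocabulary_ratio(sample))  # 3 types / 6 tokens = 0.5
```

The same function applied to the generator's output and to a transcription of the VM, cut to the same number of tokens, would give directly comparable numbers.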


RE: Voynich text generator - Torsten - 05-03-2016

Quote:The occurrences of 'dd' and 'll' are suspicious in [link] and [link] respectively, but in other cases it's subject to interpretation.


I have checked all the 'dd' words. In my eyes more pages look suspicious.

f11v: 'dchy' - 'dy dy' - 'dchy' - 'dy ddy'
[link]

f19v: '8om' - 'ddor' - 'dar' - 'or'
[link]

f48v: 'tchdy chdy' - 'otchody ytchdy' - '8chedy tchddy'
[link]

f102r1:
[link]

f115v:
[link]

Quote:In the text produced by your generator, what's the ratio between vocabulary size and text size?


With the generator algorithm it is possible to generate many different words and texts. If I add more rules so that only Voynich-like words are generated, I would also limit the vocabulary. The question is therefore whether I want to generate a text with similar statistics or a text with similar words.


RE: Voynich text generator - ReneZ - 05-03-2016

Note especially on [link] a 'dd' in the left margin.
Often overlooked since it is not in most (or all) transcription files.

It's on the same line that has the 'dd' sequence in the main text.


RE: Voynich text generator - juergenw - 05-03-2016

There are more 'dd' in the rosette folio, which unfortunately isn't in the voynichese tool. Although I am not so sure whether one of them is actually a word or a separate display of 'd' and 'd'.
I attach a screenshot.