The Voynich Ninja
Sequential word repetitions in the VMS - Printable Version



RE: Sequential word repetitions in the VMS - -Job- - 01-02-2016

(20-08-2015, 08:28 AM)david Wrote:
Quote:Is not the repetition issue too exaggerated?

I think the issue is not direct repetition, but rather many similar words repeated, as in Timms pairs and [ah-hem] Jackson sequences, which are not features of written languages.

I remember a while ago stumbling across a translation of an Egyptian religious text and it was incredibly repetitious.

In general, religious texts do seem to be more redundant (I'm sure there is a reason for this), e.g.:

Quote:God said, “Let there be light,” and there was light. 4 God saw the light, and saw that it was good. God divided the light from the darkness. 5 God called the light “day”, and the darkness he called “night”

It's hard to quantify, but in some sections the Voynich appears to exceed even that.

Some of the redundancy is simply a result of Stolfi's crust-mantle-core structure - lots of common word prefixes/suffixes.

I have a plot which highlights words according to the number of bits of information they carry (given by the negative log_2 of each word's probability):
[link]


The yellow areas contain the least information - I would say that's where the VM's reputation for repetitiousness comes from.
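For readers who want to reproduce the idea, here is a minimal sketch of the per-word information measure described above. It assumes a word's probability is simply its relative frequency in the token list; the function name and the toy token sequence are made up for illustration and are not taken from the plot or from a transcription.

```python
import math
from collections import Counter

def word_information(tokens):
    """Map each word to -log2(p), with p = relative frequency in tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: -math.log2(c / total) for w, c in counts.items()}

# Toy sequence (invented order, not a transcription excerpt).
tokens = "daiin chedy daiin qokeedy daiin chedy shedy daiin".split()
info = word_information(tokens)

# Frequent words carry few bits, rare words carry more.
for word, bits in sorted(info.items(), key=lambda kv: kv[1]):
    print(f"{word:10s} {bits:.2f} bits")
```

Low-bit words in a plot like this would correspond to the "yellow" regions: stretches dominated by very common words.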


RE: Sequential word repetitions in the VMS - davidjackson - 03-02-2016

Interesting feature of your site Job


RE: Sequential word repetitions in the VMS - Torsten - 04-02-2016

I have recently published a paper about this subject: "Co-Occurrence Patterns in the Voynich Manuscript"
Available via [link]

Two important differences were found between the VMS and natural language. First, the same and similar words are used near each other in the VMS text, whereas natural languages tend to avoid using the same or similar words in a row. Second, the level of context dependency in the VMS text is more comprehensive, since not only the same words but also similar words are used near each other.
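As a rough illustration of what "same or similar words near each other" could mean in practice, the sketch below counts how often a token has a near-identical token among its immediate predecessors. This is not the method of the paper: the edit-distance threshold, the window size, the function names and the toy token lists are all assumptions chosen for the example.

```python
def levenshtein(a, b):
    """Classic edit distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def near_repeat_rate(tokens, window=3, max_dist=1):
    """Fraction of tokens that have a same-or-similar token in the previous `window` tokens."""
    if not tokens:
        return 0.0
    hits = 0
    for i, tok in enumerate(tokens):
        recent = tokens[max(0, i - window):i]
        if any(levenshtein(tok, r) <= max_dist for r in recent):
            hits += 1
    return hits / len(tokens)

# Toy inputs (invented sequences, for illustration only).
voynich_like = "qokedy qokeedy okedy chedy shedy chey".split()
english_like = "the quick brown fox jumps over".split()

print(near_repeat_rate(voynich_like), near_repeat_rate(english_like))
```

Run on a real transcription and on a comparison corpus of similar length, the two rates would give a crude version of the contrast described above.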


RE: Sequential word repetitions in the VMS - crezac - 04-02-2016

Quite a bit of the discussion on this topic centers on whether word repetition reduces information content, means we aren't dealing with a natural language, blah blah blah, etc. etc. etc.  Those last two weren't comments on the nature of the discourse here, but examples of natural language English that repeats, carries meaning beyond what the phonemes do alone and isn't used in the kind of formal manuscript some people seem to be assuming VMS represents.  There is no reason to assume VMS is anything comparable to a document being prepared for publication. It could just as easily be lab notes, a spell book, someone's homework or some other type of practice text -- all of which you'd expect to find repetition in.

I think it's a safe assumption that whoever wrote it wanted to be able to get as much information out of it when he or she read it as there was put into it when it was written.  That's really just a definition of writing, but that's what we're working with here.  

Also safe, but not as safe, is the assumption that some of what is written represents nouns.  We can express things with repetition of nouns that are culturally significant in English. (Emphasis and disgust: crap, crap, crap, crap, crap or abstract references: Six Six Six or urgent demands: More More More)  As we add more parts of speech and grammar, repetition still has roles to play, but our assumptions become reasonable rather than safe.  Or possibly reasonably safe.

When we start getting into syntax and semantics, given that we aren't even that sure of the character set, our assumptions are recognizably assumptions and eventually we're going to be asked to provide a basis for them.  So basing them on preconceptions or the fact that lots of other people make the same assumptions might not be the best starting point.  We should at least try to have more than one set of assumptions so we have a fallback position.
 
The herbal section, recipes section, astronomy section, and so on are useful ways of organizing the manuscript for discussion, but they are all distinctions made on pictures rather than any analysis of the text.  So when we start trying to guess which words we will find where in the text based on something like word frequency, things get crazy.  I don't actually believe in reasonably crazy, so I can't qualify this one.
 
But that's another thread; where I'm going with this one is that on most of the levels of looking at VMS, I don't see word repetition or character repetition as a problem; especially if there isn't more of it than we see in English, Latin or whatever languages we want to translate Voynichese into. Maybe rather than considering if we have too much repetition we should consider where natural languages allow and/or require it.


RE: Sequential word repetitions in the VMS - Torsten - 04-02-2016

@crezac
A surprising feature of the VMS is its weak word order. But the order is only different than expected. The words co-occur with similar ones. Additionally, similar words have comparable frequencies. Both observations point to a relation between similar glyph sequences. The question therefore is: What is the relation between similar glyph sequences?

Context dependency alone, as in a poem or even a prayer, is not enough to explain the relation between similar words. In natural languages the number of similar words is already too small for that.

See for instance: [link] or [link]


RE: Sequential word repetitions in the VMS - -Job- - 05-02-2016

The amount of word variability in the Voynich is one reason I'm not convinced the text is meaningful.

A while ago I assembled the [link] page to illustrate this feature of the Voynich. Level 1 lists the set of words that differ from chedy by a single character. Level 2, by two characters. And so on.

Level 1 has 50 words. Level 2 has 361 words.
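A small sketch of how such levels could be computed. It assumes "differ by N characters" means Levenshtein edit distance N (the page may use a different definition, e.g. substitutions only), and the vocabulary below is a tiny hand-picked list for illustration, not the real word list behind the 50 and 361 counts.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def neighbours_by_level(target, vocabulary, max_level=2):
    """Group vocabulary words by their edit distance (1..max_level) from target."""
    levels = {k: [] for k in range(1, max_level + 1)}
    for word in sorted(set(vocabulary)):
        d = levenshtein(target, word)
        if 1 <= d <= max_level:
            levels[d].append(word)
    return levels

# Hypothetical mini-vocabulary, for illustration only.
vocab = ["chedy", "shedy", "chey", "cheedy", "qokedy", "chdy", "daiin", "okedy"]
levels = neighbours_by_level("chedy", vocab)
print(levels)
```

Feeding in the full set of word types from a transcription would give the level sizes for any chosen centre word.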

It's plausible that the author produced the text by taking a previously written word, modifying it slightly, and writing it down.

That's certainly the process I would follow to fill a page with text after the first few folios, because it's purely mechanical: no thought required. In fact, I have probably done this before.

I can also see Zipf's law emerging in such a text given that common words are more likely to be reused.
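The copy-and-modify idea can be tried out with a toy simulation. This is only a sketch under strong assumptions: the glyph alphabet, the modification probability, the three seed words and the uniform choice of source word are all invented for the example, and nothing here claims to reproduce the actual statistics of the manuscript.

```python
import random
from collections import Counter

GLYPHS = "acdehiklnoqrsty"  # assumption: an arbitrary toy alphabet

def mutate(word, rng):
    """Apply one random single-character edit: substitute, insert or delete."""
    op = rng.choice(["sub", "ins", "del"])
    i = rng.randrange(len(word))
    g = rng.choice(GLYPHS)
    if op == "sub":
        return word[:i] + g + word[i + 1:]
    if op == "ins":
        return word[:i] + g + word[i:]
    return word[:i] + word[i + 1:] if len(word) > 1 else word

def generate(seed_words, n_tokens, p_modify=0.3, seed=0):
    """Grow a text by copying earlier tokens, sometimes with a small change."""
    rng = random.Random(seed)
    text = list(seed_words)
    while len(text) < n_tokens:
        # Picking uniformly from the text so far favours already-common words.
        source = rng.choice(text)
        text.append(mutate(source, rng) if rng.random() < p_modify else source)
    return text

text = generate(["chedy", "daiin", "qokeedy"], 2000)
freqs = sorted(Counter(text).values(), reverse=True)
print(len(freqs), "word types; top counts:", freqs[:10])
```

Plotting `freqs` against rank on log-log axes would show whether this particular toy process gives anything Zipf-like; changing `p_modify` changes the balance between reuse and new variants.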

It's consistent with several features of the text. What are the main arguments against this theory?


RE: Sequential word repetitions in the VMS - ReneZ - 05-02-2016

While I am fully undecided whether the text is meaningful or not, there is clear evidence of planning, and the creation as a whole was not a purely mechanical process.
The strongest argument is the different word distribution in the labels, i.e. these don't follow Zipf's law at all.
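One simple way to compare a set of labels with running text is to fit a straight line to log(frequency) against log(rank). The sketch below does that with plain least squares; it is an assumed, simplified check (a slope near -1 is the textbook Zipf case, a slope near 0 means almost every word occurs equally often), not the analysis behind the statement above. The token lists are synthetic.

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) versus log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct word types")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic examples: frequencies 12, 6, 4, 3 follow f = 12 / rank exactly.
zipf_like = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
# A "label-like" list in which every word occurs once.
all_unique = ["w1", "w2", "w3", "w4", "w5"]

print(zipf_slope(zipf_like), zipf_slope(all_unique))
```

A list in which most items occur only once, as one might expect of labels, gives a nearly flat line regardless of how the words were produced, so the comparison needs some care.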


What I could imagine is the following process. (Note, I don't really believe this myself, but it is a possible model).
The author wanted to invent his own language, and made a list of the most common words - a dictionary.
His words (the Voynich words) follow some system. Let's say that he started with the 500 most common words.

As he started to translate a text into his own language, he needed to keep adding words to his list. His system (e.g. if it is number-like) would allow this. This could explain how similar words are found near each other....

There is however a big "if" here.

I wonder whether it is possible to create a dictionary, by which each Voynich word can be matched to one plain text word in some language in such a way that the result is a meaningful text in that language. I.e.: can the Voynich MS be translated word for word?
This is really independent of whether the system behind the MS text includes encryption or not. Suppose the text was encrypted using many nulls, then several different Voynich words map to the same plain text word.

My tendency is to think that the answer is no. (but of course I don't know).


I believe that a lot of people are assuming (possibly not consciously) that the answer is yes.
For me it's a question that divides all possible solutions into two halves.
There is much more to be said about this (e.g. the relevance of the word spaces) but that will be for another time.


RE: Sequential word repetitions in the VMS - Torsten - 05-02-2016

@Job 

I have published the auto copying hypothesis in 2014 (see [link]).
So far, three responses were published: [link], [link] and [link].
Two are in favour of my hypothesis and Gordon Rugg only says that he is not convinced. 
In summary I would say there are no published arguments against my hypothesis.

@ReneZ
For the label topic see [link] p. 7 - 9. The scribe probably used the labels less frequently as a source for the generation of other words.

Your dictionary hypothesis does not explain why similar words co-occur throughout the whole VMS (see [link] and see [link]). Nor can it explain why similarly spelled word types occur with predictable frequencies (see [link] p. 6). They occur with comparable frequencies, whereas types which contain less frequent glyphs or bigrams in most cases occur less frequently (see [link] p. 66-82).


RE: Sequential word repetitions in the VMS - Emma May Smith - 06-02-2016

(05-02-2016, 12:19 PM)Torsten Wrote:
@Job

I have published the auto copying hypothesis in 2014 (see [link]).
So far, three responses were published: [link], [link] and [link].
Two are in favour of my hypothesis and Gordon Rugg only says that he is not convinced.
In summary I would say there are no published arguments against my hypothesis.

Hi Torsten, I read your paper soon after it came out, but have only just published [link]. It is not the whole of my objections, but a key one.

In short: your theory relies upon the existence of a rigid structure for Voynich words, but this is inadequately handled and unresolved.


RE: Sequential word repetitions in the VMS - Torsten - 07-02-2016

Hi Emma,
 
You interpret the set of rules as a way to generate the text. But the rules only describe the observations made for the VMS (see p. 14: ... describe the changes for similar glyph groups ...). Therefore rule II "Copy a glyph group and add one or more glyphs" doesn’t say that you must add random glyphs anywhere all the time when you generate new words. It only says that you can. This is a big difference! I should have expressed more clearly that the rules are only observations and that they can’t be used as instructions to generate the text.
 
Your suggestion that "the right two characters must be added in the right place, to make a valid word" is correct 97% of the time. To generate a text you would have to add rules, for example that 'ch', 'sh', 'n', 'r', 's', 'l', 'd', 'm', 'q', 'k', 't', 'p', 'f' should not be consecutive (see [link], cited here as Timm 2015: p. 5). In fact, in most cases observation II is only used to add a prefix like 'l', 'o', 'ch' or 'q' (see Timm 2015: p. 5). Therefore at first glance it would make sense to reformulate the observations in a stricter form. But this would result in numerous exceptions like 'otkchedy' in <f104.P.17>, 'okdy' in <f103r.P.5>, 'qokdy' in <f105.P2.12>, 'okedyd' in <f59v.P.5>, 'dokedy' in <f84v.P.11> etc.
You could try to eliminate these exceptions by interpreting them as errors. Unfortunately they are not errors. For instance, beside the word 'qokdy' (4 times) a word 'qopdy' (1 time) also exists (see Timm 2015: p. 79). Furthermore, the words 'qokd' (1 time) and 'okdy' (1 time) make it hard to interpret the four instances of 'qokdy' as errors. Strict rules simply don’t work for the VMS. Therefore you can't use them. In other words, that anything can happen in the VMS doesn't mean that anything will happen all the time.
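To see how a strict version of the "should not be consecutive" rule would flag words, here is a small checker. It is a sketch under two assumptions that are not spelled out in the post: 'ch' and 'sh' are treated as single glyphs when splitting a word, and "consecutive" is read as any two glyphs from the list standing next to each other. The function names are made up.

```python
# Glyphs listed in the post as ones that "should not be consecutive".
NON_CONSECUTIVE = {"ch", "sh", "n", "r", "s", "l", "d", "m", "q", "k", "t", "p", "f"}

def glyphs(word):
    """Split a word into glyphs, treating 'ch' and 'sh' as single units."""
    out, i = [], 0
    while i < len(word):
        if word[i:i + 2] in ("ch", "sh"):
            out.append(word[i:i + 2])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out

def violations(word):
    """Return adjacent glyph pairs where both glyphs are in NON_CONSECUTIVE."""
    g = glyphs(word)
    return [(a, b) for a, b in zip(g, g[1:])
            if a in NON_CONSECUTIVE and b in NON_CONSECUTIVE]

for w in ["chedy", "okdy", "qokdy", "otkchedy"]:
    print(w, violations(w))
```

Under this reading 'okdy', 'qokdy' and 'otkchedy' are flagged while 'chedy' is not, which is the kind of exception list a strict rule would have to explain away.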
 
The explanation is your point that "if the writer could not simply make any alteration to any word, then it must be that each word has a small number of possible antecedents. Why would the writer limit themselves in such a way?" Indeed the scribe could have made any alteration to any word. But in the VMS we only see the changes he actually made. The scribe did not limit himself to a set of rules; he simply repeated the same ideas most of the time. That he was not limited to a given set of rules can be shown by ideas used only one or two times. One example of such an idea is the change of the bigram "or" into "on" on page <f37v> (see [link], cited here as Timm 2016: p. 5). There are numerous other examples. See Timm 2015 p. 31-34 for some of them.
 
By the way, you can try it yourself and search for the source words for "errors" like 'otkchedy', 'okdy', 'qokdy', 'okedyd', 'dokedy' etc. Most of the time the source word is obvious to me. For instance, the word before 'dokedy' is 'qotedy', in the line above 'otkchedy' a word 'qokchedy' can be found, and so forth. For me the examination of the exceptions was the key element to understanding what is going on within the VMS.
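Anyone who wants to try this search on a transcription could start from something like the sketch below. It looks back over a window of preceding tokens and lists the ones most similar to the word in question. The similarity measure is Python's `difflib` ratio, used here only as a convenient stand-in; the window size, cutoff and function name are assumptions, and the token lists contain just the word pairs mentioned above plus a filler word.

```python
import difflib

def likely_sources(tokens, index, window=20, cutoff=0.6):
    """Return earlier tokens (within `window`) that resemble tokens[index], best first."""
    target = tokens[index]
    recent = tokens[max(0, index - window):index]
    candidates = list(dict.fromkeys(recent))  # drop duplicates, keep order
    return difflib.get_close_matches(target, candidates, n=3, cutoff=cutoff)

# 'qotedy' directly before 'dokedy', as described in the post.
line_a = ["daiin", "qotedy", "dokedy"]
# 'qokchedy' somewhere before 'otkchedy' ('daiin' is filler).
line_b = ["qokchedy", "daiin", "otkchedy"]

print(likely_sources(line_a, 2))
print(likely_sources(line_b, 2))
```

With tokens in reading order and a window of roughly one or two lines, the top candidate for each "exception" can then be checked by eye against the page.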