The Voynich Ninja
The location of <aiin> and <ain> groups - Printable Version



The location of <aiin> and <ain> groups - ThomasCoon - 11-02-2017

Whenever I am analyzing the VMS text, I start with objective facts and then try to figure out the paradigm they fit into. Here is a fact about the placement of "aiin" and "ain":

<aiin> and <ain> appear word-initial over 500 times but are never once line-initial.

To me, this is an argument against the following theories:
  • The Autocopying hypothesis: If the VMS scribe was just manufacturing text through autocopying, why is there this very strict rule about never placing <aiin> at the beginning of a line - especially when <aiin> can be the beginning of a word (or a standalone word) in at least five hundred other instances? Wouldn't we expect at least one occurrence of <aiin> beginning a line, instead of 500-to-0?

  • The "Natural language" hypothesis: if Voynich is a natural language written in an unknown script, it would be odd that certain words can be found anywhere else in a paragraph but never at the beginning of a line of text (note: I didn't say "beginning of a sentence").
Discuss!


RE: The location of <aiin> and <ain> groups - Koen G - 11-02-2017

Hehehe Thomas, it's a good observation but bad conclusions.
I guess what you describe would be considered a "LAAFU" phenomenon, and those can be explained in many ways.

Autocopying: anyone arguing that the VM has been somehow procedurally generated could explain this by saying that a new line resets or changes the procedure. If the procedure never causes aiin at the start, then this is sufficient to explain the observation.

Natural language: this is trickier because language is not maths or logical procedure. So one could even need "psychological" arguments as an explanation. It seems unlikely that each line is a separate sentence, but what if, for example, phrases don't bridge lines? So the scribe would abbreviate or arrange his text to fit the last complete syntactic block on the current line. And some words can't start a phrase.

Alternatively, and perhaps more likely, we could say that "aiin" is a word like "and". In Latin, for example, this can be expressed as "et" and "-que". Imagine the scribe uses "-que" at the end of a line. This prevents "et" from ever being used at the beginning.

A final option is something I've seen Emma propose, which involves the scribe adding a sound to the beginning of lines because he's only used to speaking this language, not writing it. When starting a new line, he would add a sound which is normally only added in fluent spoken language. She can explain it better.

Just to say, there are many possible explanations.


RE: The location of <aiin> and <ain> groups - Torsten - 11-02-2017

(11-02-2017, 01:57 AM)ThomasCoon Wrote: <aiin> and <ain> appear word-initial over 500 times but are never once line-initial.

There is much more you can say about line initial words:

Some elements are typical for a specific position within a line. For instance, 62% of the occurrences of the 'm'-glyph are at the end of a line. 

The first word of a paragraph is usually highlighted by an additional gallow glyph ('k', 't', 'p', 'f') as the first sign. This is the case in 617 out of 716 paragraphs (86%). Within a paragraph, the words at the beginning of a line frequently start with a glyph 'y', 'o', 'd' or 's'. This is more frequent at the beginning of a line (68%) than within a line (50%). The word 'daiin' occurs line initial 158 times, the word 'saiin' 59 times, the word 'dain' 49 times and the word 'sain' 36 times.

Therefore the average length of the first word in a line increases. Moreover, the second word is shorter than the first word in 48% of the lines and longer in only 32%. Statistically, the second word in a line is a subgroup of the first word twice as often (2.6%) as any other word in the line (1.3%).

These observations can be explained as an unintended side effect of the autocopying method. The source for the first word in each line could only be found within the previous lines. Since the first and the last word in each line are easy to spot, the most obvious way is to pick them as a source for the generation of a word at the beginning or at the end of a line. For the second word it is also possible to select the first word as a source. Since the first word in a line usually has a prefix, the simplest change is to remove this prefix.

Within a cipher, the paragraph and line initial glyphs could stand for markers of some kind.

For the assumption of a natural language it would indeed be hard to explain that the text responds to the page it is written on.
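
The first-word and second-word figures quoted above could be checked against a transliteration along the following lines. This is only a minimal sketch: the function name and the toy lines are invented for illustration (they are not real transliteration data), and "subgroup" is assumed here to mean "shorter contiguous substring", which may not match the definition behind the 2.6% figure.

```python
def first_second_stats(lines):
    """Compare the first and second word of each line.

    Returns counts of lines where the second word is shorter or longer
    than the first, and where it is a proper substring of the first.
    """
    stats = {"lines": 0, "shorter": 0, "longer": 0, "substring": 0}
    for line in lines:
        tokens = line.split()
        if len(tokens) < 2:
            continue  # a one-word line has no second word to compare
        first, second = tokens[0], tokens[1]
        stats["lines"] += 1
        if len(second) < len(first):
            stats["shorter"] += 1
        elif len(second) > len(first):
            stats["longer"] += 1
        # Assumption: "subgroup" = contiguous substring, excluding identity.
        if second != first and second in first:
            stats["substring"] += 1
    return stats


# Toy lines made up for this example; NOT real Voynich text.
TOY_LINES = [
    "daiin aiin chedy",
    "shedy qokeedy ar",
    "okaiin okaiin dal",
    "ol",
]

result = first_second_stats(TOY_LINES)
```

Run over a full transliteration, dividing each count by `result["lines"]` would give percentages comparable to the 48% and 32% above.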


RE: The location of <aiin> and <ain> groups - Emma May Smith - 11-02-2017

Hi Thomas, Koen gives a pretty fair summary of my thoughts. Basically, at the beginning of a line, the character [s] is added on to words beginning [a] and some words beginning [o]; the characters [y] and [d] are added on to many words beginning [ch, sh]. These characters are additional and should be removed to find the underlying word.

I think the cause may be a process known as sandhi, where the sounds of neighbouring words (or syllables) influence one another. It's a very common process and may have been quite normal in Voynichese. But the line break, for some reason, made the scribe spell it out. It could have been simple wariness on the part of the scribe that the reader might not remember the word at the end of the line before, or it could be that the scribe considered the line break itself somehow linguistic—that is, an actual break in the text, or pausa.

There are other line patterns in the text where the statistics for certain words change depending on their position in the line. Another one, similar to the one you point out, is that word initial [k] is less common than word initial [t], even excluding Grove Words. Voynichese.com gives 216 occurrences of word initial [t] in such a position, but only 27 of word initial [k]. This may turn out to be an important part of the puzzle.


RE: The location of <aiin> and <ain> groups - nickpelling - 11-02-2017

It's a good observation to be starting from. The implication is that at least some of the single characters before aiin at the start of a line are in some way nulls.

For me, the interesting question is whether the proportion of saiin to (non-s)aiin at the line start is the same as the proportion of aiin to (single-letter)aiin. If it is, it would give support to the suggestion that line-initial s- is a null.
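
One way the comparison could be set up is sketched below. It is an illustration under stated assumptions, not an established method: the function name and toy lines are invented, each transliteration letter is treated as one character, and "(single-letter)aiin" is read as any word made of exactly one letter followed by "aiin".

```python
def aiin_counts(lines, stem="aiin", null="s"):
    """Count the four quantities needed for the ratio comparison.

    Returns (start_null, start_other, body_bare, body_other):
      start_null  - line-initial null+stem (e.g. saiin)
      start_other - line-initial other one-letter+stem (e.g. daiin)
      body_bare   - bare stem elsewhere in the line (aiin)
      body_other  - one-letter+stem elsewhere in the line
    """
    start_null = start_other = body_bare = body_other = 0
    for line in lines:
        for i, tok in enumerate(line.split()):
            one_plus = len(tok) == len(stem) + 1 and tok.endswith(stem)
            if i == 0:
                if tok == null + stem:
                    start_null += 1
                elif one_plus:
                    start_other += 1
            elif tok == stem:
                body_bare += 1
            elif one_plus:
                body_other += 1
    return start_null, start_other, body_bare, body_other


# Toy lines made up for this example; NOT real Voynich text.
TOY_LINES = [
    "saiin chedy aiin daiin",
    "daiin aiin shedy aiin",
    "saiin okaiin daiin ar",
]

counts = aiin_counts(TOY_LINES)
```

If `start_null / start_other` came out close to `body_bare / body_other` on real data, that would be in line with the null reading of line-initial s-; the toy lines say nothing either way.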


RE: The location of <aiin> and <ain> groups - Emma May Smith - 11-02-2017

I think we have to be careful about making this only about [aiin] and [ain]. Of the nearly 2,000 tokens that begin [a], fewer than 1.5% occur at the beginning of lines. Of the 500 or so tokens which begin [sa], nearly 40% occur at the beginning of lines. Likewise, words beginning [so], [dch], [dsh], [ych], and [ysh] are also strongly linked to this position.
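
Percentages like these can be recomputed for any prefix with a short script. The sketch below assumes a transliteration stored as one string per manuscript line with words separated by spaces; the function name and the toy lines are hypothetical and only show the shape of the calculation.

```python
def line_initial_share(lines, prefix):
    """Return (total, line_initial, share) for tokens starting with prefix."""
    total = 0
    initial = 0
    for line in lines:
        for i, tok in enumerate(line.split()):
            if tok.startswith(prefix):
                total += 1
                if i == 0:  # first token of the line
                    initial += 1
    share = initial / total if total else 0.0
    return total, initial, share


# Toy lines made up for this example; NOT real Voynich text.
TOY_LINES = [
    "saiin chedy aiin okaiin",
    "daiin aiin shedy ain",
    "sain qokeedy aiin ar",
]

a_stats = line_initial_share(TOY_LINES, "a")
sa_stats = line_initial_share(TOY_LINES, "sa")
```

The same call with "so", "dch", "dsh", "ych" or "ysh" as the prefix would cover the other groups mentioned.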


RE: The location of <aiin> and <ain> groups - Torsten - 11-02-2017

(11-02-2017, 01:33 PM)Emma May Smith Wrote: There are other line patterns in the text where the statistics for certain words change depending on their position in the line. Another one, similar to the one you point out, is that word initial [k] is less common than word initial [t], even excluding Grove Words. Voynichese.com gives 216 occurrences of word initial [t] in such a position, but only 27 of word initial [k]. This may turn out to be an important part of the puzzle.

Interesting observation. In line initial position and word initial position 't' and 'p' are more common than 'k' and 'f'.

                       k      t     p    f
line initial         118    405   385   41
word start          1158   1065   545  124
word medial+final   9778   5898  1085  381
total              10936   6963  1630  505
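
To compare the four gallows on an equal footing, the raw counts in the table can be turned into shares. The sketch below uses only the numbers from the table; the variable names are made up, and it assumes that the "line initial" row counts a subset of the "word start" row, which the post does not state explicitly.

```python
# Counts copied from the table above.
GALLOWS = {
    "k": {"line_initial": 118, "word_start": 1158, "total": 10936},
    "t": {"line_initial": 405, "word_start": 1065, "total": 6963},
    "p": {"line_initial": 385, "word_start": 545, "total": 1630},
    "f": {"line_initial": 41, "word_start": 124, "total": 505},
}

shares = {}
for glyph, c in GALLOWS.items():
    shares[glyph] = {
        # Share of all occurrences of the glyph that begin a word.
        "word_start_of_total": c["word_start"] / c["total"],
        # Assumption: line-initial counts are a subset of word-start counts.
        "line_initial_of_word_start": c["line_initial"] / c["word_start"],
    }

for glyph, s in shares.items():
    print(
        f"{glyph}: word start {s['word_start_of_total']:.1%} of total, "
        f"line initial {s['line_initial_of_word_start']:.1%} of word start"
    )
```

Under that assumption, about 10% of word-initial 'k' is line initial against about 71% of word-initial 'p', so the gap between the gallows is much wider in relative terms than the raw line-initial counts suggest.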


RE: The location of <aiin> and <ain> groups - nickpelling - 12-02-2017

(11-02-2017, 12:34 PM)Torsten Wrote: These observations can be explained as an unintended side effect of the autocopying method. The source for the first word in each line could only be found within the previous lines. Since the first and the last word in each line are easy to spot, the most obvious way is to pick them as a source for the generation of a word at the beginning or at the end of a line. For the second word it is also possible to select the first word as a source. Since the first word in a line usually has a prefix, the simplest change is to remove this prefix.

But isn't there a completely different statistical result (I vaguely recall Mark Perakh mentioning it some years ago, but I suspect that even by then it was a commonplace) that says that the first word of each line is on average slightly longer than all the other words in the line, not just the second word?

The proposal that this kind of thing holds true for the first word of a page or paragraph is now well-established in Voynich analysis, but less so for non-paragraph-initial and non-page-initial words.

If this is correct for all lines (not just paragraph-initial and page-initial lines), then it strongly suggests that (a) it is not the first word in a line that gets shortened into some second word by having some putative prefix removed and/or by autocopying, but instead (b) that the first word is typically longer than all the other words because an entirely separate process is going on there, one that prepends an extra letter to the first word of each line.


RE: The location of <aiin> and <ain> groups - Anton - 12-02-2017

Quote:<aiin> and <ain> appear word-initial over 500 times

Are there examples of vords beginning with aiin or ain, except for these two vords themselves?


RE: The location of <aiin> and <ain> groups - Emma May Smith - 12-02-2017

(12-02-2017, 12:21 AM)nickpelling Wrote: If this is correct for all lines (not just paragraph-initial and page-initial lines), then it strongly suggests that (a) it is not the first word in a line that gets shortened into some second word by having some putative prefix removed and/or by autocopying, but instead (b) that the first word is typically longer than all the other words because an entirely separate process is going on there, one that prepends an extra letter to the first word of each line.

It doesn't have to be every line, though, does it? Even every other line having an added character would up the average.
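
The arithmetic behind this can be shown with a toy calculation. The word lengths below are hypothetical and the function name is invented; the point is only that adding one character to a fraction of the first words raises the mean by that same fraction.

```python
from statistics import mean


def mean_with_prefix(base_lengths, every=2, extra=1):
    """Mean first-word length when every `every`-th line gains `extra` characters."""
    adjusted = [
        n + extra if i % every == 0 else n
        for i, n in enumerate(base_lengths)
    ]
    return mean(adjusted)


# Hypothetical first-word lengths, all 5 characters before any prefix is added.
BASE = [5, 5, 5, 5]

unchanged = mean(BASE)                          # no line gets a prefix
every_other = mean_with_prefix(BASE, every=2)   # every other line gets one
every_line = mean_with_prefix(BASE, every=1)    # every line gets one
```

Here the mean goes from 5 to 5.5 when every other line gains a character and to 6 when every line does, so a longer average first word does not by itself show how many lines are affected.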