The Voynich Ninja
Sequential word repetitions in the VMS - Printable Version



RE: Sequential word repetitions in the VMS - MarcoP - 05-09-2017

(04-09-2017, 06:17 PM)ReneZ Wrote: Thai uses reduplication for two cases, one to make a particular kind of plural, and the other resulting from the fact that some classifier words (something European languages tend not to use) are the same as the nouns they classify.

Hi Rene,
do you think this could come close to the frequency of reduplication in the VMS?
Exact reduplication and quasi-reduplication each occur in about 1% of word pairs in the VMS. In 100 words, one can expect to see two instances of reduplication (in one of the two forms).

(04-09-2017, 06:17 PM)ReneZ Wrote: Conveniently, Thai usually does not repeat the word in writing, but uses a special symbol (ๆ) which just means: previous word repeated.

- - -

Edit (addition): however, very inconveniently, Thai does not use spaces to separate words. They are just strung together. So, while it would be very easy to count the reduplications in a text, it is not easy to automatically count the different words...

So wordๆ is the written form of "word word", is this correct?
Is quasi-reduplication possible? Could something like "word xword" be written wordxๆ? If so, can the position of x vary (e.g. "xword word")?
Or are there other ways in which the two occurrences of the word are not perfectly identical?
Can triplication occur as well?

It's great to have the opportunity to see reduplication in action in a living language!


RE: Sequential word repetitions in the VMS - ReneZ - 05-09-2017

Hi Marco,

the ๆ symbol is only used for exact repetition. There is some quasi-reduplication in colloquial expressions, but the repeated part is usually shorter than the non-repeated part.
Typical example: "sanuk" means fun, "sanuk-sanaan" also means fun but more elaborately.

I had a quick look around. Apart from reduplication of nouns, it is perhaps even more frequently used for adjectives, and the effect is of reducing the quality of the adjective a bit.
ngao = lonely. ngao-ngao = feeling a bit lonely.
I quickly bumped into a CD title which has four reduplicated words in succession, but that is certainly not usual.

1% might just be possible, depending on the type of text....


By the way, Thai spelling is quite particular. The above word ngao-ngao is written:
เหงาๆ
The เ is the first half of 'ao'. The character เ is one of several that has to be at the start of a syllable (or word).
The ห is just there to change the tone from neutral to rising.
The ง is the 'ng'
The า is the second half of 'ao'.

The first character pronounced in this case is the third one.
This word is sorted in the dictionary under ห :)
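For anyone who wants to check the written order of the characters described above, here is a minimal Python sketch using the standard unicodedata module. The variable names are only illustrative, and the word is given as escapes so that the code points are unambiguous.

```python
import unicodedata

# ngao-ngao as written: four letters followed by the repetition mark.
word = "\u0e40\u0e2b\u0e07\u0e32\u0e46"

# Unicode name of each character, in written (storage) order.
names = [unicodedata.name(ch) for ch in word]
for ch, name in zip(word, names):
    print(f"U+{ord(ch):04X}  {name}")

# The repetition mark is a single code point, so counting it is a plain
# substring count (no word segmentation needed).
repeat_marks = word.count("\u0e46")
```

The listing shows the leading vowel sign stored first, then the silent ห, then ง, which matches the description of the third character being the first one pronounced.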


Taken together with the fact that about half the vowels are not written, and all words are strung together, one can see that reading Thai for the beginner is quite a challenge.

I am not aware of triplication, except for laughter, which is mostly written 555 since 5 = ha. Perhaps also other exclamations.


RE: Sequential word repetitions in the VMS - MarcoP - 05-09-2017

These are figures comparing exactly repeating words in the Currier A and B “languages”. All data are based on Takahashi's transcription.

Currier A
words: 9025
repetitions: 96
different repeating words: 45

Currier B
words: 20063
repetitions: 181
different repeating words: 74

Since the B corpus I considered is about twice as large as the A corpus, the overall numbers are quite consistent. Of course, when one looks at individual words, things differ.
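The consistency of the two corpora can be checked with a few lines of Python. This is only a sketch of the arithmetic: the rate is taken per word rather than per word pair, which is a simplifying assumption that makes little difference at these corpus sizes.

```python
# Figures quoted above (Takahashi's transcription).
corpora = {
    "Currier A": {"words": 9025, "repetitions": 96},
    "Currier B": {"words": 20063, "repetitions": 181},
}

# Exact repetitions per word (assumption: per word, not per word pair).
rates = {name: c["repetitions"] / c["words"] for name, c in corpora.items()}
size_ratio = corpora["Currier B"]["words"] / corpora["Currier A"]["words"]

for name, rate in rates.items():
    print(f"{name}: {100 * rate:.2f}% of words start an exact repetition")
print(f"B is {size_ratio:.2f} times the size of A")
```

Both rates come out close to 1%, in line with the figure mentioned earlier in the thread.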

These diagrams are sorted by the number of repetitions of each word. Green bars are the “expected” number of repetitions based on word frequencies. The red bars are the count of actual exact repetitions for each word (daiin.daiin, chedy.chedy, etc).
   
I think these data make clear that word frequency doesn't explain the number of repetitions.
For instance, in language A, the number of occurrences of chol is half the number of occurrences of daiin, yet chol has many more repetitions.
The preference for the repetition of q- words is clearer in language B, which has common words with that prefix. But language A also has several q- words that repeat (even if only once).
Language B seems to have a larger repertoire of often-repeating words. Both languages have long “tails” of words whose repetition only occurs once.
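The post does not say how the “expected” counts were computed. One simple possibility, given here purely as an assumption, is to treat consecutive words as independent, so that a word occurring c times in a text of N words is expected to follow itself about c*c/N times. The function name and the toy word list below are made up for illustration.

```python
from collections import Counter

def repetition_stats(words):
    """Return (frequency, observed repeats, expected repeats) per word.

    Expected repeats assume independent consecutive words: c * c / N.
    This is an illustrative model, not necessarily the one used for
    the diagrams.
    """
    n = len(words)
    freq = Counter(words)
    observed = Counter(a for a, b in zip(words, words[1:]) if a == b)
    expected = {w: c * c / n for w, c in freq.items()}
    return freq, observed, expected

# Toy word list (not real transcription data).
sample = "daiin chol chol daiin chor chol chol daiin".split()
freq, observed, expected = repetition_stats(sample)
for w in freq:
    print(w, freq[w], observed[w], round(expected[w], 3))
```

Under such a model, a word like aiin that is frequent but never repeats would show a clear shortfall of observed against expected repetitions.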


These are the same data ordered by decreasing word frequency.
   


I want to point out the comparison between daiin and aiin in language B. The two words have similar frequencies, but aiin (which is slightly more frequent) never repeats while there are three occurrences of daiin.daiin.
In language A the frequencies of daiin and aiin are quite different, but again daiin repeats and aiin doesn't.

These are counts on the whole manuscript based on two different transcriptions.

              Takahashi  Zandbergen

daiin            750       697
daiin.daiin       16        11

aiin             419       296
aiin.aiin          0         0



RE: Sequential word repetitions in the VMS - Davidsch - 05-09-2017

(15-08-2015, 11:19 PM)Anton Wrote: Is there any work or resource where all sequential word repetitions in the VMS are listed? I mean not the "Timm's pairs" or "Jackson sequences" but exact repetitions, like
Code:
daiin daiin daiin
I've called these horizontal repeats, side-by-side words.


RE: Sequential word repetitions in the VMS - MarcoP - 05-09-2017

(05-09-2017, 02:27 PM)Davidsch Wrote:
(15-08-2015, 11:19 PM)Anton Wrote: Is there any work or resource where all sequential word repetitions in the VMS are listed? I mean not the "Timm's pairs" or "Jackson sequences" but exact repetitions, like
Code:
daiin daiin daiin
I've called these horizontal repeats, side-by-side words.

Thank you, David!
The numbers you provide for exact repetitions are consistent with those discussed in this thread. 
The repeating words you list (the top 11 repeating words) total 115 exact repetitions. Numbers vary according to transcriptions and the details of the corpus used, but exact repetitions in the VMS are in the 250-300 range.

The Vulgate Genesis is shorter than the VMS (about 25K words). I count 10 exact repetitions in Genesis: this comparison confirms that exact repetitions are considerably more frequent in the VMS (in this case, roughly 20 times more frequent).
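As a rough check of the "about 20 times" figure, the two rates can be compared directly. The VMS numbers used here pool the Currier A and B figures posted earlier in the thread (277 repetitions in 29088 words); that pooling is an assumption made for illustration and may not be the corpus actually used for the comparison.

```python
# Vulgate Genesis figures quoted above (word count is approximate).
genesis_words = 25_000
genesis_repetitions = 10

# Assumption: pooled Currier A + B figures from earlier in the thread.
vms_words = 9025 + 20063
vms_repetitions = 96 + 181

genesis_rate = genesis_repetitions / genesis_words
vms_rate = vms_repetitions / vms_words
ratio = vms_rate / genesis_rate

print(f"Genesis: {100 * genesis_rate:.3f}%  VMS: {100 * vms_rate:.3f}%")
print(f"VMS / Genesis: about {ratio:.0f}x")
```

With a different transcription or word count the ratio shifts somewhat, but it stays in the same order of magnitude.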




RE: Sequential word repetitions in the VMS - MarcoP - 05-09-2017

(05-09-2017, 09:55 AM)ReneZ Wrote: ...

Thank you, Rene!
From your discussion, it seems that reduplication has very specific characteristics in Thai which only partially match what can be observed in Voynichese. It will be interesting to see how different languages behave and which ones come closer to what we observe in Voynichese. 

I have seen differences in the pronunciation of the repeated word discussed for other languages. It could be that some of the quasi-repetitions actually denote these differences. But it is also true that the Voynich alphabet seems too small to give indications as detailed as those in your Thai example.


RE: Sequential word repetitions in the VMS - -JKP- - 05-09-2017

(05-09-2017, 09:55 AM)ReneZ Wrote: ...

I had a quick look around. Apart from reduplication of nouns, it is perhaps even more frequently used for adjectives, and the effect is of reducing the quality of the adjective a bit.
ngao = lonely. ngao-ngao = feeling a bit lonely.
...


I found this very interesting since it's the opposite of what usually happens in Western languages, where a duplicated word (whether noun or adjective) tends to intensify rather than diminish.


RE: Sequential word repetitions in the VMS - Koen G - 05-09-2017

JKP: one example I can think of in English is "so-so". This can intensify ("I love this forum so so much!") but also diminish: "The quality of this forum is so-so".

The difference is of course that diminishing reduplication is not productive in current Indo-European languages, it's only seen in a limited number of expressions.


RE: Sequential word repetitions in the VMS - MarcoP - 07-09-2017

I have collected some data about exact repetition and LAAFU (line as a functional unit). If I merge each paragraph into a single line, using Zandbergen's transcription, I count 262 exact repetitions.

Of these:
13 occur at the beginning of a line
16 occur at the end of a line
6 occur across lines

The expected value for each of the three cases is 262/8.5 ≈ 31 (where 8.5 is the average number of words in a line).
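The arithmetic behind this comparison, as a short Python sketch (the variable names are illustrative):

```python
exact_total = 262        # exact repetitions in the merged paragraphs
words_per_line = 8.5     # average number of words in a line

# If repetitions were spread evenly over word positions, each of the
# three line-boundary positions would get about this many.
expected = exact_total / words_per_line

observed = {"START": 13, "END": 16, "ACROSS": 6}
ratios = {pos: n / expected for pos, n in observed.items()}

print(f"expected per position: {expected:.1f}")
for pos, n in observed.items():
    print(f"{pos}: observed {n}, {100 * ratios[pos]:.0f}% of expected")
```

All three positions come out at roughly half the expected value or less.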



I asked myself why exact repetitions in these positions are rarer, and for once I think there is a possible answer.

I have checked quasi-repetitions that add a prefix or a suffix of one or two EVA characters to the first or second word of the repeating pair. In this case too, I have focused on line boundaries. START is line start, END is line end, and ACROSS means that the repetition has the first word instance at the end of a line and the second at the beginning of the following line. In the patterns below, W is the repeated word, p an added prefix and s an added suffix.

PREFIX-1ST   pW.W
PREFIX-2ND   W.pW
SUFFIX-1ST   Ws.W
SUFFIX-2ND   W.Ws
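The four patterns above can be tested on a pair of adjacent words with a small function. This is a minimal sketch, not the code used for the counts in this post: the function name, the max_affix parameter and the tie-breaking rule (prefix is checked before suffix when both would fit) are assumptions.

```python
def classify_pair(first, second, max_affix=2):
    """Classify two adjacent EVA words as a repetition type, or None."""
    if not first or not second:
        return None
    if first == second:
        return "EXACT"
    diff = len(first) - len(second)
    if 1 <= diff <= max_affix:          # first word is the longer one
        if first.endswith(second):
            return "PREFIX-1ST"         # pW.W
        if first.startswith(second):
            return "SUFFIX-1ST"         # Ws.W
    if 1 <= -diff <= max_affix:         # second word is the longer one
        if second.endswith(first):
            return "PREFIX-2ND"         # W.pW
        if second.startswith(first):
            return "SUFFIX-2ND"         # W.Ws
    return None

for a, b in [("chol", "chol"), ("ychol", "chol"), ("ol", "oly"),
             ("daiin", "daiinol"), ("chor", "chol")]:
    print(a, b, classify_pair(a, b))
```

Pairs such as chor.chol, which differ by substitution rather than by an added prefix or suffix, are deliberately left unclassified here.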

            TOTAL  START  END  ACROSS
EXACT         262    13    16      6
PREFIX-1ST    216    50    15      6
PREFIX-2ND    182    12    13     13
SUFFIX-1ST     38     3     3      4
SUFFIX-2ND     63     7    16      7
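The table can be turned into percentages of each row total with a short Python sketch (the dictionary layout is just one convenient way to hold the numbers):

```python
# name: (TOTAL, START, END, ACROSS), copied from the table above
table = {
    "EXACT":      (262, 13, 16, 6),
    "PREFIX-1ST": (216, 50, 15, 6),
    "PREFIX-2ND": (182, 12, 13, 13),
    "SUFFIX-1ST": (38, 3, 3, 4),
    "SUFFIX-2ND": (63, 7, 16, 7),
}

# Sum of the four quasi-repetition types (everything except EXACT).
quasi_total = sum(row[0] for name, row in table.items() if name != "EXACT")

# Percentage of each type found at START, END and ACROSS positions.
shares = {
    name: tuple(100 * n / row[0] for n in row[1:])
    for name, row in table.items()
}

print("quasi-repetitions:", quasi_total)
for name, (start, end, across) in shares.items():
    print(f"{name:<11} START {start:5.1f}%  END {end:5.1f}%  ACROSS {across:5.1f}%")
```

The two values that stand out are PREFIX-1ST at line start and SUFFIX-2ND at line end.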


Quasi repetitions of the different prefix/suffix types total 499 occurrences. Of course, it's reasonable to expect that some of these are coincidental.
These histograms are based on the above numbers. The diagram on the right presents percentages based on the totals.
   
It seems to me that these data confirm that transformations take place at the beginning and at the end of lines.

Exact repetitions rarely appear at line boundaries because, in those positions, they are often transformed into quasi-repetitions of the prefix/suffix type.

Prefixes are added to the first word of the pair when the reduplication takes place at the beginning of a line.
If we indicate with '|' a line break and with '.' a space between words, we observe (for instance) .chol.chol. and |ychol.chol. In 50 cases, in the line initial position, a prefix could have been added to the first word of the pair.
Examples:
<f17v.5,+P0>ychol.chol.dolcheey.tchol.dar.ckhy
<f111v.21,+P0>sair.air.ain.qol.rar.ain.cheey.lkeey.lkain.cheokain.sheo.qo.qokain.chear.alam

<f17v.13,+P0>ykeor.chol.chol.cthol.chkor.sheol
<f67r2.32,@Pb>dosar.odas.air.air.alaiin


Similarly, we observe .ol.ol. and -ol.oly|. This phenomenon is less frequent; still, 25% of quasi-repetitions with suffixed second words occur at the end of a line.

<f19r.12,+P0>ykchor.chor.daiin.daiinol-
<f81r.22,+P0>qotal.chedy.qol.ol.daiin.olchedar.ol.oly-

<f32v.8,+P0>otchol.daiin.daiin.cthodaiin.qotaiin.otchy.d.shan-
<f81r.4,+P0>dchedy.qokain.ol.ol.chcthy.ykeedyal

I think it's possible that reduplication appears to be less common at line boundaries because it is reduced by these phenomena, which transform it into quasi-reduplication. In the case of across-line reduplication, in principle both words could be altered, making the resulting combination harder to detect. This is something that deserves further investigation.
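For reference, counts by line position (START, END, ACROSS) for exact repetitions could be obtained along the following lines. This is a sketch under assumptions: the input is one paragraph given as a list of lines with '.' between words and with locus tags and the line-end '-' already stripped, and the toy paragraph below is invented, not taken from a transcription.

```python
def count_exact_by_position(lines):
    """Count exact repetitions in one paragraph, by line position."""
    counts = {"TOTAL": 0, "START": 0, "END": 0, "ACROSS": 0}
    split_lines = [line.split(".") for line in lines]
    for i, words in enumerate(split_lines):
        last = len(words) - 1
        for j in range(last):
            if words[j] == words[j + 1]:
                counts["TOTAL"] += 1
                if j == 0:
                    counts["START"] += 1   # pair opens the line
                if j + 1 == last:
                    counts["END"] += 1     # pair closes the line
        # Pair split over a line break: last word here, first word of next.
        if i + 1 < len(split_lines) and words[-1] == split_lines[i + 1][0]:
            counts["TOTAL"] += 1
            counts["ACROSS"] += 1
    return counts

# Invented toy paragraph of four lines.
paragraph = [
    "chol.chol.daiin.chor",
    "ykeor.chol.chol.cthol",
    "qokain.ol.ol",
    "ol.daiin.shedy",
]
counts = count_exact_by_position(paragraph)
print(counts)
```

A two-word line consisting of a repeated word would be counted as both START and END here; whether that is desirable depends on how the positions are defined.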

The 50 occurrences of line-initial quasi-repetition are enough to provide some details about the added prefixes. Considering the repetitions of the similar words chol and chor, one sees that different prefixes are applied: d-, o-, ot-, t-, y-, yk-, yt-.
Prefix t- is applied twice at paragraph start. 
But how other prefixes are selected is not completely clear. After a preceding line ending -in, both o- and y- appear. The prefix ot- appears to be added after lines ending -om, -od, -or.

It seems to me that the observed data do not raise too many new questions and confirm other observations. Quasi-repetitions are partly explained by LAAFU effects, fitting with Emma's analysis of the phenomenon.
Anyway, this only accounts for about 10% of the prefix/suffix quasi-repetitions. An analysis of line-middle quasi-repetitions could tell us if other deviations from exact repetition conform to Emma's Transformation Theory (i.e. adaptation to the preceding word-end).


RE: Sequential word repetitions in the VMS - Davidsch - 07-09-2017

@Marco
You could try the exact same but then:
* make SH= ch
* remove the gallows

Have fun.
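One possible reading of this suggestion, sketched in Python. The details are assumptions: "SH = ch" is taken as replacing EVA sh with ch, and "remove the gallows" as deleting the EVA gallows letters t, k, p and f wherever they occur (so a benched gallows such as cth becomes ch). Other readings are possible.

```python
GALLOWS = "tkpf"  # assumption: the four EVA gallows letters
_DROP_GALLOWS = str.maketrans("", "", GALLOWS)

def normalise(word):
    """Merge sh into ch, then delete gallows letters (one possible reading)."""
    return word.replace("sh", "ch").translate(_DROP_GALLOWS)

def normalise_line(line):
    """Apply normalise() to every word of a '.'-separated EVA line."""
    return ".".join(normalise(w) for w in line.split("."))

# Line f17v.13 quoted earlier in the thread, without the locus tag.
line = "ykeor.chol.chol.cthol.chkor.sheol"
print(normalise_line(line))
```

On this line the normalisation turns chol.chol.cthol into three identical words, so counts of exact repetition would be expected to rise after such a transformation.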