The Voynich Ninja
Are perfect-reduplication and quasi-reduplication related? - Printable Version



Are perfect-reduplication and quasi-reduplication related? - MarcoP - 20-02-2020

By perfect-reduplication, I mean the exact consecutive repetition of the same word: e.g. daiin.daiin

By quasi-reduplication, I mean two consecutive words that are very similar to each other: e.g. qokedy.qokey

In [link], I mentioned the transcription of the French [link]. While discussing a different subject with Koen, I noticed that this file contains examples of both reduplication and quasi-reduplication. These examples are quite different from what we see in the VMS: the repetition occurs across lines, the first occurrence is written in red and entirely in lower case, and the second occurrence is written in black with a red capital initial (the title of a recipe, written in red, often also occurs as the first word of the recipe).
These are three examples of perfect reduplication (brouet.brouet gruyau.gruyau coulis.coulis) and three examples of quasi-reduplication (blan.blanc escreissez.ecreuissez mulet.mullet). 
The "perfect" examples of course are not-so-perfect because of the differences listed above; the first case also has different 'r's, so one could argue that it is b2ouet.brouet (or b2ouet.Brouet). For the sake of argument, please ignore these differences.
   

The point I want to make is that here quasi-reduplication appears to be accidental: the two words are not really different, the differences are due to arbitrary spelling variation.

Quasi-reduplication is a rarely discussed Voynich phenomenon. Timm and Schinner have provided a model for it. I may be wrong, but as I understand it they create reduplication by means of this process:

1. words are randomly selected from a certain pool (a previous page)
2. when writing down a word, it can be slightly modified

Point 1 is responsible for both reduplication and quasi-reduplication: word selection does not check that the new word is different from the immediately previous word.
Point 2 is largely responsible for quasi-reduplication: a word identical to the previous one is selected and it is slightly altered before or while writing it on the page.
In this model, reduplication and quasi-reduplication appear to be closely related.
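
To make the two points concrete, here is a minimal Python sketch of the process as summarised above. It is not Timm and Schinner's actual algorithm: the pool, the `p_modify` parameter and the single-character substitution rule are all assumptions chosen for illustration.

```python
import random

def modify(word, alphabet, rng):
    """Substitute one character (a hypothetical modification rule)."""
    i = rng.randrange(len(word))
    choices = [c for c in alphabet if c != word[i]]
    return word[:i] + rng.choice(choices) + word[i + 1:]

def generate(pool, n_words, p_modify, rng):
    """Point 1: pick words from the pool at random. Point 2: sometimes alter them."""
    alphabet = sorted(set("".join(pool)))
    out = []
    for _ in range(n_words):
        word = rng.choice(pool)            # no check against the previous word
        if rng.random() < p_modify:
            word = modify(word, alphabet, rng)
        out.append(word)
    return out

def count_pairs(words):
    """Count adjacent identical pairs and adjacent pairs one substitution apart."""
    exact = quasi = 0
    for a, b in zip(words, words[1:]):
        if a == b:
            exact += 1
        elif len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1:
            quasi += 1
    return exact, quasi

rng = random.Random(0)
pool = ["daiin", "qokedy", "qokey", "chedy", "shedy", "okeey"]  # toy pool
text = generate(pool, 2000, 0.3, rng)
exact, quasi = count_pairs(text)
print(exact, quasi)
```

In this toy version both counts come out of the same selection step, so changing the pool size or `p_modify` moves them together, which is the sense in which the two phenomena are related in such a model.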


The other explanation of quasi-reduplication I am aware of is the [link] encryption system by Rene. I understand that here a nomenclator is created by adding new words as they are found in the text. Each word is replaced by a cipher word that is very similar to the cipher word for the previous word: "the quick brown fox" becomes something like "2134 2135 2136 2137". Of course, if a word is already present in the nomenclator, the cipher word in the nomenclator is used, so this quasi-reduplication pattern only happens under certain conditions.
In this case, each minimal difference between two words is highly significant. It is no longer true that similar words typically have similar meanings (as in plain text). Perfect-reduplication is an entirely unrelated phenomenon that is not explained by the cipher system but must have its source in the original plain text.
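
A toy version of this kind of nomenclator, as I read the description above, might look like the following. The numeric codes and the starting value 2134 simply mirror the "2134 2135 2136 2137" illustration. The function name `encode` is made up, and this is not Rene's actual system.

```python
def encode(plain_words, start=2134):
    """Give each new word the next code; reuse the stored code for known words."""
    nomenclator = {}
    next_code = start
    cipher = []
    for word in plain_words:
        if word not in nomenclator:
            nomenclator[word] = str(next_code)   # new entry, adjacent to the last one
            next_code += 1
        cipher.append(nomenclator[word])
    return cipher, nomenclator

plain = "the quick brown fox jumps over the lazy dog dog".split()
cipher, table = encode(plain)
print(" ".join(cipher))
# 2134 2135 2136 2137 2138 2139 2134 2140 2141 2141
```

Runs of new words give near-identical consecutive codes, while the exact repetition at the end only appears because the plain text itself repeats "dog".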

Obviously, the spelling variation that we see in plain text manuscripts (such as S 108) is more similar to what Timm and Schinner discuss.

I was wondering if there is any statistical method that could tell us whether, in the VMS, quasi-reduplication is related to perfect reduplication or not. I.e. are there verifiable patterns that should be present if the two are related and absent if they are not?


RE: Are perfect-reduplication and quasi-reduplication related? - Koen G - 20-02-2020

I'm not sure about how to compare reduplication and quasi-reduplication. Is it possible to quantify the average edit distance between a word and the next one? In that case you could compare it to a shuffled text. This would show whether the large amount of quasi-reduplication is just the result of overall small edit distance between words or not. 

Basically, is the number of word pairs with edit distance = 1 the same in a shuffled text? If this is the case, then there is no mechanism guiding the phenomenon of quasi-reduplication. I know this is beside your actual question, but it might be a first step...
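
A rough sketch of that comparison in Python, assuming the text is already available as a flat list of words. The function names, the number of shuffles and the toy word list are placeholders, not anything from an existing tool.

```python
import random

def levenshtein(a, b):
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def count_distance_one(words):
    """Number of consecutive word pairs at edit distance exactly 1."""
    return sum(levenshtein(a, b) == 1 for a, b in zip(words, words[1:]))

def shuffle_test(words, n_shuffles=200, seed=0):
    """Compare the observed count with counts from shuffled copies of the text."""
    rng = random.Random(seed)
    observed = count_distance_one(words)
    counts = []
    for _ in range(n_shuffles):
        copy = words[:]
        rng.shuffle(copy)
        counts.append(count_distance_one(copy))
    mean_shuffled = sum(counts) / n_shuffles
    # Fraction of shuffles with at least as many distance-1 pairs as the real text
    fraction = sum(c >= observed for c in counts) / n_shuffles
    return observed, mean_shuffled, fraction

words = "qokedy qokey daiin chol shol okaiin".split()   # toy input
observed, mean_shuffled, fraction = shuffle_test(words)
print(observed, mean_shuffled, fraction)
```

If the observed count sits well above what the shuffles produce, word order is contributing something beyond the general similarity of the vocabulary.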


RE: Are perfect-reduplication and quasi-reduplication related? - davidjackson - 20-02-2020

(20-02-2020, 04:21 PM)MarcoP Wrote: I was wondering if there is any statistical method that could tell us whether, in the VMS, quasi-reduplication is related to perfect reduplication or not. I.e. are there verifiable patterns that should be present if the two are related and absent if they are not?

Once the cases are clearly defined it shouldn't be difficult to run a script to extract the information you want. Surely it's just a case of extracting the ratios for each term and comparing to the non-duplicated occurrences in the corpus? But I can't really see how you can compare the two, unless you assume the first word in the quasi-reduplication sequence is a "master" word.

I've used the term evolving epizeuxis for quasi-reduplication terms in the past. I like the concept of the words being repeated yet evolving, which gives them a continuity. But how to define such a phenomenon for statistical purposes? Are we talking about 1-edit or 2-edit? How long should a chain go?

For example: <f112r.P.6;H> chedal.oteedy.okeey.qokeedy.olkeedy.oteey.oram
[Image: jacksonseqf112r-6.jpg]
<f111r.P.3;H> dsheedy.lkeedy.chckhy.lchedy.qokeey.qokear.chal.qokeeas.cheokedy.sal.lokam
This example makes less sense in Eva, but if you start with the second word in this line and read along you can clearly see how the sequence evolves in a binary sequence. Words 1,3,6 form what is essentially an evolving epizeuxis, as do words 2,4,5,7.
[Image: Captura-de-pantalla-completa-06072015-160917.jpg]


And does a single duplication compare with a triple duplication? Or are they separate cases? All parameters you would have to define.


RE: Are perfect-reduplication and quasi-reduplication related? - davidjackson - 20-02-2020

PS - it would be more interesting to link reduplications to scribal hands, and see if they were more common to any particular scribe. If they were, then it would be a hint that they are scribal errors - dittographs.


RE: Are perfect-reduplication and quasi-reduplication related? - -JKP- - 20-02-2020

Maybe scribal errors, but maybe the scribe using a different system to generate the text, or working on a section with different content.


RE: Are perfect-reduplication and quasi-reduplication related? - MarcoP - 21-02-2020

(20-02-2020, 08:47 PM)davidjackson Wrote: PS - it would be more interesting to link reduplications to scribal hands, and see if they were more common to any particular scribe. If they were, then it would be a hint that they are scribal errors - dittographs.

Here are the counts and % of reduplication and quasi-reduplication in different sections - ZL transliteration, no dubious spaces, only considering paragraph (P) lines. I don't know how close section boundaries are to hand boundaries, but these numbers suggest that both phenomena are widespread everywhere. The rows are sorted by increasing Reduplication %.

Section    N.Red  %Red   N.Quasi  %Quasi
_Herbal_B     20  0.612       63  1.928
_AstroCZ      11  0.626       24  1.367
_Pharma       16  0.738       44  2.03
_StarsQ20     75  0.743      223  2.21
_Herbal_A     68  0.907      180  2.401
_BioQ13       66  1.04       163  2.568


I counted more than 250 instances of exact reduplication (here a few text pages with no illustrations are not assigned to any section, so the total on the whole ms will be slightly higher).
For quasi reduplication, I counted EVA edit distance = 1, with both words at least 4 chars long. I found more than 600 occurrences.
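
For anyone who wants to reproduce counts of this kind, here is a minimal sketch of the rule in Python. It assumes words within a line are separated by "." (as in daiin.daiin), that pairs are only counted within a line, and that no minimum length applies to exact reduplication. Those are my assumptions for the sketch, and the function names are made up.

```python
def one_edit_apart(a, b):
    """True if a and b differ by exactly one insertion, deletion or substitution."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                       # make a the shorter (or equal-length) word
    i = 0
    while i < len(a) and a[i] == b[i]:    # skip the common prefix
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]     # single substitution at position i
    return a[i:] == b[i + 1:]             # single extra character in b at position i

def count_reduplication(lines, min_len=4):
    """Return (pairs, exact, quasi) over consecutive words within each line."""
    pairs = exact = quasi = 0
    for line in lines:
        words = line.split(".")
        for a, b in zip(words, words[1:]):
            pairs += 1
            if a == b:
                exact += 1
            elif len(a) >= min_len and len(b) >= min_len and one_edit_apart(a, b):
                quasi += 1
    return pairs, exact, quasi

sample = ["daiin.daiin.qokedy.qokey.chol", "shol.shol.chedy"]   # toy lines
pairs, exact, quasi = count_reduplication(sample)
print(pairs, exact, quasi)                 # 6 2 1
print(100 * exact / pairs, 100 * quasi / pairs)
```

The percentages are the counts divided by the number of consecutive word pairs in the sample.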

In my opinion, these numbers rule out the possibility that these are all errors (some might still be, of course). I believe that, in medieval manuscripts, dittographs only appear once in several tens of pages and only in some manuscripts (most manuscripts have no "exact-word-repetition" dittographs, though they may have other, more common types of errors). If someone could show a long manuscript with an average of more than one exact-word dittograph per page (like perfect reduplication in the VMS), I would be very grateful. The idea that all these are errors is so reassuring that it would be nice to be able to consider it. But I am afraid the evidence is against this.

I found that the two % columns have a high correlation: 82% [EDIT: 0.85, thanks David!]. Could this be an element in favour of the two phenomena being related?
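
The corrected figure can be checked directly from the six rows of the table. This is a plain Pearson correlation written out by hand. The variable names are mine.

```python
from math import sqrt

# %Red and %Quasi columns, in the row order of the table
pct_red = [0.612, 0.626, 0.738, 0.743, 0.907, 1.04]
pct_quasi = [1.928, 1.367, 2.03, 2.21, 2.401, 2.568]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

r = pearson(pct_red, pct_quasi)
print(round(r, 2))   # 0.85
```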


RE: Are perfect-reduplication and quasi-reduplication related? - -JKP- - 21-02-2020

I've been working on a blog about repetition for a couple of years, but I am so horribly behind (50 partly-written blogs at last count), that I am finally acknowledging that I simply can't find time to finish them all, so... maybe this info can do some good here...

Examples of various forms of repetition in manuscripts. This is a folder dump: I was organizing these according to categories for the blog, but I can't spare the time to explain each one right now (my apologies).

These are from Corpus Pelagianum, which includes numerous forms of repetition (both adjacent and nonadjacent), both for words and Roman numerals:

These are from Add 11695 and Add 11390:

CLM 13002 (repetition is very common in manuscripts with sermons and prayers):

Examples in Hebrew manuscripts (Harley 5710):

And Armenian (Walters W.546):

Strasbourg 2.929:

Repetitive Phrases (ÖNB 3069 and Cotton Tiberius E iv and Valencia 58 and Czech DA III 2 and Cod. Sang 754 and BNF Latin 13025):

Slight variations and repetition of common words (Domitian A ix and Dresden M 163):

Repetition of words and names:

Repetitive word endings (Vatican Lat 1300):

Repetitive letterforms (Cotton Tiberius E iv and San Marco 212):

[attachment=4023]

Repetitive abbreviations (Additional? 18359):

Repetitive pen tests (Sarajevo Haggadah):


I have more, but they tend to be in the same general categories.

It's 6:46 am and I've been up all night working. I know I'll never find enough time to follow this up, so hopefully, Marco, there is something in these examples that might be helpful to you.


RE: Are perfect-reduplication and quasi-reduplication related? - davidjackson - 21-02-2020

Thanks Marco! I agree that we can forget about these being simply errors - 1 in every 100 words seems too high.
There's something I don't understand here. The columns are linked. Take the ratio N.Quasi/N.Red.
Then divide %Quasi by that ratio, and you get %Red.

Why is that?


RE: Are perfect-reduplication and quasi-reduplication related? - Koen G - 21-02-2020

JKP: those are too many examples to process, no wonder you can't finish your posts ;)

Marco: I'm a bit worried though that high reduplication is an automatic effect of low TTR, see Q13 for example.
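
One way to put a number on this worry is to compute how much exact reduplication a random ordering of the same words would produce. This is a sketch under a simple shuffle model, with made-up function names and toy word lists: if a word type occurs n_i times among N tokens, the chance that two adjacent positions hold the same word in a random permutation is sum of n_i*(n_i-1) divided by N*(N-1).

```python
from collections import Counter

def expected_reduplication_rate(words):
    """Probability that two adjacent tokens are identical in a random permutation."""
    n = len(words)
    counts = Counter(words)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def type_token_ratio(words):
    """Number of distinct words divided by number of tokens."""
    return len(set(words)) / len(words)

# Toy samples: same length, very different vocabulary sizes
low_ttr = ["qokedy"] * 5 + ["shedy"] * 5
high_ttr = ["daiin", "chol", "shol", "okaiin", "chor",
            "cthy", "otol", "dain", "sho", "daiin"]

for sample in (low_ttr, high_ttr):
    print(type_token_ratio(sample), expected_reduplication_rate(sample))
```

Comparing this expected rate with the observed %Red for each section would show how much of the reduplication is left over once vocabulary size is accounted for.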


RE: Are perfect-reduplication and quasi-reduplication related? - MarcoP - 21-02-2020

David,
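I am on my phone at the moment, but are you asking why N.Red/N.Quasi = %Red/%Quasi?
Each % is N divided by the number of couples (consecutive word pairs) in the sample. Both columns share that denominator, so it cancels in the ratio and the two are related.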
If you have more doubts, please explain and I will reply tomorrow.
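
A quick check of this with the table values, as a small Python sketch (the variable names are mine): dividing each count by its percentage should give back the same number of word pairs for both columns, up to rounding of the published percentages.

```python
# (section, N.Red, %Red, N.Quasi, %Quasi) as given in the table
rows = [
    ("_Herbal_B", 20, 0.612, 63, 1.928),
    ("_AstroCZ", 11, 0.626, 24, 1.367),
    ("_Pharma", 16, 0.738, 44, 2.03),
    ("_StarsQ20", 75, 0.743, 223, 2.21),
    ("_Herbal_A", 68, 0.907, 180, 2.401),
    ("_BioQ13", 66, 1.04, 163, 2.568),
]

implied = []
for name, n_red, pct_red, n_quasi, pct_quasi in rows:
    pairs_from_red = n_red / (pct_red / 100)        # N = % * pairs / 100
    pairs_from_quasi = n_quasi / (pct_quasi / 100)
    implied.append((name, pairs_from_red, pairs_from_quasi))
    print(name, round(pairs_from_red), round(pairs_from_quasi))
```

Since both percentages in a row use the same implied number of pairs, N.Red/N.Quasi and %Red/%Quasi are necessarily equal.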