30-10-2019, 08:33 AM
(23-10-2019, 05:29 AM)ReneZ Wrote: It is not written down specifically (and it is again something that I wanted to double-check) but it seems to be implied that every word in the MS (after the initialisation) is the result of auto-copying. That is, there are no words that are 'new seeds' or incidental re-initialisations.
(I have asked earlier in this thread about this, but I think that this question was understood in a different way).
In any case, these points clarify that the initialisation procedure is too important just not to mention.
Then, if this assumption (no new seeds) is true, one could verify the auto-copying hypothesis by checking for each word in the MS if there is a recent (how far back?) similar word (which max. edit distance?) from which it could be derived.
This seems to be the most basic test of the method
Hi Rene,
I have been trying to do something along the lines you suggest, but I am not sure I could produce anything helpful or new. As always, I might have made mistakes in the process.
[attachment=3595]
This histogram measures, for each word, the minimum Levenshtein distance from the preceding words in the same page. Only the first 200 words of each page are considered. A distance of 0 means that a word-type is repeated, so that value is complementary to TTR. As we already know from Koen's experiments, Timm's text behaves similarly to Q13.
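For anyone who wants to reproduce this kind of count, here is a minimal Python sketch of the measure described above. It is not the script used for the attached histogram: the function names, the `max_words` parameter and the choice to skip the first word of each page (it has no predecessor) are assumptions made for illustration.

```python
from collections import Counter

def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]

def min_distance_histogram(pages, max_words=200):
    """Count, over all pages, the minimum distance of each word
    from the words that precede it on the same page.

    `pages` is a list of pages, each page a list of word strings.
    The first word of a page is skipped (assumption: nothing to compare it to).
    """
    hist = Counter()
    for page in pages:
        words = page[:max_words]
        for i in range(1, len(words)):
            hist[min(levenshtein(words[i], w) for w in words[:i])] += 1
    return hist
```

Dividing each count by the total gives the relative frequencies plotted in a histogram of this kind.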
I have included comparisons for Latin (Pliny) and Italian (Machiavelli). Here I have generated "pseudo-pages" by splitting the text at a fixed length. Again, as observed by Koen, Voynichese and Timm's generated text are close to Italian, with respect to TTR / distance=0.
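The pseudo-page step can be sketched as follows. The page length is not stated above, so `page_length=200` is only an assumed default (chosen to match the 200-word window), and keeping the short final page is likewise an assumption.

```python
def pseudo_pages(words, page_length=200):
    """Split a flat list of words into consecutive fixed-length pseudo-pages.

    The last page may be shorter than `page_length`; drop it if that
    would distort the comparison.
    """
    if page_length <= 0:
        raise ValueError("page_length must be positive")
    return [words[i:i + page_length] for i in range(0, len(words), page_length)]
```

For a plain-text file, `pseudo_pages(text.split())` would be the simplest way to call it; any punctuation handling is left to the reader.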
Values for distance=1 depend on the phenomenon that Timm has analysed with his networks of words: Voynichese has values close to those for distance=0, while Latin and Italian drop to about half their distance=0 values.
I am facing two problems here:
1. definition of a meaningful quantitative measure;
2. definition of an acceptance threshold on that measure.
For instance, in the VMS, 87% of words have a distance from a previous word in the same page that is smaller than 3. Is this value enough to confirm Timm's theory? It certainly is considerably higher than in ordinary written languages and autocopying could explain this difference.
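A figure of this kind is just the cumulative share of the histogram below a cut-off. A minimal sketch, assuming the histogram is available as a mapping from distance to word count; the function name and the toy counts in the usage line are made up for illustration and are not Voynich data.

```python
def share_below(hist, threshold=3):
    """Fraction of words whose minimum distance is strictly below `threshold`.

    `hist` maps distance -> number of words with that minimum distance.
    """
    total = sum(hist.values())
    if total == 0:
        raise ValueError("empty histogram")
    return sum(n for d, n in hist.items() if d < threshold) / total

# Toy counts only: 8 of 10 words lie at distance 0, 1 or 2.
toy = {0: 4, 1: 3, 2: 1, 3: 1, 4: 1}
print(share_below(toy))
```

Computing the same share for several thresholds and several texts would at least show how sensitive the comparison is to the choice of cut-off.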
One could focus on the remaining 13% of words and see if there is something that cannot be accounted for by Timm's theory, but I am not sure how this could be done. Autocopying allows the combination of previous words to generate new words. Could shapchedyfeey result from sho.pcheey.pchey in f8r? Maybe, with a sufficient effort, one could define a quantitative measure that tells us how likely this is, but at the moment I cannot think of anything that would add much to what we already know.
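One very crude way to put a number on such a case is the edit distance between the target word and the concatenation of a run of consecutive preceding words. The sketch below rests entirely on that assumption: "combination" is reduced to joining up to `max_parts` adjacent words, which is much narrower than what autocopying permits, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Unit-cost edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]

def best_concatenation(target, preceding, max_parts=3):
    """Return (distance, parts) for the run of 1..max_parts consecutive
    preceding words whose concatenation is closest to `target`.

    Returns None when `preceding` is empty. Ties keep the earliest run.
    """
    best = None
    for start in range(len(preceding)):
        for n in range(1, max_parts + 1):
            parts = tuple(preceding[start:start + n])
            if len(parts) < n:  # ran past the end of the page
                break
            d = levenshtein(target, "".join(parts))
            if best is None or d < best[0]:
                best = (d, parts)
    return best
```

Calling `best_concatenation("shapchedyfeey", ["sho", "pcheey", "pchey"])` would give the distance for the example above. The result would still need to be compared against what the same procedure yields on ordinary texts before it says anything about likelihood.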
Timm and Schinner wrote that "the scribe had complete freedom to implement random personal aesthetic preferences, spontaneous impulses, or even idiosyncrasies". How can we exclude that shapchedyfeey results from a spontaneous impulse? I guess this sentence means that some deviation of the generated text from actual Voynichese must be regarded as acceptable. How do we set an acceptance threshold?