23-10-2019, 05:53 PM
(23-10-2019, 05:29 AM)ReneZ Wrote: One of my main problems with the method is that it is not very well formulated, and this does not allow testing / verifying it. Just an example: If a recipe for a cake gives a list of ingredients (eggs, flour, etc) and then says: "put all ingredients together and apply heat", then the recipe is too unspecific.
The main problem with formulating an answer to your concerns is that doing so would first require knowing what those concerns actually are.
Just one example: in your last paper you write about asking the right question: "The standard question about the Voynich MS is: 'What does it say?' This question may not have an answer. It may not say anything. ... The right question should be: 'How was it done?', because this question definitely has an answer. It was most certainly ‘done’ one way or another, also if the text is meaningless." But when it comes to research asking "How was it done?", you only write: "Examples of people who are doing very different things are Rugg (2004) and Timm and Schinner (2019)".
(23-10-2019, 05:29 AM)ReneZ Wrote: What I tried to get from re-reading is how specific the 'prescription' of the auto-copying is. In the most recent paper, there are suggestions about the type of changes applied, but so far I could not get anything on 'how far back' the author/scribe was looking. I have not looked into the code of the application, but there must clearly be some assumptions in there.
On page 10 of Timm & Schinner 2019 we write:
"As for the actual selection process of source words, it is clear from the results of Section 2 (as well as simply suggested by the scribe’s convenience) that they are to be chosen at least from the same page. Because it is handy to copy a word from the same position some lines above (see Timm 2014, p. 18), our implementation of the algorithm includes a mechanism that selects (with a given probability) even tokens from the previous line at the same writing position." (Timm & Schinner 2019, p. 10).
(23-10-2019, 05:29 AM)ReneZ Wrote: What also seems to be missing is the initialisation. How was it started? This may seem a trivial detail, but again there must be something in the code for that.
On page 10 of Timm & Schinner 2019 we write:
"When starting to write on an empty page, it is necessary to choose some initial words from another one, in order to initialize the algorithm.
Footnote: There was a similar problem for the author of the VMS every time he/she was starting a new (empty) page. In such a case it was probably useful to use another page as source. There is some evidence that the scribe preferred the last completed sheet for this purpose (see Timm 2014, p. 16)." (Timm & Schinner 2019, p. 10).
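As a hedged illustration of this initialization step: the preference for the last completed sheet is taken from the quoted footnote, while the number of seed words and the data layout below are assumptions of mine.

```python
import random

def initial_seed_words(previous_pages, n_seeds=3):
    """Choose a few seed tokens to start writing on an empty page (sketch only).

    previous_pages: completed pages in writing order, each a list of tokens;
                    the last entry is the most recently completed sheet
    n_seeds:        assumed number of seed words needed to start the copying
    """
    if not previous_pages:
        return []                      # the very first page has no source yet
    source = previous_pages[-1]        # prefer the last completed sheet
    return random.sample(source, min(n_seeds, len(source)))
```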
(23-10-2019, 05:29 AM)ReneZ Wrote: It is not written down specifically (and it is again something that I wanted to double-check) but it seems to be implied that every word in the MS (after the initialisation) is the result of auto-copying. That is, there are no words that are 'new seeds' or incidental re-initialisations.
(I have asked earlier in this thread about this, but I think that this question was understood in a different way).
In any case, these points make clear that the initialisation procedure is too important to just leave unmentioned.
I have already answered your question earlier in this thread.
It seems that you start from the idea that it is possible to distinguish between a creative initialization phase and a static copying phase, and that there is therefore something like a fixed starting set of seed words and copying rules. That is how a computer program would work, but it is a misunderstanding of the way a human mind would execute such a method.
First, once a word has been written down, it becomes a potential source for generating new words. In this way every word token has some impact on the text generation method, or, in your words, every written word also re-initializes the method a little.
Secondly, keep in mind that "the VMS was created by a human writer who had complete freedom to vary some details of the generating algorithm on the spur of a moment" (Timm & Schinner 2019, p. 19). It was always possible for the scribe to add new ideas to the text generation process: for instance, he added new glyph shapes or tried new ways of generating a word. A complete set of modification rules would require reconstructing every thought the scribe had some 500 years ago.
Just some examples:
- The 'x'-glyph only occurs on certain pages (see p. 13).
- Some ways of modifying words are used frequently, others change over time, and some are used only a limited number of times. See for instance the word <chey>: in Currier A it occurs mainly beside words like <shey> or <chy>, and in Currier B beside words like <chedy>, <shey> and <shedy>.
- Or see the repetition patterns for 'okeoke' and 'oteote'. There are only four pages within the Astronomical and Zodiac sections using such glyph sequences (see p. 9):
<f70v2.R3.1> okey ... oteotey ... oteoteotsho
<f71r.R1.1> oky ... okeoky oteody ... okeokeokeody
<f72v2.R2.1> otey ... oteotey oteoldy
(23-10-2019, 05:29 AM)ReneZ Wrote: Then, if this assumption (no new seeds) is true, one could verify the auto-copying hypothesis by checking for each word in the MS if there is a recent (how far back?) similar word (which max. edit distance?) from which it could be derived.
It would also be necessary to consider modification rule three (see Timm & Schinner 2019, p. 10). Rule three is about combining two source words. If the scribe used <ol> and <chedy> to generate a word like <olchedy>, it simply doesn't make much sense to calculate the edit distance between <ol> and <olchedy> (see also the answer given above).
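To illustrate the point with a toy example (a sketch of mine, not code from the paper; EVA characters are treated here as single glyphs, which is a simplification):

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two glyph strings."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1,                           # deletion
                          d[i][j - 1] + 1,                           # insertion
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return d[m][n]

def explained_by_concatenation(word, lexicon):
    """Modification rule three, roughly: can 'word' be built from two source words?"""
    return any(word[:k] in lexicon and word[k:] in lexicon
               for k in range(1, len(word)))

lexicon = {"ol", "chedy"}
print(edit_distance("ol", "olchedy"))                  # 5 -- looks unrelated by edit distance
print(explained_by_concatenation("olchedy", lexicon))  # True -- trivially <ol> + <chedy>
```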
(23-10-2019, 05:29 AM)ReneZ Wrote: This seems to be the most basic test of the method, but I remember no evidence that this was even attempted. Again, something I would still need to check in the earlier papers. They are very long....
Please see chapter 3, "Evidence", in Timm 2014. In this chapter I describe this type of test for two control samples: "In this paper all words occurring seven times and all words occurring eight times are used as two separate control samples. ..." (Timm 2014, p. 12ff).
In Timm & Schinner 2019 we also describe a test for the whole VMS. For this test we check whether it is possible to describe the text as a network of similar words. The outcome of this test was:
"How does this situation change when we look at the entire VMS? Figure 2 shows the resulting network, connecting 6,796 out of 8,026 words (= 84.67%). Again, an edge indicates that two words differ by just one glyph. The longest path within this network has a length of 21 steps, substantiating its surprisingly high connectivity" (Timm & Schinner 2019, p. 5).
When it comes to 'isolated' words within the network, the result is as follows:
"The respective frequency counts confirm the general principle: high-frequency tokens also tend to have high numbers of similar words. This is illustrated in greater detail in Figure 3: 'isolated' words (i.e., unconnected nodes in the graph) usually appear just once in the entire VMS, while the most frequent token <daiin> (836 occurrences) has 36 counterparts with edit distance 1. Note that most of these 'isolated' words can be seen as concatenations of more frequent words (e.g. <polcheolkain> = <pol> + <cheol> + <kain>)" (Timm & Schinner 2019, p. 5).
For the whole manuscript only 229 word types exist (229 out of 8,027 types = 2.85 %) which differ by more than two glyphs from all other word types occurring in the VMS. Two typical words of this kind are <okeokeokeody> and <okeeolkcheey>. All 229 types occur only once, and it is possible to split them into two or more words that also occur in the VMS. For instance, the word <okeokeokeody> can be split into three words, <okeo keo keody>, and <okeeolkcheey> into two, <okeeol kcheey>. In other words, I was unable to find a single word that can't be explained by the 'self-citation' hypothesis.
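This splitting test can be phrased as a simple word-break check; the following is a sketch of mine (the small lexicon contains only the example splits mentioned above, not a real VMS word list):

```python
from functools import lru_cache

def splits_into_known_words(word, lexicon):
    """True if 'word' can be segmented into two or more words from 'lexicon'."""
    @lru_cache(maxsize=None)
    def can_segment(s):
        # True if s is itself a known word or a concatenation of known words
        if s in lexicon:
            return True
        return any(s[:k] in lexicon and can_segment(s[k:])
                   for k in range(1, len(s)))
    # require at least one split, so a word that is only 'known' as a whole fails
    return any(word[:k] in lexicon and can_segment(word[k:])
               for k in range(1, len(word)))

lexicon = frozenset({"okeo", "keo", "keody", "okeeol", "kcheey"})
print(splits_into_known_words("okeokeokeody", lexicon))  # True: <okeo> + <keo> + <keody>
print(splits_into_known_words("okeeolkcheey", lexicon))  # True: <okeeol> + <kcheey>
```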