The Voynich Ninja

Full Version: A possible generating algorithm of the Voynich manuscript
(03-06-2019, 05:32 PM)Torsten Wrote:
(03-06-2019, 05:55 AM)ReneZ Wrote: The possibility that they are a side effect of some other process is not really addressed ('unintentional'). I am not aware of any evidence that speaks in favour of the intentional option, over the unintentional option.

You are not only moving the goalposts (see [link]), you also ask for an impossible proof. The VMS is characterized by very specific patterns and statistics. Keep in mind that a method must explain the deep correlation between frequency, similarity, and spatial vicinity of tokens within the VMS text. It is therefore reasonable to assume that only one method results in these specific patterns and statistics.

Not at all.
What is new in your theory is that the vertical patterns and the presence of similar words near each other are caused by an intentional effort of the author(s) to create vertical patterns and similar words by auto-copying. That it is not the result (side-effect) of some other process. Within 'some other process' I also include the generation of a meaningful text.
To say that it is this intentional auto-copying and not something else requires some sort of evidence. This is what is missing, and I understand that you don't think it is possible to provide it.

To assume that only one method can result in the patterns observed is certainly not acceptable.

By the way, for me 'deep correlation' is way too strong.

(03-06-2019, 05:32 PM)Torsten Wrote:
(03-06-2019, 05:55 AM)ReneZ Wrote: The network of words with their edit distances only shows the existing words.
One has to imagine that inside this network there is a much denser network of other words, also with edit distance one (1) to the existing words, that do not occur in the text. 

Effectively, the network of existing words consists of very specific 'paths' through this denser network

This is an argument in favor of the self-citation method. Only with a systematic approach would it be possible to use every conceivable way to modify the tokens.

I don't agree at all.
If the method was arbitrary, lots and lots of other words with an edit distance of 1 or 2 from the existing ones would have been generated.
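
A minimal sketch of this point, assuming each EVA transliteration character counts as one unit (so <ch> and <sh> are two characters each, which a glyph-level analysis would treat differently). The vocabulary is only the handful of words mentioned in this thread, not the word list of any page, and `neighbours_ed1` is a hypothetical helper name.

```python
def neighbours_ed1(word, alphabet):
    """Return every string at edit distance exactly 1 from `word`."""
    out = set()
    for i in range(len(word) + 1):
        for ch in alphabet:
            out.add(word[:i] + ch + word[i:])          # insertion
    for i in range(len(word)):
        out.add(word[:i] + word[i + 1:])               # deletion
        for ch in alphabet:
            out.add(word[:i] + ch + word[i + 1:])      # substitution
    out.discard(word)                                  # substituting a char by itself
    return out


# Illustrative vocabulary only: words that appear in this thread.
attested = {"chol", "cheol", "sheol", "cheor", "shor", "shar"}
alphabet = sorted(set("".join(attested)))

candidates = neighbours_ed1("chol", alphabet)
print(len(candidates), "possible words at edit distance 1 from 'chol'")
print("of which attested:", sorted(candidates & attested))
```

Even with this tiny alphabet, the set of possible neighbours is far larger than the set of attested ones, which is the "denser network" referred to above.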
(02-06-2019, 05:03 PM)nablator Wrote:
(02-06-2019, 04:32 PM)Torsten Wrote: The core network for page [link] contains 172 out of 267 words (66.9 %). They represent 305 tokens out of 394 tokens (= 77.4 %).
394 word tokens, yes, because two unclear words are removed. But surely there are 257 word types, not 267?
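
A quick arithmetic check, using only the figures quoted above (`share` is a hypothetical helper):

```python
def share(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


core_types, core_tokens, total_tokens = 172, 305, 394

print(share(core_types, 267))             # 64.4
print(share(core_types, 257))             # 66.9
print(share(core_tokens, total_tokens))   # 77.4
```

The quoted 66.9 % matches 172 out of 257, not 172 out of 267.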
[chdy] is connected to [ody].
[attachment=2993]

[shkar] is not connected to [okar].
[attachment=2994]

Huh
(03-06-2019, 07:20 PM)ReneZ Wrote: Not at all. What is new in your theory is that the vertical patterns and the presence of similar words near each other are caused by an intentional effort of the author(s) to create vertical patterns and similar words by auto-copying.

This is what I actually said: see chapter 6 "Evidence" in [link], p. 12f. In this chapter I provide evidence that similar words depend on each other:
"A feature of the VMS is that similarly spelled glyph groups are used together on the same pages near to each other. This means, the reason why similarly written words have similar frequencies is that they appear together on the same pages. In other words, the scribe was writing similarly spelled glyph groups near to each other because they depend in some way on each other" ([link], p. 14).

I also argue that "The most plausible explanation for the text generation method described in this paper is that the glyphs have no meaning. To use a glyph group already written as a source for generating another group is only efficient if it is possible to select any group" ([link], p. 40).

There is nothing said about an intentional effort of the author(s) to create vertical patterns. This is just a misrepresentation of what I said.

(03-06-2019, 07:20 PM)ReneZ Wrote: That it is not the result (side-effect) of some other process. Within 'some other process' I also include the generation of a meaningful text.

To say that it is this intentional auto-copying and not something else requires some sort of evidence. This is what is missing, and I understand that you don't think it is possible to provide it.

To assume that only one method can result in the patterns observed is certainly not acceptable.

Sorry, but the title of the paper is "[link]".

It's on you to provide evidence for your claim that there is another method: "Although it may be possible to prove non-existence in special situations, such as showing that a container does not contain certain items, one cannot prove universal or absolute non-existence. The proof of existence must come from those who make the claims." (see [link])

(03-06-2019, 07:20 PM)ReneZ Wrote: I don't agree at all.
If the method was arbitrary, lots and lots of other words with an edit distance of 1 or 2 from the existing ones would have been generated.

Again, nobody argues that only words with ED<=1 were generated. See rule one for modifying tokens: "Replace one or more glyphs by similar ones." ... "<chol> could be the origin of words like ... <shor>, <shar>, .... it is also possible to add an additional <e>-glyph, leading to words like <cheol> and <sheol>" (see [link], p. 9).

Anyway, what happens if lots of words with an ED >1 from the existing ones are generated? After some iterations a dense network with ED=1 is generated. If, for instance, <chol> is modified into <sheol>, and the next two steps are to modify <sheol> into <cheor> and <cheol>, the result is a network of similar tokens with ED=1: <chol>, <cheol>, <sheol>, and <cheor>. This is exactly what happens in the VMS.
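
The four-word example can be checked with a small sketch. It assumes plain Levenshtein distance over EVA transliteration characters (one character per unit), which is a simplification and not necessarily the distance measure used for the published network graphs.

```python
from itertools import combinations


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


# Words in the order they are generated in the example above.
words = ["chol", "sheol", "cheor", "cheol"]

# Keep only the pairs that differ by exactly one edit.
edges = sorted((a, b) for a, b in combinations(sorted(words), 2)
               if levenshtein(a, b) == 1)
for a, b in edges:
    print(f"<{a}> -- <{b}>")
```

The first two generation steps each have ED=2, yet once <cheol> is present every word is linked to it by an ED=1 edge, so the four words form one connected ED=1 network.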
(03-06-2019, 09:11 PM)nablator Wrote: [chdy] is connected to [ody].

<chdy> is connected to <dy> and <chody> is connected to <ody>. What you interpret as a connection between <chdy> and <ody> is in fact the line between <chody> and <ody>. But it is indeed necessary to look into the Gephi project to see that this is the case.
(03-06-2019, 09:22 PM)Torsten Wrote: ...
This is what I actually said: see chapter 6 "Evidence" in [link], p. 12f. In this chapter I provide evidence that similar words depend on each other:
"A feature of the VMS is that similarly spelled glyph groups are used together on the same pages near to each other. This means, the reason why similarly written words have similar frequencies is that they appear together on the same pages. In other words, the scribe was writing similarly spelled glyph groups near to each other because they depend in some way on each other" ([link], p. 14).
...

This is a property of Voynichese that I have observed, so I am not going to argue that this phenomenon happens. It does.


But immediately after this statement, your paper moves to the "text generation method" section and in this section you describe "the changes for similar glyph groups occurring near to each other".

If I were writing about the same phenomenon, I probably would have described them in terms of "differences" and "similarities". The term "changes" (and the words "removed" and "added" that you use under the illustration) describe an active conscious copy-and-modify process.

Such an active process might, in fact, account for some of the properties of the VMS (I haven't decided yet), but I also think there are other dynamics that might account for these kinds of patterns in the text. Unfortunately, it's not something I can explain in a few words, just as your paper cannot easily be summarized in a few words.
(03-06-2019, 09:36 PM)-JKP- Wrote: This is a property of Voynichese that I have observed, so I am not going to argue that this phenomenon happens. It does.

But immediately after this statement, your paper moves to the "text generation method" section and in this section you describe "the changes for similar glyph groups occurring near to each other".

If I were writing about the same phenomenon, I probably would have described them in terms of "differences" and "similarities". The term "changes" (and the words "removed" and "added" that you use under the illustration) describe an active conscious copy-and-modify process.

Indeed, I provided the self-citation hypothesis in 2014: "The statistical features of the text can be explained by the hypothesis that the author of the VMS was using the described self-referencing system to generate the text." ([link], p. 17). I find it important to mention that the hypothesis is based on the results of an analysis of the text. One result is that "the closer two words are (with respect to their edit distance), the more likely these words also can be found written in close vicinity" and the other result is a "characteristic dependence of token frequency from word similarity" ([link]). Anyway, if René was just providing a very short presentation of the self-citation hypothesis this is indeed not a problem in my eyes.
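
One way to probe the first result (similar words written in close vicinity) is to compare the mean edit distance of token pairs at different gaps in the running text. The sketch below is only an illustration under stated assumptions: it uses plain Levenshtein distance on EVA characters, `mean_distance_by_gap` is a hypothetical helper, the token sequence is a made-up placeholder built from words mentioned in this thread, and it is not the analysis carried out in the paper.

```python
from collections import defaultdict


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,
                           cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def mean_distance_by_gap(tokens, max_gap):
    """Mean edit distance of token pairs that are 1..max_gap positions apart."""
    sums, counts = defaultdict(int), defaultdict(int)
    for i in range(len(tokens)):
        for gap in range(1, max_gap + 1):
            if i + gap < len(tokens):
                sums[gap] += levenshtein(tokens[i], tokens[i + gap])
                counts[gap] += 1
    return {gap: sums[gap] / counts[gap] for gap in sorted(counts)}


# Placeholder sequence, NOT a transcription of any page.
tokens = ["chol", "cheol", "sheol", "cheor", "okar", "shkar", "chdy", "dy"]
for gap, mean in mean_distance_by_gap(tokens, 4).items():
    print(gap, round(mean, 2))
```

On a real transliteration, a mean distance that rises with the gap would be consistent with the quoted result; it would not by itself distinguish between the explanations discussed in this thread.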

(03-06-2019, 09:36 PM)-JKP- Wrote: Such an active process might, in fact, account for some of the properties of the VMS (I haven't decided yet), but I also think there are other dynamics that might account for these kinds of patterns in the text. Unfortunately, it's not something I can explain in a few words, just as your paper cannot easily be summarized in a few words.

I also searched for other dynamics. I didn't find anything that can't be explained by the self-citation hypothesis.
(03-06-2019, 07:20 PM)ReneZ Wrote:
(03-06-2019, 05:32 PM)Torsten Wrote:
(03-06-2019, 05:55 AM)ReneZ Wrote: To assume that only one method can result in the patterns observed is certainly not acceptable.
[...]

Effectively, the network of existing words consists of very specific 'paths' through this denser network

This is an argument in favor of the self-citation method. Only with a systematic approach would it be possible to use every conceivable way to modify the tokens.

I don't agree at all.
If the method was arbitrary, lots and lots of other words with an edit distance of 1 or 2 from the existing ones would have been generated.

Randomness does not have to be uniform, it can be subject to rules, constraints and preferences. The new paper acknowledges the existence of something like the Curve-Line system and any number of preferences for esthetic reasons and whatever the scribe's inspiration was which are impossible to model accurately. This is vague enough to evade falsification.

(Moderators: please move the discussion to the other thread if it makes more sense to keep the news thread separate.)
(04-06-2019, 04:24 PM)nablator Wrote: Randomness does not have to be uniform, it can be subject to rules, constraints and preferences. The new paper acknowledges the existence of something like the Curve-Line system and any number of preferences for esthetic reasons and whatever the scribe's inspiration was which are impossible to model accurately. This is vague enough to evade falsification.

I did not want to use the word 'random', which has a specific meaning in probability theory that is not equivalent to 'arbitrary'. In day-to-day use, random and arbitrary tend to be considered similar.

The theory is about the appearance of new words slightly modified from previous words. These modifications are not at all arbitrary, but follow specific (unknown) rules.
The most conspicuous features of the Voynich MS text, namely the existence of word patterns and the very limited ways in which characters combine, are not explained by the auto-copying process, but must be the result of additional constraints or rules, which are not identified.

As it stands, these word patterns and limited character combinations 'just happened'. In that sense, the theory is inadequate, even if it explains some other features.

But there are more known features of the text that are not explained by the auto-copying process:
- the fact that Eva-f and Eva-p mainly (but not exclusively) appear on top lines of paragraphs.
- that Eva-m and Eva-g occur only at word ends, and the former mainly at line ends
- that paragraph-initial characters are from a small subset only
- that line-initial words are different and follow some rules that are not yet understood

Of course, all of these can be accommodated by additional rules and constraints.
However, we then end up in a situation where there is not much arbitrariness left, and it is hard to maintain that all of these constraints cannot be the result of 'meaning'.

Furthermore, the fact that similar words tend to appear near each other could be explained by other effects, for example:
- building up the 'code' vocabulary as the text is produced
- Friedman's suggestion of some kind of a synthetic language
- every word stands only for one character, so the words are meaningless but the text as a whole is not

Now we have four options already, and I don't actually consider all of them likely. The first bullet above is the first thing that came to my mind when I read Torsten's first paper.

It should be noted that the existence of word patterns is a feature of the vocabulary, i.e. the corpus of valid words. This says nothing about the meaning of the text.
The meaning (or lack thereof) is related to the order in which the words have been picked from this corpus.
(05-06-2019, 04:56 AM)ReneZ Wrote: As it stands, these word patterns and limited character combinations 'just happened'. In that sense, the theory is inadequate, even if it explains some other features.
Well, if there were enough constraints, the VMS could have "just happened" and we are left with the impossible task of explaining arbitrary choices. What didn't happen didn't happen, there does not have to be a reason for everything.

Quote:But there are more known features of the text that are not explained by the auto-copying process:
- the fact that Eva-f and Eva-p mainly (but not exclusively) appear on top lines of paragraphs.
- that Eva-m and Eva-g occur only at word ends, and the former mainly at line ends
- that paragraph-initial characters are from a small subset only
- that line-initial words are different and follow some rules that are not yet understood
Yes, logically, the existence of these rules and constraints (and there are more: the glyph correlations across word breaks as described in the recent paper by Emma May Smith & Marco Ponzi for example) make little sense in the hypothesis of an attempt to build a credible-looking but meaningless text and are unnecessarily sophisticated, especially those that are detectable only by frequency analysis.

Quote:Furthermore, the fact that similar words tend to appear near each other could be explained by other effects, for example:
- building up the 'code' vocabulary as the text is produced
- Friedman's suggestion of some kind of a synthetic language
- every word stands only for one character, so the words are meaningless but the text as a whole is not
My cipher allows 1) and 3), though not limited to one character per word: there could be zero, or more characters than glyphs (compression).
(05-06-2019, 09:11 AM)nablator Wrote:
(05-06-2019, 04:56 AM)ReneZ Wrote: As it stands, these word patterns and limited character combinations 'just happened'. In that sense, the theory is inadequate, even if it explains some other features.

Well, if there were enough constraints, the VMS could have "just happened" and we are left with the impossible task of explaining arbitrary choices. What didn't happen didn't happen, there does not have to be a reason for everything.

If someone manages to provide a convincing plain text, then we will be able to see what happened.
In that case, this meaningful plain text is a big part of the proof that this is correct.

Now in case the text is meaningless, then this plain text does not exist, and the rest of the evidence needs to be much stronger in order for this 'solution' to be convincing.

For me this includes:
- some explanation of what caused the word patterns (even if it is just a better pattern scheme than we have now).
- some explanation about the differences between the A and B languages