The Voynich Ninja

Full Version: A case for Gibberish
@Mark
I have to agree with JKP; sometimes it is even unbearable to read how often a word is repeated.
From recipes: "If it is hot, give it cold to drink. If it is cold, give it hot to drink. If it is dry, give it lots of cold to drink."
It goes on like this for 10 pages.
When it comes to bodily fluids (blood, sweat, urine, hot or cold), it gets even worse when he writes about summer and winter.

Depending on the topic, it is also normal to repeat the same words almost immediately.
Even with whole sentences this is not uncommon. Example:

[link]

At first I thought it was just this writer, but actually all books are like that. It is not special.
(24-09-2020, 04:02 AM)-JKP- Wrote:
I have seen many situations where looking deeper or wider can provide less-obvious answers.

Same here, but at times there may be no answer to be had. To quote the great epistemologist Kenny Rogers, "You've got to know when to hold 'em / Know when to fold 'em." I don't have a sense yet that we have a good way to know whether it's gibberish or whether it just looks like gibberish.
(24-09-2020, 10:57 AM)ReneZ Wrote: For each bullet one can find plenty of cases where a meaningful text fully fits the description.
True, but is there a meaningful text that fully fits all of the bullets? It won't do to say that the number of hapax legomena fits a language like Georgian while the word structure fits a language like Chinese. Georgian and Chinese are completely different languages and use incompatible strategies.
Stephen, for me, many of the bullets don't even have anything to do with meaningful vs. meaningless, e.g. the bullet about punctuation.
Furthermore, there is a basic assumption that character groups are complete words, and this may not be true.

Equally importantly, a list can be made of points that argue in favour of meaning and language-like structure.
It will be equally difficult to find a meaningless text generation method that includes all of them.
It is precisely on such bullets that the methods of Rugg and Timm fail.

Like I wrote, for me it is a complete stand-off.
It is true that many of the bullets are more suggestive of lack of experience with medieval manuscripts than of gibberish.

Now on the other hand, if you want to prove that the VM text is meaningless, the best you can do is list the indications that strengthen the case. 

Since all of these are either potential indicators or no indicators at all, the case for gibberish is not particularly strong.
What about a manuscript where the same label can be found on different pages with quite different images?

What about a manuscript where the repeated labels have very similar spellings from one to the other?

And what about a manuscript that combines all the label properties we have discussed?

Whilst an individual item might be explained, the cumulative body of evidence seems inexplicable any other way.

(25-09-2020, 12:44 AM)-JKP- Wrote: I've posted examples on my blog of cryptic-looking but very logical two- and three-character indexing systems.

I will look at that.
We don't even know how the VM encodes text. Maybe dissimilar words may end up looking similar or even the same. While this is not ideal and might cause ambiguity, it does not indicate complete lack of meaning.
(25-09-2020, 08:21 AM)Koen G Wrote: We don't even know how the VM encodes text. Maybe dissimilar words may end up looking similar or even the same. While this is not ideal and might cause ambiguity, it does not indicate complete lack of meaning.

The point for me is that whilst, yes, it is perfectly reasonable for two words to be spelled similarly, when one tries to explain all these features together, Occam's razor says that the idea that those labels are meaningless is the best available explanation. However, there are quite a few labels that don't behave like that, and so for me it makes sense that those labels are meaningful.
Schinner's random-walk argument (see his 2007 paper) is also prominent in the more recent paper by Timm & Schinner (pp. 11-14, Fig. 8).

Timm & Schinner Wrote:An enigmatic property of the VMS reported in [12] is the presence of long-range correlations visible on the bit level of the text. They let the glyph sequences appear as a stochastic process with underlying Pólya-like distribution, rather than natural language.

I decided to try and look into it, but the results were mixed. I must say I find this measure quite complex and difficult to interpret, and I am still unsure of what it means.

The process basically is:
  • the text is mapped into a long binary sequence: spaces are ignored and each remaining character is mapped to a different binary pattern (e.g. 00000, 00001, 00010, etc.)
  • the binary sequence is converted into a path in a two-dimensional space: you start from (0,0), increasing X by 1 at each step; each 0 moves Y down (-1) and each 1 moves Y up (+1)
  • for each integer L, the differences in the Y values between points on the path that are a distance L apart on the X axis are computed; for each L you get a long list of numbers and compute its root mean square fluctuation F(L)
  • L and F(L) are plotted on a log-log chart
  • a value "alpha", corresponding to the slope of F(L) on that chart, is computed (a minimal sketch of the whole procedure follows this list)
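Here is a minimal Python sketch of that procedure as I understand it. The function names, the 5-bit mapping, the example file name and the choice of the standard deviation as the RMS fluctuation are my assumptions, not Schinner's original code.

Code:
import numpy as np

def text_to_bits(text, bits_per_char=5):
    # Ignore spaces; map each distinct character to a distinct 5-bit pattern.
    chars = sorted({c for c in text if not c.isspace()})
    assert len(chars) <= 2 ** bits_per_char, "alphabet too large for the bit width"
    code = {c: format(i, "0{}b".format(bits_per_char)) for i, c in enumerate(chars)}
    return "".join(code[c] for c in text if not c.isspace())

def walk(bits):
    # Build the path: X advances by 1 per bit; each '1' moves Y up, each '0' down.
    return np.cumsum([1 if b == "1" else -1 for b in bits])

def fluctuation(y, L):
    # RMS fluctuation F(L): spread of the Y differences between points that are
    # L apart on the X axis (taken here as the standard deviation).
    d = (y[L:] - y[:-L]).astype(float)
    return float(np.std(d))

def alpha(y, L_values):
    # Estimate alpha as the least-squares slope of log F(L) vs log L.
    Ls = [L for L in L_values if 0 < L < len(y)]
    logL = np.log(Ls)
    logF = np.log([fluctuation(y, L) for L in Ls])
    slope, intercept = np.polyfit(logL, logF, 1)
    return slope

# Example run (hypothetical file name):
# y = walk(text_to_bits(open("voynich_eva.txt").read()))
# Ls = np.unique(np.logspace(0.3, 3.5, 40).astype(int))
# print(alpha(y, Ls))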

Schinner 2007 quotes Kokol et al. (1999).
Figure 2 gives a good summary of the process.


Schinner finds an alpha value of 0.846 for a Voynich EVA transliteration. Apparently, from this he infers that Voynichese cannot be a written natural language, but must be the result of a "stochastic process". He also observes that the slope appears to change for L~360, linking this to the length of a line of text in the VMS.

Schinner Wrote:Previous investigations by Kokol et al. [8] of various human writings have demonstrated that for natural language texts (almost independent of the language used) the asymptotic exponent alpha of F(l) does not notably differ from 0.5... Most interestingly, the VMS text shows completely different behavior: a crossover point exists where the "random process" alpha~0.5 turns into an asymptotic exponent alpha~0.85, indicating the presence of "memory effects" in the underlying stochastic process. ... the crossover point L~360 (=72 characters x 5 bits) of the whole text fits well to the average line length
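To check for the crossover he describes, one could fit the slope separately below and above a candidate crossover lag. A small sketch reusing the functions from the code above (360 is Schinner's value, not something I derived):

Code:
def two_regime_alpha(y, L_values, crossover=360):
    # Fit alpha separately below and above the candidate crossover lag;
    # a marked difference between the two slopes would match Schinner's pattern.
    low = [L for L in L_values if L < crossover]
    high = [L for L in L_values if L >= crossover]
    return alpha(y, low), alpha(y, high)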

But the statement attributed to Kokol that "for natural language texts ... the asymptotic exponent alpha of F(l) does not notably differ from 0.5" is very different from what Kokol wrote.

See Kokol's Table 3:
[attachment=4798]

Kokol only considered 3 languages and 20 texts for each language. One of these 60 samples resulted in alpha=0.72, which differs more from 0.5 than it does from 0.85 (the alpha value for VMS EVA).
Kokol et al Wrote:We see that the mean α for natural language texts is very near 0.5, but single texts differ from this critical value significantly.

Kokol et al Wrote:The difference in α between different writings can be attributed to various factors like personal preferences, used standards, language, type of the text or the problem being solved, type of the organisation in which the writer (or programmer) works, different syntactic, semantic, pragmatic rules etc.

Schinner 2007 also quotes Schenkel, A., J. Zhang, and Y. Zhang (1993), which appears to be the first paper where the application of this method to language analysis was discussed.

Schenkel et al. also examined several different texts and pointed out a bible (which I assume to be in English) that results in alpha=0.87, i.e. more than the value for VMS EVA. Their plot (Fig.1 i) also shows a change in alpha at 100<L<1000, the same behaviour that Schinner observed for the VMS.
It is so unfortunate that the four texts examined by Schinner were three bibles in Latin, German and Chinese, while for English he chose Alice in Wonderland!
[attachment=4797]

I wrote some Python code to try and replicate Schinner's experiments. I am far from sure that it is correct. Anyway, these are the plots I get for the following four texts (a sketch of how the inputs are truncated is at the end of this post):
  1. VMS EVA Takahashi transliteration
  2. Vulgate Latin Bible (first 25,000 words, to match Schinner's experiment)
  3. Alice in Wonderland (whole text, to match Schinner's experiment)
  4. King James English bible (first 191,000 non-space characters, to match the VMS)
[attachment=4796]
[attachment=4799]
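For reference, this is roughly how the truncations listed above might be done (the helper names and file handling are mine, hypothetical, and not the exact code behind the plots):

Code:
def first_words(text, n):
    # Keep only the first n whitespace-separated words (for the Vulgate run).
    return " ".join(text.split()[:n])

def first_nonspace_chars(text, n):
    # Keep text up to and including the n-th non-space character (for the KJV run).
    out, count = [], 0
    for c in text:
        out.append(c)
        if not c.isspace():
            count += 1
            if count == n:
                break
    return "".join(out)

# For each prepared text: y = walk(text_to_bits(prepared)), then plot
# fluctuation(y, L) against L on a log-log chart and read off the slope.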
I can only say what I do, and it is only one possibility.
When I search for words and take similarities into account, it looks something like this.
If I try out different endings, it runs almost into an infinity of similar words, all in the same script.