The Voynich Ninja

Full Version: The Linguistics of the Voynich Manuscript (Bowern et al. 2020)
(07-09-2020, 06:40 PM)RenegadeHealer Wrote: I keep hearing rumors, both here and on Nick Pelling's blog, that someone has attempted to execute Torsten Timm's self-citation method using low-tech methods, and failed to replicate his results.

I'm unaware of any such rumour appearing on my blog.
So, having read to the end, I should slightly revise my impressions. It's not clear until the summary that they favour a "cipher" solution rather than a simple "language" solution. I think this is fair, though it seems that the conclusion rests mostly on the entropy problem.
(07-09-2020, 10:04 PM)nickpelling Wrote:
(07-09-2020, 06:40 PM)RenegadeHealer Wrote: I keep hearing rumors, both here and on Nick Pelling's blog, that someone has attempted to execute Torsten Timm's self-citation method using low-tech methods, and failed to replicate his results.

I'm unaware of any such rumour appearing on my blog.

I'm obviously remembering it wrong then.

Torsten Timm's self-citation method is very testable, and I swear I've heard mention of people trying it. If I could afford to, I'd fund a trial of it on the cheap, because I'm very curious to see just how easy or difficult it is to execute using nothing more than a brain and a writing utensil, just how much the output resembles the VMs statistically, and how much both of these factors vary from person to person.

I think I'm going to do an academic literature dive sometime soon looking for studies of asemic writing and pseudolanguage, both spoken and written, from an information science or neuropsych perspective. I'm hoping to find that someone at some point has paid a small group of people to generate a large amount of written (or audio-recorded and transcribed oral) pseudolanguage, analyzed the output, and published it in a journal. If the raw data these kinds of experiments generated tended to deviate statistically from comparable-length specimens of real language in the same general ways that Voynichese does, I think that would bolster Timm and Schinner's ideas a great deal. If on the other hand this was not observed, that wouldn't rule out the VMs being stochastically generated pseudolanguage, but it wouldn't support it either.
Since this is a new article, I moved the thread to News
(07-09-2020, 08:18 PM)Emma May Smith Wrote: The dismissal of the text as fake or gibberish is good and solid. They used the comparisons of Zipf's law, proportional frequency, and MATTR to show that the Voynich text is broadly similar to natural language. While these measures are measuring related things, all three would be hard to fake.

Quote:As with the Zipfian word distribution, we find Voynich to be well within the expected values for natural language texts, and far from random gibberish. If the Voynich text is meaningless, its creators mimicked natural language in a sophisticated way.

I don't find it credible that a hoaxer before 1910 could have achieved this either by design or by luck.
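
For anyone unfamiliar with MATTR (moving-average type-token ratio): it averages the type-token ratio over every fixed-size window of the text, so the result does not depend on overall text length. A minimal Python sketch follows. The function name `mattr` and the default window size are illustrative assumptions, not taken from the Bowern et al. paper.

```python
def mattr(tokens, window=500):
    """Moving-average type-token ratio over all windows of `window` tokens."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(tokens) < window:
        # Text shorter than one window: fall back to the plain type-token ratio.
        return len(set(tokens)) / len(tokens) if tokens else 0.0

    # Count the distinct tokens in the first window.
    counts = {}
    for tok in tokens[:window]:
        counts[tok] = counts.get(tok, 0) + 1
    total_types = len(counts)
    n_windows = len(tokens) - window + 1

    # Slide the window one token at a time, updating counts incrementally.
    for i in range(window, len(tokens)):
        old = tokens[i - window]
        counts[old] -= 1
        if counts[old] == 0:
            del counts[old]
        new = tokens[i]
        counts[new] = counts.get(new, 0) + 1
        total_types += len(counts)

    return total_types / (window * n_windows)
```

Usage would be something like `mattr(transliteration.split(), window=500)` on a whitespace-tokenized transliteration; a highly repetitive text scores lower than one with a varied vocabulary.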

I generated some [link not shown] which [link not shown].   (So did Gordon Rugg and Torsten Timm, by different methods.)

I was aware of Zipf's Law, and that it was followed by the Voynich Manuscript, at the time I generated my fake Voynich Manuscript, but following Zipf's Law was an accidental output, not an input.   All my method required was a set of tables for choosing the current glyph given the previous glyph, and a method for generating random numbers.   My method wasn't quite correct but it was enough to follow Zipf's Law, and that wasn't intentional (though I would have rejected it and tried something else if it didn't).

The question we should really be asking is, "How likely is it that someone in the early 15th Century would use a method which resulted in text which followed Zipf's Law?"   The constraint they would have been working to is that it had to look like a natural language.   The bare minimum for that is that the choice of current glyphs depends on the previous one, and I think that would have been obvious even at the time the Voynich Manuscript was written.
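
The kind of table-driven generator described above (choose the current glyph given the previous glyph, plus a source of random numbers) can be sketched as a first-order Markov chain. This is only an illustration: the transition table below is invented, the glyph letters are EVA-like placeholders, `^` and `$` are assumed start-of-word and end-of-word markers, and none of it reproduces the actual tables used for the fake manuscript.

```python
import random

# Hypothetical table: previous glyph -> {candidate next glyph: weight}.
# "^" marks the start of a word and "$" the end of a word (assumed markers).
TABLE = {
    "^": {"q": 3, "o": 2, "c": 2, "d": 1},
    "q": {"o": 1},
    "o": {"k": 3, "t": 2, "l": 2, "r": 1, "$": 1},
    "k": {"e": 2, "a": 2, "c": 1},
    "t": {"e": 2, "a": 1, "c": 1},
    "c": {"h": 1},
    "h": {"e": 3, "o": 2, "y": 1},
    "e": {"e": 1, "d": 2, "y": 2, "o": 1},
    "d": {"y": 3, "a": 1, "$": 1},
    "a": {"i": 2, "l": 1, "r": 1},
    "i": {"i": 1, "n": 2},
    "n": {"$": 1},
    "l": {"$": 2, "d": 1},
    "r": {"$": 1},
    "y": {"$": 1},
}


def make_word(rng, max_len=12):
    """Build one word by repeatedly picking the next glyph from the table."""
    prev, glyphs = "^", []
    while len(glyphs) < max_len:  # cap the length so a long cycle cannot run on
        options = TABLE[prev]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == "$":
            break
        glyphs.append(nxt)
        prev = nxt
    return "".join(glyphs)


def make_text(n_words, seed=0):
    """Generate a list of words; the seed makes the run repeatable."""
    rng = random.Random(seed)
    return [make_word(rng) for _ in range(n_words)]


if __name__ == "__main__":
    print(" ".join(make_text(20, seed=1)))
```

With dice or a table of random numbers in place of `random.Random`, the same procedure could in principle be carried out by hand, which is the relevant point for a 15th-century scenario.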
Quote:In essence, because gibberish is by nature random, it should not display any of the higher level organizational properties that The Voynich Manuscript displays (as summarized here in §3.3 and §4). The Voynich Manuscript is highly unusual and non-language like at the character level. For measures that look above the word to line and paragraph, as well as in the distribution of words across the manuscript, it looks like a natural language. This strongly implies that the manuscript is encoded natural language rather than gibberish, since the measures used to track the paragraph structure are very unlikely to be directly manipulated and so are a good indicator of real structure.


This is a weak point, and I'd be careful with the word "random". A natural language's flow is also "by nature random". On the other hand, I can't see why gibberish that is by nature random cannot exhibit organizational properties at, say, the paragraph level. Imagine you restart your pseudorandom gibberish generator with each new paragraph.

It's not organizational properties lying on the surface that threaten the gibberish theory, but rather indirect indicators. For example, when I find the two most common Voynich star labels in the same paragraph on the title page, I believe that's a really curious coincidence for gibberish.

I haven't yet read the article farther than that.

Conformance to Zipf's law does NOT tell us anything about whether the sequence is natural language or not. I think this has been discussed more than once in the forum.
(07-09-2020, 11:47 PM)DonaldFisk Wrote: I generated some [link not shown] which [link not shown].   (So did Gordon Rugg and Torsten Timm, by different methods.)

I was aware of Zipf's Law, and that it was followed by the Voynich Manuscript, at the time I generated my fake Voynich Manuscript, but following Zipf's Law was an accidental output, not an input.   All my method required was a set of tables for choosing the current glyph given the previous glyph, and a method for generating random numbers.   My method wasn't quite correct but it was enough to follow Zipf's Law, and that wasn't intentional (though I would have rejected it and tried something else if it didn't).
Right, there is an article by Wentian Li, [link not shown] (1992), that makes a similar claim. But the literature is not unanimous; for example, this article by Ramon Ferrer-i-Cancho and Brita Elvevåg, [link not shown] (2010), argues the opposite.

I don't really have the time to referee this dispute, but as far as I can tell there are plenty of cases where non-linguistic processes can generate Zipfian distributions. The upshot is, to use the terminology of biostatistics, the observation of a Zipfian distribution is sensitive to, but not specific to, a linguistic generating process. In other words, while the false negatives may be low, the false positives could be high.

ETA: I should add that while I found these articles by a quick Google search, they are also cited in the Schinner & Timm paper.
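
To put the sensitivity/specificity point in numbers, here is a small worked example using Bayes' rule. Every figure in it is invented purely for illustration (the prior, the 0.99 sensitivity, and the 0.10 specificity are assumptions, not measurements from any paper), and the function name is hypothetical.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(linguistic | Zipfian observed), by Bayes' rule.

    prior        -- P(linguistic) before looking at the word frequencies
    sensitivity  -- P(Zipfian | linguistic)
    specificity  -- P(not Zipfian | not linguistic)
    """
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    if true_pos + false_pos == 0:
        raise ValueError("a Zipfian observation has probability zero here")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Made-up numbers: almost all language is Zipfian (high sensitivity),
    # but most non-linguistic generators are assumed Zipfian too (low specificity).
    p = posterior_given_positive(prior=0.5, sensitivity=0.99, specificity=0.10)
    print(round(p, 2))  # 0.52
```

Under these assumed numbers, seeing a Zipfian distribution moves a 50% prior only to about 52%, which is the sense in which a sensitive but unspecific test tells you little.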
The TT transliteration of the VMS fails the FIPS PUB 140-1 test battery in CrypTool. The VMS is certainly not purely randomly generated text.
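
For context, the simplest test in the FIPS 140-1 battery is the monobit test: take 20,000 bits, count the ones, and pass if the count X satisfies 9,654 < X < 10,346. The sketch below shows only that one test (the battery also has poker, runs, and long-run tests). How CrypTool turns a transliteration into bits is not stated here, so the 8-bit ASCII encoding is an assumption, as are the function names and the sample string.

```python
import random

FIPS_BITS = 20_000  # sample size used by the FIPS 140-1 statistical tests


def monobit_ones(data: bytes) -> int:
    """Count the one-bits in the first 20,000 bits of `data`."""
    if len(data) * 8 < FIPS_BITS:
        raise ValueError("need at least 20,000 bits (2,500 bytes)")
    return sum(bin(byte).count("1") for byte in data[: FIPS_BITS // 8])


def passes_monobit(data: bytes) -> bool:
    """FIPS 140-1 monobit criterion: 9,654 < X < 10,346."""
    return 9_654 < monobit_ones(data) < 10_346


if __name__ == "__main__":
    # Assumed encoding: plain 8-bit ASCII of an EVA-like sample string.
    text = ("daiin chedy qokeedy shol " * 200).encode("ascii")
    print("text passes monobit:", passes_monobit(text))

    # Seeded pseudorandom bytes for comparison.
    rng = random.Random(0)
    noise = bytes(rng.getrandbits(8) for _ in range(2_500))
    print("noise passes monobit:", passes_monobit(noise))
```

Note that under the ASCII assumption any ordinary text fails this test for a trivial reason (the top bit of every byte is zero), so failing the battery rules out uniformly random bits but says little beyond that.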

This paper reviews the evidence about Zipf's law. It's a banger, and it also has links to other papers about random text methods.
  "Zipf's word frequency law in natural language: A critical review and future directions" (S. Piantadosi, 2014)
(08-09-2020, 12:43 AM)Stephen Carlson Wrote:
(07-09-2020, 11:47 PM)DonaldFisk Wrote: I generated some [link not shown] which [link not shown].   (So did Gordon Rugg and Torsten Timm, by different methods.)

I was aware of Zipf's Law, and that it was followed by the Voynich Manuscript, at the time I generated my fake Voynich Manuscript, but following Zipf's Law was an accidental output, not an input.   All my method required was a set of tables for choosing the current glyph given the previous glyph, and a method for generating random numbers.   My method wasn't quite correct but it was enough to follow Zipf's Law, and that wasn't intentional (though I would have rejected it and tried something else if it didn't).
Right, there is an article by Wentian Li, [link not shown] (1992), that makes a similar claim. But the literature is not unanimous; for example, this article by Ramon Ferrer-i-Cancho and Brita Elvevåg, [link not shown] (2010), argues the opposite.

I don't really have the time to referee this dispute, but as far as I can tell there are plenty of cases where non-linguistic processes can generate Zipfian distributions. The upshot is, to use the terminology of biostatistics, the observation of a Zipfian distribution is sensitive to, but not specific to, a linguistic generating process. In other words, while the false negatives may be low, the false positives could be high.

ETA: I should add that while I found these articles by a quick Google search, they are also cited in the Schinner & Timm paper.

There's an important difference.  In my method, the probability of the current glyph depends on the previous glyph, so while the input is random, the output isn't completely random.  I think all alphabetic languages have that property and suspect it's enough to make the distribution Zipfian.  It would be nice to have a mathematical proof though, or failing that, an experimental verification.
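
A rough way to try the experimental verification: generate words from a first-order chain, rank them by frequency, and fit a straight line to log(frequency) against log(rank). The sketch below is a toy under stated assumptions: the transition weights are drawn at random rather than taken from any real table, the alphabet and function names are made up, and a least-squares slope on a log-log plot is a crude diagnostic, not a proof.

```python
import math
import random
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def markov_words(n_words, alphabet="abcdefgh", seed=0):
    """Words from a first-order chain with randomly drawn transition weights."""
    rng = random.Random(seed)
    symbols = list(alphabet) + [" "]  # a space ends the current word
    # Hypothetical table: one random weight vector per previous symbol.
    table = {prev: [rng.random() for _ in symbols] for prev in symbols}
    words, current, prev = [], [], " "
    while len(words) < n_words:
        nxt = rng.choices(symbols, weights=table[prev])[0]
        if nxt == " ":
            if current:  # ignore repeated spaces
                words.append("".join(current))
                current = []
        else:
            current.append(nxt)
        prev = nxt
    return words


if __name__ == "__main__":
    words = markov_words(20_000, seed=1)
    print("distinct words:", len(set(words)))
    print("log-log slope:", zipf_slope(words))
```

A slope near -1 is what is usually called Zipfian. Running this with different seeds and alphabets shows how close such chains get, but a single fitted slope cannot distinguish a true power law from other heavy-tailed shapes.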
(07-09-2020, 11:22 PM)RenegadeHealer Wrote:
(07-09-2020, 10:04 PM)nickpelling Wrote:
(07-09-2020, 06:40 PM)RenegadeHealer Wrote: I keep hearing rumors, both here and on Nick Pelling's blog, that someone has attempted to execute Torsten Timm's self-citation method using low-tech methods, and failed to replicate his results.

I'm unaware of any such rumour appearing on my blog.

I'm obviously remembering it wrong then.

I remember it as well, but I think it was Lisa Fagin Davis and she mentioned it in association with her work with Claire Bowern.  Maybe in her interview by Koen for the board?

I, too, wish there were a few more details in the paper about that work.  I'm looking forward to reading about what you can dig up in your research on the topic.