I finally got around to reading the entirety of Torsten Timm's paper, the follow-up work he has published as a companion/continuation of his hypothesis, and this thread. I have also read the blog of Brian Cham and the posts on this forum by Stephen Carlson, who, last I've heard, are supportive of it. I've also taken special note of Koen's statistical analyses, which include control groups of known-language texts, a transcript of the VMS, and an output of TT's algorithm, all controlled for text length. It's these graphs which are perhaps the most striking of all, as most appear to show the VMS and TT's gibberish clustering together in their statistical behavior, away from a different cluster that includes the real language samples. How best to interpret Koen's data is a matter of some debate, but to my untrained eye, it almost seems like TT's bot behaves "even more VMS-y than the VMS": its statistical anomalies include most of those seen in the VMS, but are even more extreme in the same direction. If this is indeed a valid trend, correctly interpreted, I would say it lends support to TT's theory. Does it prove it beyond a reasonable doubt? Definitely not. But it is evidence in his favor.
I think the best challenge to Koen's data would be to find specimens of symbolic language known to have meaning, contemporaneous with the VMS, and similar in text length, which cluster near the VMS (and TT's bot) when subjected to the same statistical analyses. Methinks this could be a difficult hunt.
On the other hand, I'm sure this isn't the last algorithm for reverse-engineering VMSish nonsense we'll see; I'm currently working on one now, involving dice. I'll happily hang it up here for target practice when I finish building it, and would be happy to see its output subjected to the same statistical analyses that Koen and Julian Bunn have used.
There are a number of relatively recent works composed of English-ish-sounding gibberish, good enough to fool someone who doesn't speak English into thinking they're hearing real English. A well-known Italian song and a short film are the best-known examples. Both were entirely scripted, and the scripts could easily be compared statistically to similar-length text in real English. The results would be hard to interpret as significant when the specimens are so short. Still, if the creators of either work, or any similar one, were to use the same method to create a much longer text, it might be worth seeing how its statistical properties differ from those of a work in real English.
Is there any precedent, from any time period prior to the advent of computers, of anyone producing voluminous amounts of highly structured asemic writing? The only one I can think of is Hamptonese; statistical analyses have been performed on it, and it's not entirely clear that it's meaningless or asemic. If any good examples of human-generated, large-volume, data-mineable meaningless text are available for comparison, it might be worth asking how and why such works were created. If a method comparable in technique to TT's autocopy algorithm was involved, that would certainly lend support to his hypothesis.
There is very much a limit to the explanatory power of TT's autocopy hypothesis, and I think Mr. Timm admits this quite readily. If no example of a similar algorithm producing a similar product can be found, that doesn't disprove it. But it raises the number and size of the assumptions required to accept it, which is not good news either. If we accept that the VMS is utterly unique in its execution, that's a lot to explain, especially in a time and place where unique creations and new and unique ways of doing things were not the cultural norm.
I hear Mr. Timm also repeatedly admitting another major limitation of his study: his text was composed by a bot, the VMS by a human. When running a simple algorithm over and over, a human mind gets bored and, consciously and unconsciously, introduces levels of complexity and subtle patterns into the "random" output. A bot doesn't do this, and can't [yet] model this property of the human mind with any accuracy. Is this enough to explain the subtle clustering of vords and vord-pieces by page and apparent topic? Maybe. Or maybe not.
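For readers who haven't dug into the paper, the core of the autocopy idea can be sketched in a few lines. The toy generator below copies a word already on the page and occasionally alters one glyph; this is a deliberate oversimplification of Timm's actual rules (which constrain which words may serve as sources and how glyphs may change), and the seed words and alphabet are my own placeholders:

```python
import random

def autocopy(seed_words, n_words, mutate_prob=0.4, rng=None):
    """Toy 'self-citation' generator: each new word is a copy of a
    word already written, sometimes with one character changed.
    A simplified sketch, not Timm's actual rule set."""
    rng = rng or random.Random(0)
    words = list(seed_words)
    alphabet = "acdehiklmnoqrsty"  # crude stand-in for EVA glyphs
    while len(words) < n_words:
        w = rng.choice(words)            # copy an earlier word...
        if rng.random() < mutate_prob:   # ...sometimes mutating one glyph
            i = rng.randrange(len(w))
            w = w[:i] + rng.choice(alphabet) + w[i + 1:]
        words.append(w)
    return words

page = autocopy(["daiin", "chedy", "qokeedy"], 50)
print(len(page), "tokens,", len(set(page)), "distinct word forms")
```

Even this crude version reproduces the hallmark the statistics pick up on: a page full of tokens drawn from a small pool of near-duplicate word forms.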
If there were grant money available for anything VMS-related (ha!), it would be interesting to pay some teams of starving university students who'd never heard of the VMS to take a facsimile copy of the VMS with the text removed, and actually run TT's algorithm by hand (with a real quill and iron gall ink) until the facsimile is filled with text. I would keep metrics on how much time a single scribe could typically put into this effort in one sitting before becoming too tired or bored to continue. I'd want to analyze the output of each writing session, to see whether certain statistical properties typified the text written near the start versus text composed near quitting time. Then look at the real VMS: are there signs of a similar statistical shift every X pages or paragraphs, of the scribe struggling with the algorithm and wanting to just get the stupid thing over with for the day? That would definitely lend support to TT's idea.
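A first pass at that "quitting-time drift" analysis could be as simple as a windowed statistic over each session's output. The function below (my own toy, with a made-up metric and made-up samples) tracks type/token ratio in consecutive windows; a tiring scribe who copies more and varies less would show it falling:

```python
def rolling_ttr(tokens, window=20):
    """Type/token ratio in consecutive non-overlapping windows; a
    crude probe for within-session drift. Illustrative only."""
    return [len(set(tokens[i:i + window])) / window
            for i in range(0, len(tokens) - window + 1, window)]

# An invented session: 20 fresh words, then 20 rote repetitions.
fresh = [f"w{i}" for i in range(20)]
tired = ["daiin"] * 20
print(rolling_ttr(fresh + tired))  # [1.0, 0.05]
```

With the real VMS one would obviously need to control for topic and section changes before blaming any drift on fatigue.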
I don't think TT's conclusions to date are the last word on the VMS, and I'm still going to chase the possibility that there is meaning locked in there. But I give him credit: he has accomplished the objective he set for himself, which was simply to show that generating a large amount of nonsense text that looks and feels like real language would have been possible at the time the VMS was created. Mr. Timm's own belief (that the VMS's text is indeed meaningless) needs to be separated from the very parsimonious conclusion of his scholarly work: he has not shown that such a thing actually happened, or even that it was likely to have happened, and he has never claimed that he did.