The Voynich Ninja

Full Version: devil's advocate: the case for glossolalia
Hello everyone,
I hate to start posting here on such a skeptical note, but I am hopeful that if we can refute the skepticism, perhaps it will give us new leads to investigate the actual language and meaning of the text, if any.
I want to state first and upfront that *I hope my following arguments are wrong*. I want this text to be meaningful, I want us to decipher it, I want us to be able to read it and understand it.
A bit about my experience working on this ms text: It would remind one of the Churchill quote about success being the ability to go from failure to failure without losing one's sense of enthusiasm, or the Niels Bohr quote about an expert being somebody who has already made every mistake it is possible to make in a given field. I feel that I am somewhere in that process, just without the success or expertise at the end.
Over the past several years, I have attempted to decipher the Voynich ms text as the following languages: Coptic, Old Irish, Old Norse, Elfdalian, Gutamal, Orsamal, Old Gutnish, Finnish, Old Prussian, Middle High German, Old Albanian, Syriac/Aramaic, and Hebrew. My favorite response from a scholar was the laconic, "The identification of the Voynich manuscript as an Orsamal document would be a truly revolutionary discovery." Classic. (For the record, I was trying to link the Voynich character inventory to a runic alphabet.)
Here is my conclusion, and my challenge to any other would-be Michael Ventris who wants to succeed in deciphering this ms text:
It is not enough to propose deciphering of individual isolated words and names and labels. It is not difficult to do that with a handful of isolated words and letter values for the characters in them. The problem is, if you then continue with the rest of the characters, and the letters in the alphabet of the target language, you will quickly run out of characters, and many consonants in your target language will go entirely unrepresented in the Voynich ms character text.
As a test, try expressing *one whole paragraph* in the text - any paragraph you like! - with your deciphering system. You will likely find that it comes out far too repetitive, with far too few letters repeated far too frequently, and far too many other letters missing entirely. It will not look like actual writing in an actual language at all.
Thus, I cannot take individual letter and word readings seriously, until I have seen what an entire paragraph looks like with the method. Based on my own experience, I can tell you that it probably won't look good.
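The paragraph test described above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the key is invented, and a list of 20 Latin consonants stands in for the consonant inventory of whatever target language is being tested. The sample text is a handful of label words in the transliteration used in this thread.

```python
from collections import Counter

# Placeholder target inventory: 20 Latin consonants stand in for the
# language under test. This is an assumption, not a real proposal.
TARGET_CONSONANTS = "bcdfghjklmnpqrstvwxz"

# Hypothetical one-to-one key (glyph -> letter), invented for illustration.
KEY = {"o": "t", "t": "r", "a": "n", "l": "s",
       "y": "d", "r": "l", "d": "m", "k": "k"}

def decipher(text, key):
    """Apply the key glyph by glyph; glyphs missing from the key become '?'."""
    return "".join(c if c.isspace() else key.get(c, "?") for c in text)

def coverage_report(plain, inventory):
    """Return (letters never produced, letter counts from most to least common)."""
    counts = Counter(c for c in plain if c in inventory)
    missing = sorted(set(inventory) - set(counts))
    return missing, counts.most_common()

labels = "oral oraly oldar otoky otaly"  # sample label words
plain = decipher(labels, KEY)
missing, ranked = coverage_report(plain, TARGET_CONSONANTS)
print(plain)
print(len(missing), "of", len(TARGET_CONSONANTS), "target letters never appear")
print(ranked[:3])
```

Run on a whole paragraph rather than five labels, the same report shows at a glance how many letters of the target language a proposed key never produces and how heavily the output leans on a few letters.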
The only way around this is to introduce ambiguity into one's system, by making one Voynich character represent multiple consonants of the target language, or by presuming substantial misspellings in the target language whereby the author used one letter to represent other similar letters, which amounts to the same thing. Then one has a different problem: if every character can be read as three different letters, then every 4-letter word can be read as 81 different words! It is hard to read such a text, and hard to have any confidence about any one particular reading of any word.
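The count of 81 is just 3 x 3 x 3 x 3. A minimal sketch makes the blow-up concrete. The ambiguous key below is invented purely for illustration and is not anyone's actual proposal.

```python
from itertools import product

# Hypothetical ambiguous key: each glyph may stand for any of three letters.
AMBIGUOUS_KEY = {
    "o": "abg",
    "t": "dtz",
    "a": "hkl",
    "l": "mnr",
}

def count_readings(word, key):
    """Number of candidate readings: the product of the options per glyph."""
    total = 1
    for glyph in word:
        total *= len(key[glyph])
    return total

def all_readings(word, key):
    """Enumerate every candidate reading of a word."""
    return ["".join(letters) for letters in product(*(key[g] for g in word))]

readings = all_readings("otal", AMBIGUOUS_KEY)
print(count_readings("otal", AMBIGUOUS_KEY))  # 3 * 3 * 3 * 3 = 81
print(readings[:3])
```

With four options per glyph instead of three, the same 4-glyph word already has 256 readings, so confidence in any single reading drops quickly as ambiguity is added.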
This is why, if the text does represent any language at all, I think it must be a vowelless abjad. There just aren't enough characters, nor enough variety of character combinations, for the Voynich inventory to incorporate both all the vowels and all the consonants of any language. Surely it was not an accident that John Stojko's purported deciphering into Ukrainian had to make it a *vowelless* Ukrainian. And I can tell you, from experience again, that even with a vowelless abjad theory, it is *still* difficult to represent all consonants without introducing ambiguity into one's deciphering system!
I recently read the thread about "what the heck is an 'otaly'?" and the comments of several experienced researchers about the Voynich ms text as a "defective script". For the classic example of such a thing, look up "Book Pahlavi". Have fun trying to read anything in that script! Here's the problem: without knowing the historical context of the writings and texts and manuscripts and all of the historical cultural heritage surrounding them, *the Book Pahlavi texts would in fact be indistinguishable from random meaningless character strings*. In late Book Pahlavi, the "words" have effectively become the equivalent of logograms: you can only read them if you already know what the whole words are. Sounding out meaning from the letters is next to impossible. *If Book Pahlavi were unknown, and we only found one 200-page text of it from the medieval period, we would never be able to decipher it or read it.* No way. Even if we had full knowledge of the Middle Persian languages and all the Aramaic historical dialects that influenced it.
By the way, I think the question about "otaly" is also a very good point. It makes me question the significance of all of the labels throughout the ms, which is unfortunate. I followed up on the lead and took a look at all the pages where the "otaly" label appears. I look at the top row of labels on f88r, and see a group of small words that are far too short, and far too similar to each other, to represent distinct names or identifications of distinct plants or roots: "oral", "oraly", "oldar", "otoky", "otaly". This is not a set of plant names or parts; this is an elementary grammar book exercise of words that begin with "o", with very small and slight variation in the order of letters that follow it. As meaningful text labeling plants or parts or roots, it is hardly a plausible set of words at all; but *as glossolalia, it is a perfectly logical sequence*.
Likewise with the top row of labels on f99v: "otoldy", "otor(chy)", "oldy", "dar(ary)", "otaly", "olsy", "arol", "otoky". Very slightly more variation than f88r, but not much. The optimist in me wants to find similarities in the plants next to the two "otaly"s, and the two "otoky"s, and so on. The skeptic in me looks at the whole two rows of words and thinks, "These are just strings of similar syllables with slight variations."
=====
So there you have it: my arguments against the possibility of any plausible convincing interpretation of this text as actual language. Once again, I hope I am wrong. I have spent substantial effort trying to decipher this thing. I would love to be proven wrong, preferably by myself :) Seriously, I would be very impressed and pleased if anyone produces a convincing deciphering that attains the support of reputable professional scholars of the given language. But I am skeptical, based on my own experience.
Consider on the other hand the following descriptions of the Voynich ms character text:
"[It] consists of using a certain number of consonants and vowels, in a limited number of syllables that in turn are organized into larger units that are taken apart and rearranged pseudogrammatically, with variations";
"[It] consists of strings of syllables, put together more or less haphazardly but emerging nevertheless as word-like and sentence-like units because of realistic, language-like [structure]".
This sounds like a more or less accurate description of the Voynich ms character text, does it not? I think anyone who has spent a substantial amount of effort researching this text will understand what I mean.
Alas, however, the above quotes are actually not descriptions of the Voynich ms: They are the linguist William Samarin's descriptions of Pentecostal spoken glossolalia in his landmark 1972 book on the subject. (Simply look up the Wikipedia page for "Glossolalia" to find all these quotes.)
Samarin concluded that this glossolalia is "only a facade of language", that it is "meaningless but phonologically structured human utterance, believed by the speaker to be a real language but bearing no systematic resemblance to any natural language, living or dead." He argues that the syllables are not organized into words, and that "it is neither internally organized nor systematically related to the world man perceives."
I repeat (qokeedy qokeedy qokeedy): I hope I am wrong. This is a depressing and disappointing argument I am making and conclusion I am suggesting. But we have to be honest with ourselves and compare the evidence we have for any meaningful language hypothesis of the Voynich ms, vs. the strength of the above glossolalic description of the Voynich ms.
=====
Now just in case I am wrong, as I hope I am, here are my few thoughts about the language of the text :)
Like I said above, if it *is* any language, I strongly believe a vowelless abjad script makes more sense than an alphabet with vowels. I repeat: surely it was not an accident that Stojko's Ukrainian "deciphering" was a *vowelless* Ukrainian. Again, if you disagree, please produce a complete correspondence key of all Voynich characters and all letters of any alphabet with vowels, and we'll see how any paragraph of the text comes out.
Recently I had liked my Aramaic hypothesis a lot. But I came to find that my transcription had too many ambiguities, leading to the problems that I describe above.
Hebrew is much more plausible, to be honest. There were substantial Jewish communities living throughout many parts of Europe in the early 15th century, including in northern Italy and nearby areas of southern central Europe. They did not speak Hebrew as an everyday colloquial language, but they read and wrote Hebrew quite regularly and well, and not just as a liturgical language either. There is a substantial variety of literature written in medieval Hebrew in Europe. D.N. O'Donovan's "Voynich Revisionist" website had an interesting recent article about possible connections to Kabbalah in the Voynich ms and Panofsky's old comments about the topic. I think all of this makes a certain amount of sense. A few years ago The Guardian published an article about Stephen Skinner's view that a Jewish physician in 15th c. Italy wrote the ms, based purely on all of the illustrations.
But I have found that my hypothesis still runs into plenty of problems as soon as I start to try to decipher the text of actual sentences and paragraphs. (Again, the Pleiades and Zodiac labels and other labels are nice to generate hypotheses for letter values of characters. But the proof of the pudding is in the paragraphs.) Here's one idea: rather than each character being a letter, perhaps each pair of characters is one letter. Now you would think, with 15-25 Voynich characters, that you would get an inventory of many hundreds of character pairs or bigrams. But not so! The character text of this ms is so repetitive, with so little and narrow variation, that if you divide the words into the most natural and common bigrams, of course allowing for the odd final "-y" or initial "d-" or "q-" or medial "e" or "i" to occur by itself and not as part of a bigram, then amazingly you only find about *20* or so, yes only TWENTY or so, *bigrams* that occur with any substantial frequency! In fact, as I have tried to pair Voynich *bigrams* with Hebrew *letters*, amazingly I find that I do not run out of Hebrew letters to correspond to the bigrams, I *run out of Voynich bigrams* to correspond to the Hebrew *letters*! Yes you read that right, this ms does not even contain enough frequent *bigrams* to represent a complete abjad without any vowels.
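A rough way to reproduce this kind of bigram count is sketched below. The segmentation rule is a deliberate simplification and an assumption of this sketch, not the exact procedure described above: when a word has an odd number of characters, a final "y" or an initial "d" or "q" is peeled off as a stand-alone character, and otherwise the last character is left on its own. Medial "e" and "i" are not handled. The sample words are a few label words in the transliteration used in this thread.

```python
from collections import Counter

def split_bigrams(word):
    """Split a transliterated word into two-character units.

    Simplifying assumption: for odd-length words, peel off a final 'y',
    else an initial 'd' or 'q', else leave the last character single.
    """
    prefix, suffix = [], []
    if len(word) % 2 == 1:
        if word.endswith("y"):
            word, suffix = word[:-1], ["y"]
        elif word[0] in "dq":
            prefix, word = [word[0]], word[1:]
        else:
            word, suffix = word[:-1], [word[-1]]
    pairs = [word[i:i + 2] for i in range(0, len(word), 2)]
    return prefix + pairs + suffix

def bigram_inventory(words, min_count=2):
    """Count two-character units and keep those seen at least min_count times."""
    counts = Counter(u for w in words for u in split_bigrams(w) if len(u) == 2)
    return {b: n for b, n in counts.items() if n >= min_count}

labels = ["oral", "oraly", "oldar", "otoky", "otaly"]
inventory = bigram_inventory(labels)
print(split_bigrams("otaly"))  # ['ot', 'al', 'y']
print(inventory)
```

Applied to a full transliteration with a sensible frequency threshold, the size of `inventory` is the number to compare against the 22 letters of the Hebrew abjad.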
Nevertheless, such a bigram inventory seems to come far *closer* to being capable of representing a language's full abjad or alphabet, than any single character inventory theory that I've ever tried or ever seen. With bigrams, I have issues with a couple or few letters of an abjad. Whereas with single characters and an alphabet, one has issues with substantial portions of any language's consonant inventory.
Of course bigrams have their problems too. The "words" are only half as long as they appear to be. So in this case it really must be a vowelless abjad, because with vowels written no language's words could be this short, even if we take the Voynich words as syllables and take the liberty of joining two of them to make an actual word. For example, return to "otaly" again, which really is an excellent test case in many ways: with my bigram theory we have only two letters here, plus a probably low-information, generic, ubiquitous single-character ending "-y". And this 2-3 letter word has to represent both plant/root labels and the nymphs in four of the Zodiac sign diagrams. Now it could just be a day number or name that recurs in multiple months (like "15th" or "Ides"), which happens to be a homonym with a plant/root name that appears on two different pharmacological pages. Indeed there are a *lot* of homonyms in an abjad when you don't mark the vowel diacritics. Still, I admit there's not a lot of information in such a word if it is composed of bigrams.
=====
In sum, I think the Hebrew vowelless abjad bigram hypothesis I just presented above is as good as any other hypothesis that's out there, if not better. But I cannot honestly say that it is more convincing than the glossolalia hypothesis!
The poker pro Mike Caro told a funny story about a student of his, a middle-aged man who was a really bad poker player. Caro recounts that the man used to lose $25,000 a year playing poker. With the help of Caro's poker lessons, the man improved so much that he only lost $5,000 a year playing poker. But Caro had to admit, the man's wife had an even better financial strategy for him: quit playing poker! Caro could help him a lot, but not enough to be better than quitting.
I get the feeling that my hypothesis, and all of our best hypotheses, are like Caro's poker lessons, and the Voynich ms is the unfortunate man: our best theories can seem to have the potential to reduce the level of opacity of the text significantly. But the glossolalia / meaningless nonsense hypothesis is like the wife's advice: it may not be fun or interesting, but it's probably better than anything we've been able to come up with so far.
I have no desire to quit trying to decipher the Voynich manuscript. But I try to keep in the front of my mind the realistic probability that it may well just all be elegantly written meaningless nonsense.
I sure do hope that I myself and others here can refute my arguments and prove myself wrong!
-Geoffrey Caveney
Hi Geoffrey,

I feel your pain, even though I have not looked into the text myself yet. I have watched a bit from the sidelines, and I am trying to bone up on some of the particulars to see if it could have anything to do with a hypothesis I am working with: that it doesn't encode text in the usual semantic sense, but rather names of places, distances and bearings. If it did, it could be an encoded rutter from which a portolan map could possibly be charted. A lot of names on maps are abbreviated, probably because they were abbreviated in the original nautical notes from which the info originated, and thus abbreviation signs may be involved.

So for instance, those short repeated vords could represent wind direction abbreviations, which would be constantly repeated. There would be possibly 32 combinations thereof. Distances would likely be numbers followed by an abbreviated measurement, for words like leagues, for example, again repeated for almost every entry. 

Might also be things like the height of the pole star for that location, or other such time and direction keeping devices, such as zodiac names or abbreviations thereof set in a certain sequence to obtain sun and moon charts, or a Solomon table for new moon listings made up of various letters and numbers.

But I can foresee problems with this idea. Like you say, similar labels for different things is an issue to resolve. But the o words could have meaning in terms of spatial relationships.

It might be possible that the vords are based on a volvelle or astrolabe for calculating such timings and distances, but I don't know how it would pertain to place name abbreviations or indications thereof.

From what I see in the images, I sometimes get the idea that the text is not necessary, other than perhaps to fill in details. But that would not explain the text-only pages, so I do feel it means something. I hope it does, at least. Because the imagery could be seen to have meaning to those who have prior knowledge, I would think that the text could be the same, i.e. they could read off it directly, rather than having to look up each glyph and transfer it bit by bit to another document in order to read it. It just seems that way to me: if it is a personal handbook of some sort, the owner would be able to reference the data without too much trouble. Otherwise it seems like it would be a lot of work to go through, only for it to be harder to decipher than it was to put together in the first place. If it records shoreline data, then one can see why they might want to encode it: to hide it, and because it is not necessary to see such detail until you are travelling to those specific places. They could maybe make a map on the fly when needed, if they could read their own shorthand. Or they might not have cared about the names at all and just listed each port location by distance and direction from the last one, with the odd famous or well-known name thrown in for clarity every once in a while, or a blurb here and there like "good water here" or "don't hit that rock".

It could also be that this info would be copied from various originals and therefore the layout may change from source to source, and also the language, abbreviations, etc. could change from section to section.

Given this idea can you see any possibilities within your experience of trying to decipher the text? Did any patterns emerge that could fit this hypothesis?
Frustration comes from attempting to understand the text as "Coptic, Old Irish, Old Norse, Elfdalian, Gutamal, Orsamal, Old Gutnish, Finnish, Old Prussian, Middle High German, Old Albanian, Syriac/Aramaic, and Hebrew". Progress comes from approaching the text with the goal of understanding the text as itself. It isn't useful to say that the text doesn't work like Latin or German or whatever. It is useful to describe how the text works in its own terms.

Linguistic research is not about guessing what language the text might be and giving it a go. It's about observing the patterns and peculiarities and appreciating that they are the imprint of a natural system. It's not about knowing there's a language out there which might "fit" the text, but having a model on which to arrange the evidence and produce ideas for further research.

I think you should quit trying to decipher the Voynich manuscript. I also think you should start trying to understand it.
Aww man, I guess this means you won't believe my latest theory identifying <okorory> as the Hebrew word for "banana" when I read it in bigrams as a vowelless abjad...I was hoping to post about it around the beginning of next month :)

I agree that it is quite worthwhile to investigate the linguistic structure of the text in its own right, as your blog does. I hope to learn a lot from reading more of your material there.

But I don't think that it's useless or worthless to consider and test hypotheses about possible underlying languages of the ms text. In the classic case of Linear B, yes, Ventris and others studied the structure of the Linear B tablets and language in its own right, as itself, just as your blog does. Clearly this work was essential. But Ventris also entertained and considered and tested hypotheses about possible languages. At the time, the prevailing consensus of researchers was that the text of the Linear B tablets could *not* represent any form of Greek! Ventris actually agreed with this, and he began the most critical stage of his deciphering with what he first thought was a routine check to confirm that certain affixes could *not* be Greek. Only when these affixes, to his own surprise, unexpectedly corresponded closely with Greek, did he find the scent of the trail that led him and Chadwick to decipher Linear B as the archaic form of Greek that we now know as Mycenaean Greek.

So I believe that the discussion of specific language hypotheses, and arguments for and against them, may likewise lead to progress on the Voynich ms. Of course all attempts will fail until the one that succeeds, if it ever does.

Of course I did not consider all of the language theories in my long list with equal seriousness. I was poking fun at myself by including the ones like Elfdalian, Gutamal, Orsamal, etc., that's all. I hope we are all able to poke fun at ourselves, and criticize ourselves, first and foremost in our critical analysis of this fascinating but very difficult Voynich ms text.
The point about Ventris trying Greek out on Linear B is fair. I only wish we knew as much about the Voynich text as he did about Linear B. He and other researchers (principally Kober) had gleaned so much information about words and the script that it was obvious whether an attempted solution was good or bad.

We're lacking so much of that knowledge with the Voynich text. Especially as the word structure suggests that even a "good" solution might look pretty weird. (I suppose Mycenaean Greek looked pretty weird too at first.) It wouldn't surprise me if we found a solution which could read part of the text convincingly (labels, for example) but still left other parts of the text highly obscure.

For me, at least right now, a "bad" solution is not strictly one with a bad output but with a poorly reasoned method. Well reasoned gibberish is more appealing than obscurantist Shakespeare. We will learn something from the former; the latter only proves a distraction.
Good points Emma. Wow, "obscurantist Shakespeare". I suppose this means you wouldn't be too impressed by my erstwhile efforts, unfortunately based on overly ambiguous possible letter values for characters, to decipher a paragraph of the botanical section text as an Aramaic translation of an Aristotle quote on plants? <Sigh> This is what happens when [sh] can be taken as a ligature of two characters, in either order, each of which could represent any of 3 or 4 related letters. I assure you it was a most poetic translation. Perhaps more poetic than Aramaic, but that is another issue. I won't even mention the bit about Pseudo-Dionysius the Areopagite... By contrast, my more recent efforts with Hebrew bigrams produce worse output, but with a much stricter method.

Yes, realistically the knowledge of scholars and researchers about Linear B vastly exceeds our knowledge of the Voynich ms. As I understand it, the biggest difference is that Linear B was more or less governmental record-keeping, the equivalent of census records, inventories, and the like. Even before they came close to deciphering it, in some sense they were still able to relate it to real-world material content. They were even able to distinguish masculine and feminine suffixes based on words next to depictions of men and women! Later they deciphered numbers in a similar manner. The point is, the people who *wrote* the Linear B tablets were trying to be clear rather than obscure, for their own record-keeping purposes. 

We cannot necessarily say the same about the person or people who wrote the Voynich manuscript. Most likely they wanted the text to be obscure to the uninitiated, at the very least. Well, they certainly did a very good job of achieving that! How much actual meaning the text could possibly have had for the author(s), or for the initiated, and how many clues they may have left us with all of the illustrations and some of the labels...well, it is just tantalizingly enough to perhaps leave us with the hope of making sense of some of it...perhaps.
Up until very recently I was convinced the text must have some meaning. My argument would focus on the sheer quantity of it, the extra work of adding yet another text-only section at the end, and the layout indications that the text belongs with the imagery.

I'm starting to believe, though, that we don't really have any strong arguments for either option, just a whole bunch of questions we're unable to answer.
In order to address the case for glossolalia, I think it's important to consider basic statistical features.
E.g.:
Does glossolalia follow Zipf's law?
Is the exact consecutive repetition of the same word frequent?

If "the syllables are not organized into words" I guess that the answer to both questions is "no", but examining actual corpora would be more informative.
I think the answers to these questions depend upon how you define "words". When the linguistic analyst stated that "the syllables are not organized into words", that simply means that there are no clear distinct units larger than the syllables. But that does not mean that one cannot consider and analyze arbitrary strings of two or more syllables in glossolalic speech.
For example, if a glossolalia speaker uttered the syllables "...arlamaynunabalinabalinabalikamarila..." in the course of their speech, linguistic analysis might not be able to determine any particular organization of this speech into definable words. But we can still identify the consecutive repetition of syllables "...nabalinabalinabali..." within the speech. We don't have to call "nabali" a "word", we could call it a vord, or just a sequence of three syllables. But whatever we call it, it is repeated three consecutive times in this speech. (To be clear, I just invented this example.) 
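Both kinds of repetition can be measured without first deciding what counts as a "word". The sketch below is one possible approach, using a regular expression with a backreference to find any unit of two or more characters that is immediately repeated in an unsegmented stream, plus a simple count of exact word repeats for text that does have word breaks. The function names are invented for this example.

```python
import re

def consecutive_repeats(stream, min_unit=2):
    """Find units of at least min_unit characters repeated back to back.

    Returns a list of (unit, number_of_consecutive_copies).
    """
    pattern = re.compile(r"(.{%d,}?)\1+" % min_unit)
    return [(m.group(1), len(m.group(0)) // len(m.group(1)))
            for m in pattern.finditer(stream)]

def repeated_words(words):
    """Count positions where a word is immediately followed by itself."""
    return sum(1 for a, b in zip(words, words[1:]) if a == b)

stream = "arlamaynunabalinabalinabalikamarila"  # the invented example above
print(consecutive_repeats(stream))  # [('nabali', 3)]
print(repeated_words("qokeedy qokeedy qokeedy".split()))  # 2
```

Comparing these counts across a glossolalia transcript, a natural-language text, and a Voynich transliteration of similar length would give a first numerical answer to the repetition question.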
My instinct is that glossolalia is likely to be far more repetitive than natural language. We would have to look up the details in the Samarin book to confirm or refute this.
If a person wrote in glossolalia, they would most probably try to indicate some kind of word breaks in their writing, because glossolalia speakers believe that they are speaking an actual language.
Regarding Zipf's law, we need to consider studies such as Wentian Li (1992) "Random Texts Exhibit Zipf's-Law-Like Word Frequency Distribution" (reference 11 in the Wikipedia article on "Zipf's law"). Quoting the article, "in a document in which each character has been chosen randomly from a uniform distribution of all letters (plus a space character), the 'words' follow the general trend of Zipf's law." 
Thus, while it may be useful in some linguistic analysis of a known natural language text to consider the likelihood of the Zipfian distribution of the words in it, the converse does not necessarily hold: the mere existence of a Zipf's law distribution of the words in a text of unknown character is not necessarily indicative of an underlying natural language in the text.
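The random-text result is easy to try for oneself. The sketch below is only an illustration of the idea and not a reproduction of Li's analysis. It draws characters uniformly from a small alphabet plus a space, splits on spaces, and fits a straight line to the log-log rank-frequency curve. The 4-letter alphabet, the text length, and the seed are arbitrary assumptions.

```python
import math
import random
from collections import Counter

def random_text(n_chars, alphabet="abcd", seed=0):
    """Draw each character uniformly from the alphabet plus a space."""
    rng = random.Random(seed)
    symbols = alphabet + " "
    return "".join(rng.choice(symbols) for _ in range(n_chars))

def rank_frequency(text):
    """Word counts sorted from most to least frequent."""
    counts = Counter(text.split())
    return sorted(counts.values(), reverse=True)

def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

freqs = rank_frequency(random_text(50_000))
slope = loglog_slope(freqs)
print(len(freqs), "distinct 'words'; top counts:", freqs[:5])
print("log-log slope:", round(slope, 2))
```

A clearly negative, roughly constant slope on the log-log plot is the Zipf-like trend. Seeing it emerge from pure noise is the reason a Zipfian curve alone cannot settle the question either way.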
Hi geoffreycaveney (and a belated welcome to the forum!),

Of course, I would like the Voynich text to have a meaning which could be recovered. I accept the idea that it may just be gibberish, although I hope it isn't.

I hope I don't sound like I'm splitting hairs here, but as far as I know "glossolalia" is a relatively recent phenomenon: it seems to be a product of German and English prayer groups in the mid-1800s, popularized by Pentecostal churches in the early 1900s. In fact, the term "glossolalia" doesn't appear in any dictionaries before 1879. As a result, I am reluctant to apply this term to the Voynich text.

For the Middle Ages, there are certainly instances of invented languages: Hildegard von Bingen's Lingua Ignota is perhaps the most famous example. However, even the Lingua Ignota is somewhat rooted in Latin, and certainly the sentences are structured in ways that are inspired by, and often mixed with, proper Latin.
Strings like your "...arlamaynunabalinabalinabalikamarila..." are characteristic of modern Christian glossolalia and are very different from the structured Lingua Ignota.

Although modern glossolalia is often described as synonymous with "speaking in tongues", earlier understandings of speaking in tongues were rather different: in the Bible, speaking in tongues means that the apostles were able to be understood by foreigners, like an instant translation. "Speaking in tongues" made language clearer to their audience, not more obscure.
Admittedly, I am not an expert in the history of linguistics, but I think it is always important to keep in mind the ways in which things were understood at the time the Voynich was written.