The Voynich Ninja

Full Version: Elephant in the Room Solution Considerations
Regarding the text of the Voynich manuscript, I am at this point inclined to believe the following as a tentative point of view, subject, of course, to further evaluation as scholars learn new reliable information about the manuscript.

Although it may still appear incompletely interpreted, I find the marginalia on the last page of the manuscript, thanks to the efforts of many, especially Koen G. (even though he would likely, and humbly, disagree), to be a sort of Rosetta stone, the tip of an iceberg, revealing that at some point someone treated the manuscript as being, first, readable (so, not gibberish) and, second, comprised of Latin and some German material. Sometimes getting into details leads us to lose sight of an elephant in the room, or at least a reliable part of it.

That marginalia sounds like a reliable steppingstone to me. Not sufficient, but given that Voynichese, some Latin, and some German were shared in the marginalia in seemingly meaningful ways (for its scribe), along with some still-unknown other words, I think it provides a rather fitting window onto the nature of the language being used in the manuscript.

Even Koen G.’s interview with Dr. Katherine Hindley provides some clues as to the lengths people could go to at that time in “privatizing” their healing or medicinal texts, whether to protect the public, for privacy, or even as assurance for an expert supplying the text, so that he would be paid before handing over the key for deciphering the medicinal or astrological prescriptions he was offering (see the interview, around 36:40, for interesting points shared by Dr. Hindley).

My reading of the “word” beside the Pleiades image as Botrus is also reliable enough for me, as explained before, not only in terms of its letters (except for that ‘a’, which for me is itself a signifier of a possible ciphering effort at work in the VM), but also in its correspondence with the image beside it. It also supports the first observation regarding the possible use of Latin in the text.

I am inclined, though less tentatively, to consider the reading of the significant “daiin” as Batin, likely borrowed from the astrological sources or experts the author was consulting, to be noteworthy and fitting to the nature of the material in the manuscript. It also points to the possibility that, even when Latin abbreviations are used, words from other languages or their local dialects may be present in the text, as another playful ciphering and privatizing strategy.

Admittedly, the above are NOT sufficient, but please consider also the following.

In my view, the text is likely a natural language employing shorthand and abbreviation systems similar in principle to the way Latin abbreviations and contractions were used at the time. However, this does not mean it uses entirely standard abbreviation symbols and systems. It may have partially used some symbols (such as 9 for -us or 2 for R, etc.), even then modifying their significations and meanings privately for the author’s purposes. But the author could also have invented her own abbreviation systems for reasons of privacy.

She could also have used simple and easily readable ciphering techniques, including: omission of redundant words (hence, texts sounding like ‘word salad’); baselining superscripts, or rendering them differently as part of baseline letters; substitution or hiding of letters following consistent triggers (such as the example I have shared for Botrus or Batin, where a phonetic ‘t’ may be hidden under an ‘a’ beside a vowel); using multiple languages known to her, or found in books she read or in consultations she made with astrological or medicinal assistants (such as the example of using “Batin” for inner/essence/occultation); reduplications of words that can at times be easily ignored; or even intentionally employed ‘gibberish’ (but only for ciphering reasons) amid valid and meaningful text.

We do not have to abide by an either/or logic to consider whether the text is meaningful or gibberish. Gibberish text, even reduplications, could have been used to distract unfriendly others, and she could have just ignored gibberish material inserted amid valid text for her private use of the book.

I am not inclined to believe that all reduplications are an invalid part of the VM language. They could be chants, words repeated for emphasis, or repetitions made necessary by the topic being discussed. If a few paragraphs discuss, say, plant roots or baths at length, it would be normal to see an associated word repeated many times. Adjacent reduplications could also result from a sentence beginning with the word that the previous sentence ended with. Since we don’t have punctuation, and can’t read the text, we cannot yet dismiss reduplications as invalid expressions, though some may be employed as a ciphering strategy.

The challenge in reading the text, for the experts among you (not including me, of course, as far as linguistics or statistics go), is taking into account that privacy-concerned, not publicly intended, nature and usefulness of the handbook for her. If “words” are highly shorthanded and abbreviated, with some letters standing for words, words for sentences, etc., and with ciphering rules applied, then I am not sure how statistical evaluations can reliably be used to decipher linguistic patterns in the text. We are, of course, also facing a text where some material is not legible, perhaps inked over incorrectly at times, and where many other pages are missing, which may have included keys for reading the text. I think the diversity of transcriptions that have come into use over the decades has at times caused new problems in the study of the VM, almost as if folks have been studying the transcriptions, not the Voynich manuscript itself. The loss of visual data in such studies has, in my view, been detrimental, leading to lost tennis games, so to speak.

Given the long text, it would not be practical to assume that any super-complicated and time-consuming ciphering techniques were used in the Voynich manuscript, and the author did not need them, since an extremely personalized, idiosyncratic, and non-standard shorthand and abbreviation system could provide her a strong measure of protection for her practical use of the handbook. And apparently it has worked for centuries, if that was how she did it.

Moreover, I am not inclined to regard the text as being entirely technical astrological data. Again, there is no need to use an either/or logic here. Some of the text may well be technical astrological information where needed, but this does not mean everything is so.

Just because letters such as o and c are used for technical astrological significations does not mean they can’t also be simple letters used for rendering text meaningful to her. Double c’s, with or without diacritics, were a common feature of the Latin abbreviations in use, and if she personally re-signified them to mean something else she intended to say, then four c’s in a row, two with diacritics and two without, standing for contractions we don’t yet know, would not be unthinkable.

Also, if the first page of the manuscript is indeed an intended first page, it does not make sense for astrological data to be given there, when no such material has yet commenced. Sweeping applications of either/or logic may not prove helpful in this case either.

I am at this point non-committal as to the overall language of the text, but based on the partially read marginalia on the last page, as noted above, and the reading of Botrus, and possibly Batin, I am inclined to believe that mainly Latin (highly shorthanded and abbreviated, in an idiosyncratic way) is guiding its writing, but with at least some German and other local dialects, or words borrowed from other languages, also present.

I will postpone further judgments about the language of the manuscript until we may be able to narrow down its authorship to specific person(s), hypothetically speaking. I think that may provide us a back-door way of narrowing down the language possibilities, even if we still end up not being able to decipher the text.
What was the final proposal for the word ydaraishy (EVA)? 
I lost my bearings with messages as long as novels.
(15-01-2026, 07:39 PM)Ruby Novacna Wrote: What was the final proposal for the word ydaraishy (EVA)?
I lost my bearings with messages as long as novels.

[attachment=13463]

@ Ruby Novacna. Sorry again for the confusion, partly caused by the font display I had intended to render, not yet knowing how this forum’s software handles Voynich font display. The image of the original "word" is shown above.

I was NOT offering any interpretive proposal for those example words. I was using them to show how the visual features of the words as found in the manuscript get lost in transliteration. So, a double c with a diacritic is transliterated in a way that loses its double-c visual feature.

In the word ydaraiShy, for example, what I would roughly transliterate as c-^-c (let us say, two c’s connected on top by a dash and carrying a diacritic) is rendered as Sh in EVA. Even v101, which renders the word as 98ayai29, renders c-^-c as just the number 2.

What I have been surprised to see is that some have read the first c and the diacritic as standing for a letter on its own (which is why in EVA we have ended up with an S followed by an h, which is not even seen as a c), when in the original it is clearly a double c with a diacritic on top. This is something that v101 has obviously and correctly acknowledged, giving the entire expression one transliteration as its table number 2, also shown in the image below:

[attachment=13465]

So, I was saying that if visuals are lost in the transliteration, then the expression ic-^-c (which in EVA is iSh and in v101 is i2), even if it could, hypothetically speaking, be a Latin abbreviation found in Cappelli for “Iurisconsultus Collegiatus” (enter “icc” in the Character Input box in Cappelli online, as also shown in the image below), would not even be fetchable as such, since we have lost the visuals in the transliteration. “It is lost in transliteration” (to borrow and modify the expression “lost in translation”).

[attachment=13466]

I think the introduction of transliterations for the study of the Voynich manuscript, despite other benefits they may have had for linguists and statisticians, has not helped us study the VM writings in a way that takes their visual characteristics seriously into account. The studies have been studies of transliterations, not of the original text of the manuscript as visually rendered.

I was not in any way suggesting that I think ic-^-c stands for that expansion in Latin. I was using it to illustrate what can be missed when transliterations are used that have lost a sense of the visuals in the original manuscript.
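To make the “lost in transliteration” point concrete, here is a small hypothetical sketch: the same visually described glyph group maps to different symbols in EVA and v101, and a lookup keyed on the visual form (as a Cappelli-style search would be) cannot be recovered from either rendering. The glyph names, mappings, and the index below are illustrative assumptions, not a real transliteration table or a real Cappelli entry.

```python
# Hypothetical illustration only: none of these mappings are an endorsed
# reading of the manuscript.
visual_form = "i c-^-c"  # i followed by a ligatured double c with a diacritic

# How two transliteration systems (as discussed above) might render it:
transliterations = {
    "EVA": "iSh",   # the double c + diacritic split into S and h
    "v101": "i2",   # the whole group collapsed into the single symbol 2
}

# A lookup table keyed on visual forms, the way a Cappelli-style
# abbreviation search is keyed on what the glyphs look like.
visual_index = {"i c-^-c": "(some hypothetical abbreviation expansion)"}

# The visual form is findable; the transliterated forms are not.
print(visual_form in visual_index)                    # the original key works
for system, symbols in transliterations.items():
    print(system, symbols, symbols in visual_index)   # neither rendering matches
```

The point of the sketch is only that once the transliteration discards the visual shape, no amount of searching on the transliterated symbols can recover a lookup that was keyed on that shape.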
Thank you for your summary. 
Personally, I don't think that transliteration could have prevented the examination of the text.
@ Ruby Novacna. You are welcome. 

I think it does interfere significantly when it comes to assumptions made in statistical and linguistic pattern studies, since these "letters" can stand for much more than they appear to if shorthand abbreviations are used.

Thanks for your question which I hope helped clarify things a bit more.
(15-01-2026, 08:48 PM)Ruby Novacna Wrote: Thank you for your summary.
Personally, I don't think that transliteration could have prevented the examination of the text.

@ Ruby Novacna. Could you please explain or clarify, if you wish to do so, why you think such transliterations “could not have prevented the examination of the text”?

According to the browser (which I think is based on EVA), the double c’s have been rendered separately as ch 10674 times. With a gallows in between, the “benched” renderings give us 74 of one gallows (cfh), 907 of another (ckh), 217 of another (cph), and 945 of another (cth). Yes, the transcription has allowed us to have a browser giving us that information, but this says nothing about the accuracy or validity of even such a representation.

[attachment=13500][attachment=13499][attachment=13498][attachment=13497]

This means that, say, 10674 times statistical analyses have missed seeing a double c, which can itself stand for another word or even a set of words or phrases (or data). The similarity of the two c’s has been ignored, with one seen as a c and the other as an h (as transcriptions, of course, but suggesting they are different letters). Benched gallows have possibly not been treated as a diacritic for a double c.
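For readers curious how such counts are produced, here is a minimal sketch of counting bench and benched-gallows sequences in an EVA-style string. The sample text is invented, not drawn from any real transliteration file, and the patterns assume the standard EVA letters for the four gallows (f, k, p, t); the sketch only shows that any such count is entirely a function of the transcription convention chosen.

```python
import re

def count_benches(eva_text):
    """Count benched gallows (cfh, ckh, cph, cth) and plain ch benches
    in an EVA-style string. Plain ch is counted only where the c/h pair
    is not part of a benched gallows."""
    counts = {}
    for pattern in ("cfh", "ckh", "cph", "cth"):
        counts[pattern] = len(re.findall(pattern, eva_text))
    # c followed by h, but not c-gallows-h (negative lookahead on f/k/p/t + h)
    counts["ch"] = len(re.findall(r"c(?![fkpt]h)h", eva_text))
    return counts

# Invented sample line, for illustration only
sample = "daiin chedy ckhey chol cthy qokeedy"
print(count_benches(sample))  # {'cfh': 0, 'ckh': 1, 'cph': 0, 'cth': 1, 'ch': 2}
```

Change the transcription convention (say, treat ch as a single symbol) and the same page yields entirely different counts, which is the concern raised above.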

I cannot agree with those who think these are just benign ways of rendering the Voynich letters in transliteration. Their very visual significance has been gutted out of them, rendering the transcriptions as NOT representing the actual writings of the Voynich manuscript.

I don’t understand why such transcriptions are not taken out of circulation with a warning, or at least acknowledged as being responsible for a lot of slop analyses (AI assisted or not) that use them. These transliterations are at least partly responsible for the failures of the slops that have been sent to the bucket.

Frankly, the transliterations that have not taken into account the visual characteristics of the Voynich manuscript should also be sent to the bucket, since they have wasted many people’s time.

Currier is better because it took into account the visual unity of double c’s (with or without diacritics), and I think it is much better than EVA. EVA is NOT a reliable transcription. v101 is the best of them all so far. These could perhaps be used to come up with a more visually representative transcription. I would rather use c-c, c-^-c, c->-c, or c-°-c, for example, depending on the kind of diacritic used, if any.

But even these must always be treated with caution, since we are not interested in studying this or that scholar’s transcription system. We are interested in studying the Voynich manuscript’s own writing and the visual features of them are essential for any hope we may have in understanding the text.

So, a 2 in v101 standing for a double c with diacritic cannot be assumed to be just a single sign, but likely an abbreviated contraction for a phrase, itself comprised possibly of one, two, or more words.

How can anyone statistically or linguistically judge a highly shorthanded and abbreviated language (if that is the case) and confidently assert that the language is or is not a natural language, is or is not a hoax, is or is not gibberish, is or is not this or that language? “2” means nothing on its own, since it is just a transcription symbol for something else, an actual piece of the VM “writing.”

I of course appreciate all the efforts that have gone into building such transcription systems. But at some point we need to ask whether the “transcription” method itself is a reliable way of going about studying, let alone making judgments about, the nature of the Voynich manuscript.
(16-01-2026, 07:01 PM)MHTamdgidi_(Behrooz) Wrote: I don’t understand why such transcriptions are not taken out of circulation with a warning, or at least acknowledged as being responsible for a lot of slop analyses (AI assisted or not) that use them. These transliterations are at least partly responsible for the failures of the slops that have been sent to the bucket.

Frankly, the transliterations that have not taken account the visual characteristics of the Voynich manuscript should also be sent to the bucket, since they have wasted many people’s times.

I like the existing transliterations and find them highly useful. If you don't like them you can create a better one. What's the problem?
(16-01-2026, 08:08 PM)oshfdk Wrote: I like the existing transliterations and find them highly useful. If you don't like them you can create a better one. What's the problem?

Transcriptions can be helpful for statistical information about the elements of a writing, whether visually faithful to the original or not. In that sense, I like them too, to the extent that they are not making mistakes in reading the original (I like v101, but not EVA). However, in my view, they cannot be helpful for making linguistic judgments or conclusions about a writing and its meaning.

As I said, I am not interested in transcription systems for the purpose of understanding the meaning of the VM text, so I would not waste my time offering an alternative. I think v101 has done the best job at it so far, even though its symbols are not always visually faithful. Even a more visually faithful alternative based on it, which could be done using v101, would not solve the problem I was referring to.

The language of the VM cannot be researched meaningfully without a close examination of its visual features.
(16-01-2026, 07:01 PM)MHTamdgidi_(Behrooz) Wrote: Currier is better because it did take into account the visual unity of double c’s (with or without diacritic), and I think is much better than the EVA. EVA is NOT a reliable transcription. V101 is the best of them all, so far. They can be used to perhaps come up with a more visually representing transcription. I would rather use c-c or c-^-c for example, or c->-c, or c-°-c, depending on the kind of diacritics used or not, for instance.

That assessment depends on two specific assumptions: (1) that the Author was the same person as the Scribe, and (2) that the alphabet he designed distinguished glyphs by small changes in their shape.  (And also on other assumptions that I am not allowed to mention here.)

By Author I mean the person who devised the script, chose the contents, and determined the sequence of words on each page.  

By Scribe I mean the person who actually put the ink on the vellum to produce the VMS we have now. 

Without those two assumptions, one must assume that the Scribe would not be able to reproduce the precise shapes of the glyphs, so those fine details are just noise with no significant information about the contents of  the book.  And, moreover, that normal random variations of handwriting can deform glyphs so much that they look like other glyphs. 

But then "splitter" encodings that take into account small changes in glyph shape, like VT101, are bad encodings, because they require the transcriber to record that noise and make more arbitrary decisions.  With EVA the transcriber often has to arbitrarily decide whether an ambiguous glyph is an r or an s. With a splitter encoding he must make several such decisions on every glyph, about the length and shape and angle of body and plume and tail... And such encodings are arbitrary anyway: why distinguish only six types of diacritics on a Ch, and not seventeen?

A splitter encoding can also make analysis more cumbersome and less revealing. With EVA, the most common word in the book is daiin, and we can compare its frequency from section to section.  With a splitter encoding that word may become five or six different words, and mere handwriting variations may radically change their frequencies from section to section.

If you do believe in (1) and (2) above, a "splitter" encoding like VT101 can make sense.  If you don't believe (1) and (2), EVA is much better.

In fact, EVA may be too "splitting" itself.  For some analyses, it may be a good idea to first map a and o to the same letter, and r and s, and maybe even Ch and ee.  That may lose some information, but will reduce the noise that comes from handwriting variation and transcriber errors.

As an analogy, consider doing linguistic analysis of English text.  It is usually better to first map the whole text to lowercase and remove any italics and boldface markup.  This pre-processing loses some significant information, such as the difference between the proper name "White" and the common word "white", and the relative emphasis of some words.  But it also removes the noise that comes from capitalization of the first word of sentences; so that, in the sentence "The man sees the dog", words 1 and 4 can be recognized as being the same word.
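The lumping pre-processing described above can be sketched in a few lines. The specific merges (a with o, r with s, Ch with ee) are the ones suggested in this post, applied to an invented sample line rather than any real transliteration; the order and form of the merges are illustrative choices, not a recommended normalization scheme.

```python
from collections import Counter

def lump(word):
    """Illustrative 'lumper' normalization of an EVA-style word:
    merge glyph pairs that handwriting variation may confuse."""
    word = word.replace("ch", "ee")  # treat the Ch bench as ee
    word = word.replace("a", "o")    # merge a with o
    word = word.replace("r", "s")    # merge r with s
    return word

# Invented sample line, for illustration only
line = "daiin chedy chor daiin shedy"
freqs = Counter(lump(w) for w in line.split())
print(freqs.most_common())  # 'doiin' now appears twice
```

After lumping, words that differed only in the merged glyphs collapse into one entry, so frequency comparisons across sections become less sensitive to handwriting variation and transcriber error, at the cost of the information the merge discards.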

All the best, --stolfi
Well, the existence of transliterations does not prevent anyone from looking at the original writing.

Obviously, there is a limit to how much you can learn from doing that.