The Voynich Ninja
Statistical comparision of Voynich transliteration with common alphabets - Printable Version


Statistical comparision of Voynich transliteration with common alphabets - davidjackson - 12-11-2020

This is something that has worried me for years: why do people seem to assume that Voynichese glyphs can be mapped to common European alphabets and any extraneous glyphs that don't fit can be dropped?
Here's an example from An Application of Data Mining And Frequency Analyses to Determine Source Languages of the Voynich Manuscript:

Quote:Frequencies of letters in the manuscript and for a test language were imported to an Excel spreadsheet in descending order. The length of the second language’s alphabet determined the amount of most-frequent Voychinese characters used for this test.

In this case, the source transliteration alphabet was v101; this means that only the 26 or so most frequent glyphs were used and the rest discarded. Here's the basic v101 character set, pinched with thanks from [link]:
[Image: v101a_RZ.gif]

You can clearly see how many glyphs will have been left out of their analysis. It makes a mockery of the whole thing. They have chosen the 26 most frequently used glyphs in this transcription alphabet and ignored the rest.
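To make the objection concrete, here is a minimal sketch of the rank-pairing procedure as the quote describes it: sort both frequency tables in descending order, then keep only as many glyphs as the test language has letters. The function name `rank_pair` and the toy counts are assumptions for illustration; this is not the paper's actual spreadsheet or data.

```python
from collections import Counter

def rank_pair(glyph_counts, letter_counts):
    """Pair glyphs with letters by frequency rank.

    The glyph list is truncated to the length of the target alphabet,
    so every glyph past that rank is simply dropped.
    Ties are broken alphabetically, which is itself an arbitrary choice.
    """
    by_rank = lambda kv: (-kv[1], kv[0])
    glyphs = [g for g, _ in sorted(glyph_counts.items(), key=by_rank)]
    letters = [c for c, _ in sorted(letter_counts.items(), key=by_rank)]
    n = len(letters)
    pairs = list(zip(glyphs[:n], letters))
    discarded = glyphs[n:]
    return pairs, discarded

# Toy data only: 8 distinct "glyphs" against a 4-letter "alphabet".
glyph_counts = Counter("ooooo99999ccccaaa1188hk")
letter_counts = Counter("eeeetttaao")

pairs, discarded = rank_pair(glyph_counts, letter_counts)
print(pairs)      # glyph/letter pairs matched by rank
print(discarded)  # glyphs that never enter the comparison
```

Note that nothing in the procedure looks at what the discarded glyphs are or how often they occur; the cutoff is set entirely by the other alphabet's length.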

Now, why would anyone who understood the VM continue with such an attempt?


RE: Statistical comparision of Voynich transliteration with common alphabets - RobGea - 12-11-2020

These folks have appeared before; they are just students having a go.


RE: Statistical comparision of Voynich transliteration with common alphabets - ReneZ - 12-11-2020

It's not so simple... (ever).

If one discards symbols whose combined occurrence in the text is less than 0.5%, one is making an error, but probably not a significant one.
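A quantitative check along these lines could look like the sketch below. It computes the combined share of all glyph occurrences that fall outside the top `keep` ranks and compares it with the 0.5% figure mentioned above. The counts are invented for illustration and `discarded_share` is a hypothetical helper, not part of any existing tool.

```python
def discarded_share(counts, keep):
    """Fraction of all glyph occurrences lost when only the `keep`
    most frequent glyphs are retained."""
    total = sum(counts.values())
    ranked = sorted(counts.values(), reverse=True)
    return sum(ranked[keep:]) / total

# Invented counts, for illustration only.
counts = {"o": 500, "9": 300, "c": 150, "a": 46, "x": 3, "z": 1}

THRESHOLD = 0.005  # the 0.5% figure from the post above

for keep in (3, 4):
    share = discarded_share(counts, keep)
    verdict = "probably negligible" if share < THRESHOLD else "too much to ignore"
    print(f"keep={keep}: discarded share = {share:.2%} ({verdict})")
```

The point of doing it this way round is that the cutoff follows from the data: one picks the threshold first and lets it decide how many glyphs survive, rather than fixing the number at 26 in advance.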

In reality, people are doing these things arbitrarily, and not based on such quantitative considerations.


RE: Statistical comparision of Voynich transliteration with common alphabets - Koen G - 12-11-2020

It really depends. Some statistics are so broad that rare glyphs don't really have an impact.

Moreover, one thing we forget is that standard manuscripts also contain tons of glyphs we don't take into account. So in a way, comparing every single Voynichese glyph to the standard alphabet is comparing apples and oranges.


RE: Statistical comparision of Voynich transliteration with common alphabets - davidjackson - 12-11-2020

(12-11-2020, 07:26 PM)ReneZ Wrote: In reality, people are doing these things arbitrarily, and not based on such quantitative considerations.
That's the very word I was searching for!


RE: Statistical comparision of Voynich transliteration with common alphabets - -JKP- - 12-11-2020

That's the first time I've seen the v101 character assignments. I find them quite weird. They don't seem to bear any relationship to how things were done in the Middle Ages.

Also, as a computer programmer, I guess I look at things differently. If you assign things in certain ways, they are flexible, they can be aggregated or taken apart. If you assign things in other ways, they are rigid, and statistical studies are more complicated or more difficult. The V101 assignments do not strike me as being very flexible.

There's a good reason why simple ones and zeroes work well in computing environments. Some of those concepts also apply to creating transcriptions that are not too deeply anchored in a specific system, and that can be generalized and shaken and baked in a variety of ways.
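One possible reading of "flexible" here, sketched under my own assumptions: store the transcription at the finest granularity you can defend, and apply any grouping as a separate, swappable table. The tokens and the `regroup` helper below are made up for illustration and do not correspond to v101 or any other published scheme.

```python
def regroup(tokens, merges):
    """Replace runs of fine-grained tokens with coarser symbols.

    `merges` maps tuples of tokens to a single symbol. At each position
    the longest matching run wins; unmatched tokens pass through unchanged.
    """
    longest = max((len(key) for key in merges), default=1)
    out, i = [], 0
    while i < len(tokens):
        for size in range(min(longest, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + size])
            if key in merges:
                out.append(merges[key])
                i += size
                break
        else:
            # No merge rule matched at this position.
            out.append(tokens[i])
            i += 1
    return out

# Hypothetical fine-grained tokens for one word.
word = ["c", "h", "e", "d", "y"]

print(regroup(word, {}))                  # fine-grained reading
print(regroup(word, {("c", "h"): "CH"}))  # same data, two strokes read as one glyph
```

Going from fine to coarse is a table lookup; going the other way requires returning to the page images, which is why a transcription that bakes in coarse decisions is harder to take apart afterwards.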

That doesn't mean that computational attacks are the only way to look at the VMS, but some of the concepts that apply to computers apply to the human brain, as well. Some systems are easier to shuffle in the brain than others and if the assignments are too rigid, it can hinder that process.


RE: Statistical comparision of Voynich transliteration with common alphabets - ReneZ - 13-11-2020

JKP, note that the table shown by David is only half the v101 definition. There's another table for high-ascii assignments.


RE: Statistical comparision of Voynich transliteration with common alphabets - -JKP- - 13-11-2020

Thanks for that info, Rene.

I guess one of my main problems with the character assignments is that they are so un-medieval.

• For example, there is no "j" in medieval alphabets. When it looks like a "j" it is either a terminal-i (with a tail) or an embellished "i" (as in Iohannes). The "j" began gradually appearing during the Renaissance but was not typically used until long after the creation of the VMS. When there are "pen tests" and sample alphabets in the margins or foreleaves of manuscripts, they do not include "j". Maybe the VMS uses two different shapes for the same character, but that would reduce entropy even further.

• The same general idea applies to "u" and "v". There were two basic shapes, and perhaps there are symbols in the VMS in which more than one shape represents the same thing, but there was no specific sound distinction between u and v in medieval script. They were used interchangeably by most scribes. The two were eventually differentiated, but not until later, when the printing press more-or-less began to standardize alphabets and letter usage.

• The other problem is there is no overt acknowledgment of possible ligatures (maybe I shouldn't say this until I've looked at the rest of the character set, but I don't see it in the chart posted here). Ligatures were as much a part of the medieval alphabet as individual letters. It's almost impossible to find manuscripts that don't have them. Some manuscripts have multiple ligatures in each sentence. That does not mean the VMS has ligatures, there's no guarantee, but... the possibility that they may exist has to be acknowledged in any character set intended to represent characters of medieval origin.
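The entropy remark in the first bullet can be illustrated with a small sketch: if two shapes are really variants of one character, folding them together can only lower (or leave unchanged) the single-symbol Shannon entropy. The sample string and the variant table are toy assumptions, and this measures only first-order entropy, not the conditional entropy that Voynich discussions often focus on.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """First-order Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def merge_variants(symbols, variants):
    """Map each variant shape onto its assumed base character."""
    return [variants.get(s, s) for s in symbols]

# Toy text that uses both "u" and "v" shapes.
text = list("iuvenis vivit in uilla")

# Assumed equivalences, in the spirit of the medieval usage described above.
merged = merge_variants(text, {"v": "u", "j": "i"})

print(f"shapes kept distinct: {shannon_entropy(text):.3f} bits/symbol")
print(f"variants merged:      {shannon_entropy(merged):.3f} bits/symbol")
```

The same function could be run with ligatures split or joined, which is one way to see how much a statistic depends on the character set chosen before any counting starts.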