The Voynich Ninja

Experiments with mapping
When I first started juxtaposing the frequency table of Voynich glyphs with the frequency tables of selected medieval European languages, an experiment naturally suggested itself. This was to take a random page from the Voynich manuscript and map glyphs to letters, one by one from the top of the frequency table to the bottom. This might produce a few recognisable words; or it might not.
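
In Python terms, the core of that rank-for-rank substitution might be sketched as follows. The Italian letter ordering here is only an illustrative placeholder, and the ranks are computed from the short fragment itself rather than from the full frequency tables used in the experiments.

from collections import Counter

def rank_map(source_text, target_letters_by_rank):
    """Map each glyph to the target letter of the same frequency rank."""
    # Rank the glyphs of the source text by descending frequency.
    glyph_counts = Counter(ch for ch in source_text if not ch.isspace())
    glyphs_by_rank = [g for g, _ in glyph_counts.most_common()]
    # Pair glyphs and letters rank for rank; surplus glyphs stay unmapped.
    mapping = dict(zip(glyphs_by_rank, target_letters_by_rank))
    return "".join(mapping.get(ch, ch) for ch in source_text)

# Illustrative only: a short v101-style fragment and a rough Italian ordering.
fragment = "goeccoe 1oe9 kop"
italian_by_rank = list("eaionlrtcsdumpvg")
print(rank_map(fragment, italian_by_rank))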

I did indeed try a few experiments of this nature, starting with the v101 transliteration as the source document and with medieval Italian (as per the OVI corpus) as the destination language. I did the mapping in Microsoft Excel, which has a convenient “find-and-replace” function. Since Excel's find-and-replace treats upper and lower case as the same by default, I first replaced all upper-case keyboard assignments in v101 with similar lower-case Unicode characters: for example, v101-K became ǩ and v101-W became ŵ.

I also made the decision to replace all occurrences of v101-4o (which I believe is a single glyph) with the Unicode character ④.
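
For anyone who would rather script this step than do it in Excel, the same normalisation might look like this in Python. Only the substitutions mentioned above are included; the full set of v101 upper-case assignments would have to be added.

# Give upper-case v101 keyboard assignments distinct lower-case Unicode
# stand-ins (so a case-insensitive find-and-replace cannot conflate them),
# and treat v101-4o as a single glyph, ④.  Only the pairs mentioned in
# this post are shown; the real table is longer.
V101_SUBSTITUTIONS = [
    ("K", "ǩ"),   # v101-K
    ("W", "ŵ"),   # v101-W
    ("4o", "④"),  # v101-4o, taken here to be a single glyph
]

def normalise_v101(text):
    for old, new in V101_SUBSTITUTIONS:
        text = text.replace(old, new)
    return text

print(normalise_v101("4oKoe 4oW"))  # -> ④ǩoe ④ŵ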

Not unexpectedly, these experiments did not yield recognisable words. For example, on the randomly selected page f090r1, the first two lines:
  • goeccoe ④hcoe ④ŵ1o8 1oe9 ǩop / 92oe koy 2coy ④k1oy ④h9 8ayaea
mapped to:
  • BEROOER CTOER CŵNEL NERA QUEF / APER DES POES CDNES CTA LISIRI.
(In this instance, the v101 glyph ŵ could not be mapped at all, since this glyph is relatively rare, and the medieval Italian alphabet had only 33 letters including accented vowels. So I ran out of Italian letters before I ran out of Voynich glyphs.)

However, later on when I became aware of the Sukhotin algorithm, it made sense to try a variant of this approach, with vowels distinguished from consonants. With the v101 transliteration, the Sukhotin algorithm (as implemented by Dr Mans Hulden’s Python code) identifies the following v101 glyphs as the most probable vowels (in descending order of probability):
  • o, a, 9, c, ④, C.
(There are six vowels here, but we might need to conjecture that C could be a double c.)
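
For readers unfamiliar with it, Sukhotin's algorithm is short enough to sketch in full. The version below follows the usual formulation (adjacency counts within words, then repeated extraction of the symbol with the largest positive score); it is a rough equivalent of, not a copy of, Dr Hulden's code.

from collections import defaultdict

def sukhotin_vowels(words):
    """Return the symbols classified as vowels by Sukhotin's algorithm."""
    # Count adjacencies between distinct symbols within each word.
    adj = defaultdict(lambda: defaultdict(int))
    for word in words:
        for a, b in zip(word, word[1:]):
            if a != b:
                adj[a][b] += 1
                adj[b][a] += 1
    # Every symbol starts as a consonant; its score is its total adjacency count.
    scores = {s: sum(adj[s].values()) for s in adj}
    vowels = []
    while scores:
        best = max(scores, key=scores.get)
        if scores[best] <= 0:
            break
        vowels.append(best)            # reclassify the top scorer as a vowel
        del scores[best]
        # Penalise adjacency to the new vowel for the remaining candidates.
        for other in scores:
            scores[other] -= 2 * adj[other][best]
    return vowels

# Illustrative run on two English words; on real input, pass the v101 "words".
print(sukhotin_vowels(["banana", "cabbage"]))  # -> ['a', 'e']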

That enabled me to construct another juxtaposition of frequency tables, as follows:

[attachment=8014]

This in turn permitted another series of experiments with mappings, on which I will report in another post.

Having juxtaposed the frequency table of v101 glyphs with the frequency tables of selected medieval European languages, and having first grouped the glyphs into probable vowels and probable consonants on the basis of the Sukhotin algorithm, I randomly selected a page from the v101 transliteration for a test mapping.

For this test, I used page f029v as a source document, and again tried medieval Italian (as per the OVI corpus) as the destination language.
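
The shape of the constrained mapping is simple: the most frequent Voynich vowel goes to the most frequent Italian vowel, the most frequent Voynich consonant to the most frequent Italian consonant, and so on down each table. A sketch follows, in which every ordering is an illustrative placeholder rather than the tables actually used.

def class_rank_map(glyph_vowels, glyph_consonants, letter_vowels, letter_consonants):
    """Pair glyphs with letters rank for rank, but only within the same class."""
    mapping = dict(zip(glyph_vowels, letter_vowels))
    mapping.update(zip(glyph_consonants, letter_consonants))
    return mapping

# Placeholder orderings: the Sukhotin vowels listed earlier, an invented
# consonant ranking, and rough Italian vowel/consonant rankings.
glyph_vowels      = ["o", "a", "9", "c", "④", "C"]
glyph_consonants  = list("18ekh2ysmgp") + ["ǩ"]
letter_vowels     = list("eaiou")        # C is left unmapped: more Voynich
letter_consonants = list("nlrtscdmpvg")  # vowels than Italian vowels

mapping = class_rank_map(glyph_vowels, glyph_consonants, letter_vowels, letter_consonants)
print("".join(mapping.get(ch, "?") for ch in "goeccoe"))  # -> verooer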

This experiment again did not yield recognisable words. With f029v, the first two lines:
• hoom #oy 1ck19 oe aes 29k19 ǩ9#9 1o 29 8am / ④k1cam s 1oe 1oe ǩ9 1c9 ǩoe8 9k1oy 8ay9
mapped to:
• TEEC #ES NODNA ER IRG MADNA QUA#A NE MA LIC / UDNOIC G NER NER QUA NOA QUERL ADNES LISA.
(In this example, the v101 glyph # was too rare to have any equivalent letter in Italian.)

The Sukhotin algorithm certainly imposes more discipline on the mapping, in the sense that in the mapped “words”, vowels alternate with consonants more often than not (which was Sukhotin’s crucial insight). Consequently, the mapped “words” are generally pronounceable. However, with a few exceptions, they are not Italian words.
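
That impression of alternation can be quantified. A quick sketch that scores the fraction of adjacent letter pairs in which a vowel meets a consonant; the sample “words” are taken from the mappings quoted above.

def alternation_rate(words, vowels):
    """Fraction of adjacent letter pairs in which a vowel meets a consonant."""
    alternating = total = 0
    for word in words:
        for a, b in zip(word, word[1:]):
            total += 1
            if (a in vowels) != (b in vowels):
                alternating += 1
    return alternating / total if total else 0.0

# Four mapped "words" quoted above, lower-cased.
sample = ["berooer", "nera", "quef", "nodna"]
print(alternation_rate(sample, set("aeiou")))  # -> 0.75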

The next step was to incorporate Prescott Currier’s seminal insight that, if the Voynich manuscript has natural languages as precursors, it has more than one. I started therefore with Currier’s division of the Voynich pages into Language A and Language B (recognising that Currier did not attempt to assign a language to every page in the manuscript).

With the v101 transliteration, the Sukhotin algorithm (as implemented by Dr Mans Hulden’s Python code) identifies the following v101 glyphs as the most probable vowels (in each case, in descending order of probability):
• Language A: o, a, 9, ④, c, A
• Language B: c, a, o, 9, ④, C

This dichotomy yields another juxtaposition of frequency tables, in which the A pages have a different frequency distribution from the B pages, as follows:

[attachment=8016]

Consequently, for any selected precursor language, the glyphs on the A pages will map to letters differently from the glyphs on the B pages.
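
In outline, the A/B treatment is just the same pipeline run twice over disjoint sets of pages. A sketch, assuming the sukhotin_vowels function from the earlier sketch is in scope, and using placeholder page texts and Currier assignments (recalling that Currier did not assign every page):

def vowels_by_currier_language(pages, currier, classify):
    """Run a vowel classifier separately over the Currier A and B pages."""
    grouped = {"A": [], "B": []}
    for folio, text in pages.items():
        lang = currier.get(folio)        # pages without an assignment are skipped
        if lang in grouped:
            grouped[lang].extend(text.split())
    return {lang: classify(words) for lang, words in grouped.items()}

# Placeholder inputs; classify is the sukhotin_vowels sketch from above.
pages   = {"some_A_page": "goeccoe 1oe9 kop", "some_B_page": "hoom 1ck19 oe aes"}
currier = {"some_A_page": "A", "some_B_page": "B"}
print(vowels_by_currier_language(pages, currier, sukhotin_vowels))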

I am working on a series of experiments with A and B mappings, on which I will report in another post.