The Voynich Ninja
Arabic as precursor language - Printable Version



Arabic as precursor language - dfs346 - 16-03-2024

Further to my recent post in another thread, on Arabic as a possible precursor language of the Voynich manuscript, I tested a range of alternative transliterations of the Voynich text, all based on Glen Claston's v101 but differing from v101 in one or more respects. I numbered these transliterations v101④ through v202. (The ④ signifies that in all the transliterations, I treated the v101 glyph pair {4o} as a single glyph, to which I assigned the Unicode symbol ④.)

To compare the Voynich text with the Arabic language, I used Arabic letter frequencies derived from the works of Ibn Kathir (1300-1373).

To test my Voynich transliterations, I started by calculating the statistical correlations between the glyph frequencies and the Arabic letter frequencies. However, when both sequences are short and sorted in descending order, as are Ibn Kathir's Arabic alphabet (which has 43 letters) and the 43 most frequent glyphs in the v101 transliteration (which account for 98.6 percent of the text), it is relatively easy to obtain correlations well in excess of 90 percent. Even substantial differences between transliterations (for example, combining the {2} group of glyphs) result in quite small changes in the frequency correlations.
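The point about descending sequences can be checked with a small simulation. This is only a sketch under stated assumptions: the two frequency lists are randomly generated (Zipf-like weights with noise, which is an invented model, not the Voynich or Arabic data), and `pearson` and `random_ranked_frequencies` are names made up for the illustration. Because both lists are sorted the same way, their correlation is necessarily positive, whatever their origin.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def random_ranked_frequencies(n, rng):
    """Invented model: noisy Zipf-like weights, normalized to sum to 1,
    then sorted in descending order like a ranked frequency table."""
    weights = [rng.random() / rank for rank in range(1, n + 1)]
    total = sum(weights)
    return sorted((w / total for w in weights), reverse=True)

rng = random.Random(0)  # fixed seed so the run is repeatable
a = random_ranked_frequencies(43, rng)  # stands in for 43 glyph frequencies
b = random_ranked_frequencies(43, rng)  # stands in for 43 letter frequencies
r = pearson(a, b)
print(f"correlation between two unrelated ranked lists: {r:.3f}")
```

Running this with different seeds or a different noise model gives a feel for how little a high correlation between two ranked lists says on its own.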

I therefore adopted an alternative metric, namely the average frequency difference. Mathematically, this is the average of the absolute differences between the frequency of each precursor letter and the frequency of the equally ranked Voynich glyph. My idea was that the lowest average frequency difference should represent the best fit between a transliteration and a presumed precursor language.
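The metric described above can be sketched in a few lines. The function name `average_frequency_difference` and the toy numbers are assumptions for illustration only; they are not the frequencies used in the post. The sketch also assumes that, if the two lists differ in length, only the top ranks of the longer one are compared (as with the top 43 glyphs against 43 letters).

```python
def average_frequency_difference(glyph_freqs, letter_freqs):
    """Mean absolute difference between equally ranked frequencies.

    Both inputs are lists of frequencies in percent. Each list is ranked
    in descending order, and rank k of one is compared with rank k of
    the other. Assumption: extra entries in the longer list are ignored.
    """
    glyphs = sorted(glyph_freqs, reverse=True)
    letters = sorted(letter_freqs, reverse=True)
    n = min(len(glyphs), len(letters))
    return sum(abs(g - l) for g, l in zip(glyphs[:n], letters[:n])) / n

# Toy data, invented for the example (percentages).
toy_glyphs = [30, 25, 20, 15, 10]
toy_letters = [28, 26, 22, 14, 10]
result = average_frequency_difference(toy_glyphs, toy_letters)
print(result)  # differences 2, 1, 2, 1, 0 average to 1.2
```

A lower value means the two ranked frequency profiles lie closer together, rank by rank.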

On this metric, I found that the transliteration which I had numbered v171 was the best fit for Ibn Kathir's Arabic alphabet. Apart from the treatment of {4o}, the v171 transliteration has the following differences from v101:
  • m=IN
  • M=iIN
  • n=iN.
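The substitutions listed above can be applied mechanically to a v101 string. The following is a rough sketch, not the procedure actually used for the post: it assumes that each v101 glyph is a single character, and that `.`, `,`, spaces and newlines are word separators (the `separators` default is an assumption about the transliteration file). The names `to_v171` and `glyph_frequencies` are invented for the example.

```python
from collections import Counter

# Differences between v171 and v101, as listed in the post.
V171_SUBSTITUTIONS = {"m": "IN", "M": "iIN", "n": "iN"}

def to_v171(v101_text):
    """Rewrite a v101 string as v171: {4o} becomes the single glyph ④,
    then m, M and n are expanded in one pass."""
    text = v101_text.replace("4o", "④")
    return text.translate(str.maketrans(V171_SUBSTITUTIONS))

def glyph_frequencies(text, separators=".,\n "):
    """Glyph frequencies in percent, ignoring the assumed separators."""
    counts = Counter(ch for ch in text if ch not in separators)
    total = sum(counts.values())
    return {glyph: 100 * n / total for glyph, n in counts.items()}

print(to_v171("8am"))    # 8aIN
print(to_v171("4oham"))  # ④haIN
freqs = glyph_frequencies(to_v171("8am.oe"))
```

Since `str.translate` works in a single pass, the expanded glyphs are not substituted again, so the order of the three rules does not matter.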
Below is a juxtaposition of the frequencies of the top 43 glyphs in the v171 transliteration, and the 43 Arabic letter frequencies. The average frequency difference between v171 and Ibn Kathir's Arabic is 0.64 percent. 

   

The next step is to explore the potential of these juxtapositions as correspondences or mappings. For example, the Voynich {o} could map to and from the Arabic ا (alef). In this way, we could map some of the most common Voynich "words", such as {8am}, {oe} and {1oe}, to text strings in Arabic. We could then search appropriate corpora of the Arabic language, for example Ibn Kathir's The Beginning and the End, to determine whether these strings are real words.
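As a sketch of how such a test might be set up: pair glyphs and letters by frequency rank, rewrite each Voynich "word" with that mapping, and look the result up in a word list taken from a corpus. Everything concrete below is a placeholder. The frequencies are invented, the vocabulary is a one-word stand-in for a real corpus, and only the pairing of {o} with alef comes from the post. The function names are likewise hypothetical.

```python
def rank_mapping(glyph_freqs, letter_freqs):
    """Pair the k-th most frequent glyph with the k-th most frequent letter."""
    glyphs = sorted(glyph_freqs, key=glyph_freqs.get, reverse=True)
    letters = sorted(letter_freqs, key=letter_freqs.get, reverse=True)
    return dict(zip(glyphs, letters))

def map_word(word, mapping):
    """Rewrite a Voynich "word" glyph by glyph; unmapped glyphs become '?'."""
    return "".join(mapping.get(glyph, "?") for glyph in word)

def attested(words, mapping, corpus_vocab):
    """For each Voynich "word", report whether its mapped form is in the vocabulary."""
    return {w: map_word(w, mapping) in corpus_vocab for w in words}

# Placeholder data, invented for the example (percentages).
toy_glyph_freqs = {"o": 13.0, "e": 9.0, "1": 7.0}
toy_letter_freqs = {"ا": 14.0, "ل": 10.0, "ن": 6.0}
toy_vocab = {"ال"}  # stand-in for the word list of a real corpus

mapping = rank_mapping(toy_glyph_freqs, toy_letter_freqs)
hits = attested(["oe", "1oe"], mapping, toy_vocab)
```

With a real frequency table and a real corpus vocabulary in place of the toy values, the share of `True` entries in `hits` would give a first measure of how often mapped strings turn out to be attested words.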

Since Arabic uses an abjad script, in which the short vowels are not written, chances are that most of the Voynich "words" of up to three glyphs will map to real words in Arabic. However, as with Persian, the mapping may well break down with "words" of four glyphs or longer. Even if we are able to construct real words of four letters or more, when arranged in sequence they may or may not make sense. I will do some tests. More later.