After three rounds of independent peer review and the release of a full public replication dataset, here is Version 5 of my Andalusian Arabic proposal for the Voynich Manuscript (Beinecke MS 408).
Key features (all honest and falsifiable):
• 21% firm single-meaning lexical coverage (28 word types + k = kull family of 73 types → 7,720 of 36,473 tokens). The old 87% polyvalent claim was retracted in Version 2.
• Blind test on 20 unseen folios: 29.9% overall coherence, and 30.2% in the herbal/pharmaceutical sections (vs. a 3–5% chance baseline). Three zero-match folios are reported honestly.
• Forward-only decoding of three prescription sentences, including f6r.12 (“garlic with borax — many doses”) validated against two independent medieval sources (Ibn al-Baytar Ch.7 and Abu l-Ala Zuhr Fragment 2) that were not consulted during decoding.
• Complete morphological grammar skeleton (o- = wa-, qo- = qad, -dy = dhi, etc.) aligned with Andalusian Arabic (Corriente 1997).
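As a quick arithmetic check of the coverage figure in the first bullet above, here is a minimal sketch in Python. The token counts are the ones quoted in this post; the function name `coverage` is just a label chosen for illustration, not something taken from the paper or the replication dataset.

```python
# Token counts as quoted in the first bullet of this post.
COVERED_TOKENS = 7_720    # tokens covered by the firm single-meaning readings
TOTAL_TOKENS = 36_473     # total tokens in the corpus used


def coverage(covered: int, total: int) -> float:
    """Return the fraction of tokens covered (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return covered / total


rate = coverage(COVERED_TOKENS, TOTAL_TOKENS)
print(f"{rate:.1%}")  # prints 21.2%
```

The result rounds to the 21% stated above.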
Full PDF attached (or available on request).
Public replication dataset (EVA corpus, character table, all CSVs, blind-test raw data, etc.):
Explicit invitation to Arabic linguists:
Apply the character table to any 10 folios of your choice and check the output against Corriente (1997) or Lane’s Lexicon. Report the valid Arabic root match rate.
A root match rate above 10% would support the hypothesis; a rate at the 3–5% chance baseline would falsify it. This is an open, reproducible test.
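For anyone running the test, the scoring step could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the example judgments are made-up placeholder data rather than results, and the handling of rates outside the two cases named above (below 3%, or between 5% and 10%) is my assumption for the sketch, since only the two thresholds are defined here.

```python
SUPPORT_THRESHOLD = 0.10   # "above 10%" supports the hypothesis
CHANCE_UPPER = 0.05        # upper end of the 3-5% chance baseline


def root_match_rate(judgments: list[bool]) -> float:
    """Fraction of decoded words judged to be valid Arabic roots.

    `judgments` holds one True/False per decoded word, as decided by the
    tester against Corriente (1997) or Lane's Lexicon.
    """
    if not judgments:
        raise ValueError("no judgments supplied")
    return sum(judgments) / len(judgments)


def verdict(rate: float) -> str:
    """Map a match rate onto the outcomes described in the post."""
    if rate > SUPPORT_THRESHOLD:
        return "supports"
    if rate <= CHANCE_UPPER:
        # Assumption: anything at or below the chance baseline counts as falsifying.
        return "falsifies"
    # Assumption: the 5-10% band is not assigned an outcome in the post.
    return "unspecified"


# Placeholder data for illustration only: 3 valid roots out of 20 words.
example = [True] * 3 + [False] * 17
print(root_match_rate(example), verdict(root_match_rate(example)))
```

Reporting the raw counts alongside the rate makes it easier for others to compare results across different folio selections.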
Figure 1 in the paper shows the f6r.12 garlic-borax sentence with the actual folio image and its mechanical transliteration.
Looking forward to serious feedback and independent testing — especially from those who read Arabic or medieval pharmacology.