Note 1. The following analyses were computed using Claude. Once you look at the word lists, you'll understand why there was really no other way.
I checked the Python code myself, as far as I understand it as a non-programmer, and had other AI systems rigorously cross-check the values. I've also taken random samples, but I still can't offer any guarantees. AI is AI, after all…
Note 2. ED "DG97EEB" kindly shared his truly comprehensive and detailed study on Voynich grammar with me, which has partly confirmed and significantly advanced my own research. I'm deeply grateful to him for that.
Content:
I was curious whether the ideas behind my cipher approach, which I've presented here before, might actually fit the typical features of the VMS.
So I ran some quantitative comparisons between VMS text and real Middle High German (MHG) medical manuscripts.
The texts I compared:
VMS in EVA transcription (a cleaned-up version)
Ortloff von Baierland (first 100 pages without the table of contents, transcribed via Transkribus; there are some errors, but it should be good enough for this sort of comparison work)
The Breslau Pharmacopoeia
Admonter Bartholomäus
A collection of cooking recipes
All four comparison texts are MHG with Bavarian influence, 15th century or earlier. (The texts were tokenised with proper Unicode normalisation for medieval scribal characters.)
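For anyone wondering what "tokenised with Unicode normalisation" could look like in practice, here is a minimal Python sketch. It is not the code used for the analysis. The function name `tokenise`, the choice of NFKC, and the decision to drop leftover combining marks (such as a superscript o over u) are my assumptions for illustration.

```python
import re
import unicodedata


def tokenise(text: str) -> list[str]:
    """Split a transcription into lowercase word tokens (illustrative only)."""
    # NFKC composes base letter + diacritic (u + combining diaeresis -> ü)
    # and folds compatibility forms such as the long s (ſ) into plain s.
    normalised = unicodedata.normalize("NFKC", text)
    # Assumption: combining marks that NFKC could not compose
    # (e.g. a small o written above u) are dropped rather than kept.
    stripped = "".join(c for c in normalised if not unicodedata.combining(c))
    # Keep runs of letters only; digits, underscores and punctuation split tokens.
    return re.findall(r"[^\W\d_]+", stripped.lower())
```

Whether scribal marks should be dropped, expanded or kept changes token lengths slightly, so that choice is worth stating alongside any length statistics.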
The numbers:
What I tested: the hypothesis that the VMS contains medieval Bavarian German, encrypted using a system where function words are systematically absorbed into content words as prefixes.
If this assumption were correct, you'd expect it to show up in the word length distribution. The 2–3 character function words no longer appear as separate tokens; they're embedded in the content words, which makes the average VMS word longer. The ~30 percentage point gap in the ≤3 category fits rather nicely with the combined frequency of articles + prepositions + conjunctions in the MHG texts.
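The two length figures discussed here (share of tokens with at most 3 characters, and average word length) are simple to recompute. Below is a rough Python sketch, assuming the text has already been split into tokens; `length_profile` and its `short_max` parameter are names I made up for illustration, not the original script.

```python
def length_profile(tokens: list[str], short_max: int = 3) -> tuple[float, float]:
    """Return (share of tokens with length <= short_max, mean token length)."""
    if not tokens:
        raise ValueError("need at least one token")
    lengths = [len(t) for t in tokens]
    # Share of short tokens, i.e. the "<=3" category.
    short_share = sum(1 for n in lengths if n <= short_max) / len(lengths)
    # Plain arithmetic mean of token lengths in characters.
    mean_length = sum(lengths) / len(lengths)
    return short_share, mean_length
```

Running this on the VMS tokens and on each MHG text gives the pairs of numbers to compare. The gap in `short_share` between them is the quantity set against the combined function-word rate.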
Note: For the prefix “o,” only words longer than 2 letters were counted. I am convinced that o / ol / or / etc. represent separate, distinct words.
Articles (o prefix): There's a clear convergence with the VMS here. The values naturally depend on what's being described and on the author's personal style as well. That's why we see noticeable differences between the German texts themselves. And of course, we can only guess what exactly was lumped together under "articles."
Prepositions (qo prefix): We see more pronounced outliers (marked in red), but also a clear match with the cooking recipes. This again suggests that the rate depends very much on what's being described.
The daiin/"und" rate doesn't fit. Daiin is one of the most frequent words in the VMS, and "und" is an extremely common word in German. This remains the weakest point of the model and requires further explanation, possibly depending on text type or encoding function.
Verb prefixes (y prefix): The y-prefix in the VMS sits at 4.1%, while the MHG verb prefix rate (ge-, ver-, be-, er-, zer-, ent-) comes in at 3.6–4.4% across all four texts. That's a remarkably tight fit.
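On the VMS side, the prefix rates above boil down to counting tokens that start with o, qo or y. Here is a minimal Python sketch of that count. The "longer than 2 letters" rule for o comes from the note above. The function name `prefix_rate`, the `min_len` parameter, and the use of all tokens as the denominator are my assumptions, and the minimum lengths for qo and y in the test values are guesses.

```python
def prefix_rate(tokens: list[str], prefix: str, min_len: int) -> float:
    """Share of all tokens that start with `prefix` and have at least `min_len` characters."""
    if not tokens:
        raise ValueError("need at least one token")
    hits = sum(1 for t in tokens if t.startswith(prefix) and len(t) >= min_len)
    return hits / len(tokens)


# Tiny made-up EVA sample, only to show the calls.
sample = ["okaiin", "ol", "qokeedy", "daiin", "ychedy", "or", "otedy", "chol"]
o_rate = prefix_rate(sample, "o", min_len=3)    # skips o / ol / or as separate words
qo_rate = prefix_rate(sample, "qo", min_len=3)  # assumption: at least one letter after qo
y_rate = prefix_rate(sample, "y", min_len=2)    # assumption: at least one letter after y
```

Since qo-words begin with q, they are never counted under the o prefix, so the three rates do not overlap.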
Average word length is also interesting. It's higher in the VMS than in the comparison texts. In this context, that supports the theory that the shorter words were absorbed into the main words via the qo prefix.
What I find particularly interesting about the word length distribution is this: it's measurable independently of any decryption hypothesis. Whatever language is hiding in the VMS — it needs to explain why words of 2–3 characters are so dramatically underrepresented. In natural German (and most European languages), they account for roughly half of all words. In the VMS, it's only 19%. That rules out quite a few hypotheses — or at the very least forces them to offer a mechanism that makes short words disappear.
Now, what does all of this actually tell us about the likelihood that the VMS could be Bavarian German? Taken individually, none of these metrics proves anything, but together they paint a fairly consistent picture. Article rates, preposition rates, word length distribution and average word length all land in ranges that are compatible with medieval Bavarian German once you account for an absorption cipher, and none of them clearly contradicts the hypothesis. Their convergence across multiple independent texts is notable. The one that fits least well (the daiin/"und" rate) is also the one most sensitive to text type.
I know that this is not proof, of course. But if the VMS had nothing to do with German, one would expect at least some of these metrics to deviate significantly. Apart from the daiin/"und" rate, they do not: they stay within a compatible range across four different (in some cases very long) comparison texts.
Let's put it this way: for a 600-year-old manuscript that no one can read, that's not a bad starting point, at least.
....................
Artikel / Articles (69): ain, aine, ainem, ainen, ainer, aines, ayn, ayne, aynem, aynen, ayner, aynes, d, daer, das, dat, daz, de, deer, dem, deme, den, dene, der, dere, des, dez, di, die, diese, diesem, diesen, dieser, dieses, dirre, dise, disem, disen, diser, dises, disiu, ditz, diu, dr, dy, dye, dz, eem, een, eer, ees, ein, eine, einem, einen, einer, eines, eyn, eyne, eynem, eynen, eyner, eynes, eyns, te, tem, ten, ter, tes
Konjunktionen / Conjunctions: "und" (daiin): un, und, unde, unt, vn, vnd, vnde, vndo, vnnd, vnt
Präpositionen / Prepositions (65): ab, abe, an, ane, auf, auff, aus, auss, auz, bei, bi, bis, biz, cze, czu, dar, durch, fur, fúr, für, gegen, in, inne, mit, mite, mitt, myt, nach, neben, nebent, ob, over, sunder, uber, uf, uff, umb, umbe, under, unter, unz, uon, uor, us, uz, vff, vnder, voer, von, vor, vore, vss, vur, vür, wider, yn, ze, zem, zen, zer, zu, zue, zuo, zwischen, über
Verbvorsilben / Verb prefixes (6): ge-, ver-, be-, er-, zer-, ent- (length > prefix+1)
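To show how lists like these turn into rates for the MHG texts, here is a hedged Python sketch. It is not the original analysis code. The article and preposition sets are short excerpts of the lists above (the "und" set is complete), the function names are invented, and using all tokens as the denominator is my assumption. The verb prefix test follows the stated rule "length > prefix+1".

```python
# Excerpts only; the full lists are given above.
ARTICLES = {"der", "die", "das", "daz", "dem", "den", "des", "ein", "eine", "einem"}
PREPOSITIONS = {"in", "mit", "von", "zu", "an", "auf", "vor", "nach"}
UND_FORMS = {"un", "und", "unde", "unt", "vn", "vnd", "vnde", "vndo", "vnnd", "vnt"}
VERB_PREFIXES = ("ge", "ver", "be", "er", "zer", "ent")


def category_rate(tokens: list[str], vocabulary: set[str]) -> float:
    """Share of tokens that are exact members of a function-word list."""
    return sum(1 for t in tokens if t in vocabulary) / len(tokens)


def verb_prefix_rate(tokens: list[str]) -> float:
    """Share of tokens starting with a verb prefix, with length > prefix + 1."""
    hits = sum(
        1
        for t in tokens
        if any(t.startswith(p) and len(t) > len(p) + 1 for p in VERB_PREFIXES)
    )
    return hits / len(tokens)


# Made-up recipe-style line, only to show the calls.
tokens = ["nim", "die", "wurz", "vnd", "gesoten", "in", "dem", "wazzer", "vnd", "verbrant"]
article_rate = category_rate(tokens, ARTICLES)
preposition_rate = category_rate(tokens, PREPOSITIONS)
und_rate = category_rate(tokens, UND_FORMS)
ge_ver_rate = verb_prefix_rate(tokens)
```

One caveat about the prefix rule as written: it looks only at the surface form, so a word like "gegen" or "erde" would also be counted as carrying a verb prefix unless function words and nouns are filtered out first.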