The Voynich Ninja

Full Version: Why and how the text could be Bavarian
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
(02-04-2026, 02:47 PM)JoJo_Jost Wrote: The real question now is this: does the German structure reflect an actual German source text, or a code built on German linguistic structure but without recoverable sense? I do not know that yet.

Hi Jojo_Jost!
Do you think you're capable of reading a whole paragraph now?
(04-04-2026, 02:19 PM)Ruby Novacna Wrote: Hi Jojo_Jost!
Do you think you're capable of reading a whole paragraph now?

Yes, and no. Well, theoretically yes, practically no. What does that mean?

The cipher can manage to find certain passages where the words it finds appear frequently; the rest could be “filled in” (I now have 300 possible words, but they haven’t been confirmed yet; that’s a lot of work). But this is eisegesis, so it’s not a translation to be taken seriously. It would be cool for the effect, but it would still be far from the truth.

I’m struggling with another problem: I have the prefixes and suffixes I mentioned here, plus a few more that might work, and I have the “and.” Overall, this results in a clearly recognizable (!!!) German sentence structure that makes sense for most parts of the VMS, crazily enough.

But what’s missing, and I’m being honest here, are the core elements (not that I’d be the first to have this problem ;) ).

So instead of continuing to work on “verifying” individual words, I’m still searching for the Cipher of the core elements. And I think I just made a huge leap forward today (VMS is a little devil, as most of you know). As soon as it’s confirmed enough that I can present it, I’ll do so.

As I said, work in progress!
As I said, I wanted to take a closer look at the VMS cores. I ran into a few problems, no surprise there. But then I had a major breakthrough (major? :D)

Actually, this was already hinted at in my cipher; the “y” was treated differently depending on its position. But so was the “o.” The problem was that I hadn’t fully grasped the extent of it myself.

Some, if not all, of the VMS glyphs are used differently depending on their position.

And that’s exactly what creates some of the strange structures of the VMS! It’s highly interesting.

In short: VMS is a position-dependent absorption cipher.

I’m still working on a few things, but here are the initial results:

Because of the different positions, it became clear to me that the vowels and “aiin” must be used in a position-dependent manner!

In this respect, the problem is that you can’t achieve consistency if you compare all vowels from the frequency data with the MHD (Middle High German) corpus. So you have to focus on the medial part and leave out the initial and final vowels.
For a VMS token with N glyphs, “middle” refers to glyphs 2 through N−1. Tokens with fewer than 3 glyphs have no middle position and were excluded!

The digraph “ee” (two adjacent “e” glyphs) is also treated as a single unit, since the hypothesis under examination is: “ee = MHD u”.

If you split it into two separate “e”s, you increase the number of medial “e”s by about 9,600 and get a completely different picture. This is the single largest cause of discrepancies between counts by different people, which is why I am pointing it out right at the start.

Everything else—including ai, dy, ey—is counted as two separate characters. The aiin sequence is handled separately (see below).

Concrete example: “chedy” is tokenized as [ch]-[e]-[d]-[y], so its middle glyphs are [e, d]. “okeedy” is tokenized as [o]-[k]-[ee]-[d]-[y], middle glyphs = [k, ee, d].
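
The counting rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the script actually used for the counts: the function names are made up, "sh" is added as a unit by assumption (the post only shows "ch" and "ee"), and a run like "eee" is simply split greedily from the left.

```python
# Multi-character glyph units. "ch" and "ee" follow the post;
# "sh" is an added assumption.
UNITS = ("ch", "sh", "ee")


def split_glyphs(token):
    """Split an EVA token into glyphs, treating UNITS as single glyphs."""
    glyphs = []
    i = 0
    while i < len(token):
        pair = token[i:i + 2]
        if pair in UNITS:
            glyphs.append(pair)
            i += 2
        else:
            glyphs.append(token[i])
            i += 1
    return glyphs


def middle_glyphs(token):
    """Return glyphs 2 through N-1; tokens with fewer than 3 glyphs have none."""
    glyphs = split_glyphs(token)
    if len(glyphs) < 3:
        return []
    return glyphs[1:-1]


print(middle_glyphs("chedy"))   # ['e', 'd']
print(middle_glyphs("okeedy"))  # ['k', 'ee', 'd']
```

Dropping "ee" from UNITS reproduces the alternative count in which every "e" is a separate glyph.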

The “aiin” Problem

There is a well-known complication ;). The sequence "aiin" (and its variants "ain," "air," "aiir") occurs in about 15% of all VMS lemmas, almost exclusively in final position (85–91% of occurrences). It behaves like a positional unit (probably an ending like -en/-em) and not like a string of independent vowels. But the "a" and the "i" within it are located in the medial position and are counted, which significantly inflates both the medial a and the medial i in VMS.

Raw medial counts including aiin: VMS a = 13.48%, VMS i = 12.75%. This is far too high for any plausible vowel pairing. As soon as one recognizes "aiin" as a functional unit and removes its components from the medial count (5,902 a, 9,670 i, 247 n/r), the picture changes dramatically.

The numbers

VMS source: EVA transcription v3b, 38,227 tokens.
MHD reference: Ortloff von Baierland, Breslauer Arzneibuch, Admonter Bartholomäus, recipe collection Cod.germ.1 – total of 170,892 words. (Normalized)

Medial glyphs before removal of aiin: 93,606.
Medial glyphs after removal of aiin: 77,787.
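
As a quick consistency check, the figures quoted in this post can be recomputed from each other. The snippet below only uses numbers stated in the post; the variable names are made up, and the shares are reconstructed from rounded percentages, so the last digit may differ slightly from the original tables.

```python
# Figures quoted in the post.
medial_before = 93_606
removed = {"a": 5_902, "i": 9_670, "n/r": 247}  # aiin components taken out
raw_share = {"a": 13.48, "i": 12.75}            # percent, before removal

# Medial glyph total after removing the aiin components.
medial_after = medial_before - sum(removed.values())
print(medial_after)  # 77787


def share_after_removal(glyph):
    """Percent share of a medial glyph once its aiin occurrences are removed."""
    raw_count = raw_share[glyph] / 100 * medial_before
    return 100 * (raw_count - removed[glyph]) / medial_after


print(round(share_after_removal("i"), 2))  # 2.91
```

The total matches the 77,787 given above, and the recomputed share for medial i agrees with the 2.91% quoted in the post.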

[attachment=15021]

The first three pairings land within half a percentage point. For o = e and a = a, the delta is 0.06% each, on a base of 77,787 medial glyphs vs 374,665 medial letters. But to be honest, this is a very specific analysis. If you make minor adjustments or take a closer look at the whole thing, you get different results: results that are still good, but don’t yield such precise matches. I'd say that these perfect matches are partly due to chance. But it’s also more about the general direction...

What this means
Four medial vowel pairings look solid:

EVA o in word-internal position behaves like MHD e — the most frequent vowel in both systems.
EVA a stays a.
EVA ee (the digraph) behaves like MHD u.
EVA e = MHD i is plausible (13.53% vs 11.15%) but not as tight.
EVA i collapses to 2.91% after aiin removal, which means almost all medial i in the VMS belongs to aiin sequences rather than appearing as an independent vowel.


Important!

This is a very fascinating correlation in the frequency of vowels in the medial position. But it is still a long way from being deciphered.

However, viewed once again from a different perspective, it demonstrates that the VMS possesses an internal vowel structure that corresponds, at the medial level, to medical texts from the Middle High German period, with specific and verifiable letter assignments.

(Note: The data was, of course, calculated by Claude but intensively cross-checked by ChatGPT, with only minimal deviations, which are likely due to the cleaning of the Middle High German texts.)
Another important point when searching for the core elements is that the gaps in the text are not simply spaces, but, in my opinion, syllabic breaks.

We all know that:
y → q: 3,499 times across a space, only 27 times token-internally!
y → o: 2,625 times across a space
n → ch/sh/o/d: almost exclusively via space
y → l/r: almost exclusively via space
l → q: almost exclusively via space

The pattern: y, n, r, l almost always end a token. q, o, ch, sh almost always begin a token. The spaces are thus determined by the glyphs themselves, not by anything else—in this respect, they are structural breaks, not necessarily word boundaries!
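
For anyone who wants to reproduce this kind of count, here is a minimal sketch of how glyph transitions across spaces can be separated from transitions inside a token. It is an assumption about how such a count could be done, not the method behind the numbers above; the sample line is invented, and only "ch" and "sh" are treated as multi-character glyphs.

```python
from collections import Counter

UNITS = ("ch", "sh")  # assumed multi-character glyphs


def split_glyphs(token):
    """Split a token into glyphs, keeping 'ch' and 'sh' together."""
    glyphs, i = [], 0
    while i < len(token):
        step = 2 if token[i:i + 2] in UNITS else 1
        glyphs.append(token[i:i + step])
        i += step
    return glyphs


def transition_counts(line):
    """Count glyph pairs across token boundaries and inside tokens."""
    across, internal = Counter(), Counter()
    tokens = [split_glyphs(t) for t in line.split()]
    for glyphs in tokens:
        internal.update(zip(glyphs, glyphs[1:]))
    for left, right in zip(tokens, tokens[1:]):
        across[(left[-1], right[0])] += 1
    return across, internal


sample = "qokeedy qokedy chedy shedy daiin okaiin"  # invented line, not a real folio
across, internal = transition_counts(sample)
print(across[("y", "q")], internal[("y", "q")])  # 1 0
```

Run over a full transcription line by line, the two counters give the "across a space" and "internal" figures for any glyph pair.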

What does this mean for the model?

We will be dealing with a two-stage positioning—especially the prefixes, and here particularly the “y,” will represent one or more bi- or trigrams that can appear both within a word and at the end of a word.

In German, for example, the most common final syllables also appear within words, such as: “en” in Middle High German: 22.6% within words, “er”: 30% within words, “el”: 69% within words, etc.
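
Such a share can be measured with a small helper. This is a sketch under one assumption: "within words" is taken to mean every occurrence that is not at the very end of the word. The function name and the word list are made up for illustration and are not taken from the MHD corpus.

```python
def internal_share(words, ending):
    """Fraction of all occurrences of `ending` that are not word-final."""
    internal = final = 0
    for word in words:
        hits = word.count(ending)  # non-overlapping occurrences
        if hits == 0:
            continue
        if word.endswith(ending):
            final += 1
            internal += hits - 1
        else:
            internal += hits
    total = internal + final
    return internal / total if total else 0.0


sample = ["geben", "wenden", "enden", "der", "ende"]  # toy list
print(internal_share(sample, "en"))  # 0.5
```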

In this respect, EVA y / l / r will likely play a dual role precisely there. Which one?

Yes, that is the question of all questions. While vowels are still relatively easy to identify through frequency analysis, consonants are significantly harder to find. Especially when a VMS glyph absorbs more than one consonant. For the system to work, they must absorb an average of about three. And that is nearly impossible to decipher by simple means.

So we have to find new ways—I’m on it....
It would be so elegant. You could also read ‘daiin’ differently 

[attachment=15044]

namely as d a i n
then ‘aiin’ would be ‘a i n’
‘ain’ would be ‘an’

Assuming that d=t/ts (in Bavarian, initial s hardens to ts, so sein = tsain)

‘aiin’ = ‘ein’: Count all tokens containing ‘aiin’ in the VMS EXCEPT ‘daiin’ tokens. Count all words in Middle High German containing ‘ein’, including the Bavarian variants (at the beginning, in the middle, at the end) — i.e. ‘ein’, ‘eine’, ‘einem’, ‘wein’, ‘bein’, “kleine”, ‘seinen’, etc.

daiin = ‘sein’: Count all tokens containing daiin in the VMS. Count all forms of the verb sein (ist, bin, bist, sind, sint, war, sei, si, sin, sein... including the Bavarian variants) PLUS all words containing ‘sein’ (seinen, seiner...) in the MHD.

ain = ‘an’: Count all tokens in the VMS that end in ain EXCLUDING aiin/daiin. Count all words in the MHD that contain ‘an’ (an, man, wann, dann, hand, lang...).
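
The three VMS-side counting rules can be written down as a short sketch. The function name and the sample tokens are made up, and the order of the checks is one reading of the rules above: anything containing "daiin" goes to the daiin group, other tokens containing "aiin" go to the aiin group, and only the remaining tokens ending in "ain" count as ain.

```python
def count_patterns(tokens):
    """Sort VMS tokens into the daiin / aiin / ain groups described above."""
    counts = {"daiin": 0, "aiin": 0, "ain": 0}
    for token in tokens:
        if "daiin" in token:
            counts["daiin"] += 1
        elif "aiin" in token:
            counts["aiin"] += 1
        elif token.endswith("ain"):
            counts["ain"] += 1
    return counts


sample = ["daiin", "odaiin", "okaiin", "aiin", "kain", "chedy", "dain"]  # toy tokens
print(count_patterns(sample))  # {'daiin': 2, 'aiin': 2, 'ain': 2}
```

Each group's share is then its count divided by the total number of tokens, which is what gets compared with the MHD word counts.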

Here is the astonishing result:

[attachment=15043]

It fits surprisingly well; only ‘aiin’ is skewed, but in the ‘recipes’ you can see that it wouldn’t be impossible if the VMS were to be recipe-like. But what would that have to do with a cipher? An interesting question, because then it would essentially be plain text.

Could it really be that simple? I have my doubts, which is why I wrote “it would be so elegant” in the subjunctive above. There is another reason why I’m raising this here at all:

The longer I work with the VMS as a German manuscript, the more I get the impression, as has already been suggested here several times, that the VMS is not a cipher in the strict sense at all, but perhaps simply an attempt at stenography – a form of stenography that never caught on....

So I asked myself what forms of stenography existed:

Tironian notes (the dominant system)

The only truly codified shorthand system still known around 1400. Origin: Marcus Tullius Tiro, Cicero’s secretary, c. 63 BC. Scope: up to 13,000 characters in its medieval form.

Structure:

Letter-ligature principle: basic characters are modified by additional strokes, loops and hooks

A mixture of syllabic and word characters. Not a pure alphabet, but a mixed system of logograms, syllabic characters and abbreviation rules. Direction: left to right, no reversal. This does not fit at all, because the VMS has far too few glyphs, although some Tironian notes bear certain similarities to individual VMS glyphs.

The system was still partially in use in Germany around 1400 (St. Gallen apparently still had a large Tironian corpus) but died out in the 15th century.

Scribal abbreviations (not a stenography system in the strict sense)

Not stenography but functionally similar: highly developed abbreviation conventions in medieval manuscripts.

Various types:
Contractions: first and last letters of a word, rest omitted, with an abbreviation mark above
Suspensions: word breaks off, rest implied
Superscript characters: superscript letters replace syllables
Special symbols: e.g. the Tironian ‘et’ sign (resembling a 7; not to be confused with ‘&’, which is an e-t ligature), here possibly the EVA S

These conventions were largely standardised in Latin texts, but varied regionally in vernacular texts. There are some indications in the VMS that Latin abbreviations were known. Here, of course, a scribe might have attempted to adapt such a system to German.

Personal / idiosyncratic systems

This is where it becomes relevant for VMS research:

There is no conclusive evidence of a standardised German-language stenography system in the early 15th century.

But:

Individual scholars and doctors developed personal abbreviation systems for notes, collections of prescriptions and lecture notes
These were not codified, could not be taught, and were not passed on
University environments (e.g. Vienna, Prague, Heidelberg, Krakow) produced such personal systems
Medical practitioners (barbers, surgeons) had their own abbreviation habits for prescription formulas.

That would, of course, fit perfectly.

Characteristic of personal systems around 1400:

Almost always based on one’s own handwriting as a starting point

- Often vowel reduction (vowels omitted or replaced by diacritical marks) – because the consonant structure identifies the word. In the VMS we probably have vowels.

- Syllable abbreviations rather than letter abbreviations – this is where it gets interesting, of course, but precisely this would naturally be difficult to substantiate historically for the VMS, because such papers were usually not kept if no one could understand them anymore.

But: Perhaps one should look more closely at this aspect of ‘encryption’ than at cryptology...
Sorry if you caught some strays on the Chinese thread, but that is my concern about this in a nutshell. Until it starts cohering into a larger translation I think this hasn't gotten past the problem that you can match ANY language to the VMS using a frequency analysis. I recognize there are other suggestive threads here, but they haven't quite come together yet
@rickforto No problem ;)

On the one hand, that is a valid objection, but at the same time you are overlooking some of the distinctive features of the theories presented here.

Adjusting the frequency is not actually what I do. There is no question that anyone can adjust a frequency distribution retrospectively – you are right about that. What I do is something different: I create a model based on Bavarian MHD morphology and grammar and ask what code structure necessarily follows from it. The answer exhibits VMS-like properties, without my referring to the specific features of the VMS in the construction itself.

A concrete example: "qo". A pure frequency analysis would not suggest that prepositions and articles are being absorbed here. Nor would a frequency analysis be able to identify the o as an article. Quite apart from the fact that the y absorbs not only "ge" but also a range of other prefixes.

These are not coincidences that could occur in any language. The absorption of articles into a prefix, the inseparability of prepositions and articles into a single morpheme, the greater typological diversity of prefix-initial tokens – these properties arise inevitably from German morphology. Try this with Latin. Try it with Arabic. The structure does not emerge. It arises only from German.

The question of why "y" is one of the most frequent suffixes can only be explained by absorption – a frequency analysis would, in principle, merely have established that there is no equivalent. If one were to attempt this with "y" in Latin, it would fail, as has indeed happened often enough.

It is therefore by no means as simple as you describe it.
What I have not presented here in these posts are the many other analyses I carried out in order to arrive at these results in the first place.

The frequency analysis is merely the part that confirms this and which I use, or have used, to compare the frequency of letters/glyphs.

However, I have since moved away from this approach, as it simply does not work with the cores. I am currently turning the tables: I have worked out how to encrypt a German MHD text using an absorption cipher and 24 glyphs, and I am checking the similarities with VMS texts. The results are surprisingly consistent, which is due to the morphological structure of MHD itself - not to any adjustment on my part. As soon as I've finished, I'll publish it here too.
(15-04-2026, 10:00 PM)JoJo_Jost Wrote: The question of why "y" is one of the most frequent suffixes can only be explained by absorption – a frequency analysis would, in principle, merely have established that there is no equivalent. If one were to attempt this with "y" in Latin, it would fail, as has indeed happened often enough.

Bullshit

[attachment=15127]

It fits perfectly in Latin.
I explained exactly how and why that is.
Even “daiin” = tutis, related to the Italian “tutti”
@Aga Thanks for misunderstanding me so badly and then immediately insulting me so nicely ;)

I had written: The question of why “y” is one of the most frequent suffixes can only be explained by absorption.
That is exactly what you are applying here as well—y as absorption.

Then I shifted the topic to frequency analysis: a frequency analysis would, in principle, merely have established that there is no equivalent. If one were to attempt this with “y” in Latin, it would fail, as has indeed happened often enough.

This refers to frequency analysis, not to absorption - the comparison in the post is precisely about that frequency analysis...
(04-04-2026, 04:29 PM)JoJo_Jost Wrote:
(04-04-2026, 02:19 PM)Ruby Novacna Wrote: Hi Jojo_Jost!
Do you think you're capable of reading a whole paragraph now?

Yes, and no. Well, theoretically yes, practically no. What does that mean?

The cipher can manage to find certain passages where the words it finds appear frequently; the rest could be “filled in” (I now have 300 possible words, but they haven’t been confirmed yet; that’s a lot of work). But this is eisegesis, so it’s not a translation to be taken seriously. It would be cool for the effect, but it would still be far from the truth.

I’m struggling with another problem: I have the prefixes and suffixes I mentioned here, plus a few more that might work, and I have the “and.” Overall, this results in a clearly recognizable (!!!) German sentence structure that makes sense for most parts of the VMS, crazily enough.

But what’s missing, and I’m being honest here, are the core elements (not that I’d be the first to have this problem ;) ).

So instead of continuing to work on “verifying” individual words, I’m still searching for the Cipher of the core elements. And I think I just made a huge leap forward today (VMS is a little devil, as most of you know). As soon as it’s confirmed enough that I can present it, I’ll do so.

As I said, work in progress!

Just wanted to say hi JoJo you are working hard.

Your work sounds difficult: what seems like success, and then you find forks in the road. That's expected with MS-408 when committing it to a language that has rules for the grammar. Just when you find a phrase and then process the vords elsewhere, you can hit a snag. Let's see folio 58r follow the rules of the language you choose. Then commit your cipher to f68r3; if you do that and it makes sense, I will be in wonder!!!!!