Although I can find many words using the Bavarian theory, the sentences are all very difficult to interpret. Even if that's enough for many other solvers, I can't accept it. The more I delved into the cores, the more I realised that there must be another level of organisation. I am currently searching for it. Here is a small interim result:
I was wondering whether certain VMS characters are grouped together depending on their position in the line. To test this, I marked each space with a Roman numeral: the first space in a line is I, the second II, the third III, and so on. I split the first word in two: its first glyph (Pos I) and the rest of the word without that glyph (Pos I').
I then examined which characters appear before each space: not just the single character immediately preceding the space, but all characters within the token before it.
I also treated certain sequences (qo, aiin, ain, daiin, dy, ed, ee) as single units, because they show very strong internal binding in the data.
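For anyone who wants to try the same labelling, here is a rough Python sketch of the steps above. It is not my original script, and it rests on several assumptions: words are separated by whitespace or EVA dots, the "first glyph" of the first word is taken to be its first unit, the last word of a line simply gets the next numeral, and overlapping units (for example ed and dy inside "edy") are resolved longest-first from left to right. The unit list adds sh, ch and the bench gallows to the seven sequences named above. The function names are made up for illustration.

```python
# Multi-character units, longest first. The seven tested sequences come from
# the post; sh, ch and the bench gallows are an assumed addition.
UNITS = ["daiin", "aiin", "ain", "ckh", "cth", "cph", "cfh",
         "qo", "dy", "ed", "ee", "sh", "ch"]
UNITS_BY_LENGTH = sorted(UNITS, key=len, reverse=True)


def split_units(word):
    """Split an EVA word into units, greedy longest match, left to right."""
    out = []
    i = 0
    while i < len(word):
        for unit in UNITS_BY_LENGTH:
            if word.startswith(unit, i):
                out.append(unit)
                i += len(unit)
                break
        else:
            # No multi-character unit starts here: keep the single character.
            out.append(word[i])
            i += 1
    return out


def to_roman(n):
    """Roman numeral for small positive integers (enough for line positions)."""
    pairs = [(50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
             (5, "V"), (4, "IV"), (1, "I")]
    out = ""
    for value, symbol in pairs:
        while n >= value:
            out += symbol
            n -= value
    return out


def label_line(line):
    """Return (position, unit) pairs for one transliterated line."""
    words = line.replace(".", " ").split()
    labelled = []
    for idx, word in enumerate(words, start=1):
        units = split_units(word)
        if idx == 1:
            # First word: its first unit is Pos I, the remainder is Pos I'.
            labelled.append(("I", units[0]))
            labelled.extend(("I'", u) for u in units[1:])
        else:
            labelled.extend((to_roman(idx), u) for u in units)
    return labelled


print(label_line("qokeedy shedy daiin"))
```

A different way of resolving overlaps (for example preferring dy over ed) would shift some counts between units, so the choice should be stated along with any results.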
For each position, I calculated each unit's share of all glyphs at that position and then divided this by the unit's share in the entire corpus. This yields a ratio:
1.0 = neutral, the unit appears with the expected frequency
> 1.0 = overrepresented at this position
< 1.0 = underrepresented
Heatmap: red = overrepresented, blue = underrepresented, white = neutral.
Much of this was already known, but the heatmap makes it visible at a glance. That's why I wanted to publish it (without knowing whether anyone else had done so before).
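The ratio itself is simple to compute. Below is a minimal Python sketch, assuming the input is a list of (position, unit) pairs for the whole corpus. The function name and the toy data are invented for illustration and are not taken from the VMS.

```python
from collections import Counter, defaultdict


def enrichment(pairs):
    """Return {position: {unit: ratio}} for an iterable of (position, unit).

    ratio = (unit's share at this position) / (unit's share in the corpus)
    """
    by_pos = defaultdict(Counter)
    corpus = Counter()
    for pos, unit in pairs:
        by_pos[pos][unit] += 1
        corpus[unit] += 1

    total = sum(corpus.values())
    ratios = {}
    for pos, counts in by_pos.items():
        n_pos = sum(counts.values())
        ratios[pos] = {
            # A unit that never occurs at this position gets a ratio of 0.
            unit: (counts[unit] / n_pos) / (corpus[unit] / total)
            for unit in corpus
        }
    return ratios


# Toy data, invented for illustration only.
toy_pairs = [("I", "p"), ("I", "o"),
             ("II", "o"), ("II", "m"), ("II", "m"), ("II", "o")]
for pos, row in enrichment(toy_pairs).items():
    print(pos, {unit: round(r, 2) for unit, r in row.items()})
```

With small counts the ratio gets noisy (one or two occurrences can produce a large value), so rare units such as cfh deserve caution.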
Conclusions that follow from this:
1. It seems that VMS lines have a position-dependent structure.
The distribution of glyphs changes systematically from the beginning to the end of the line.
2. There are at least four functionally distinct zones (nothing new, I know):
Position I (first token): Marker-dominated. p, s, d, t, r are heavily overrepresented. Other glyphs (m, ckh, cth) are heavily underrepresented.
Positions II–III: Cluster zone. sh, ch, ckh, cth, cph peak here. qo peaks in III.
Positions IV–VIII: Running text. ed, dy, aiin slightly elevated.
Positions IX–XII: Vowels and liquids dominant. a, l, m explode. sh, qo tend to disappear.
3. qo behaves like a unit, not a compound glyph.
Positionally, qo acts like a single marker. The 0.01 ratio at Pos I' (first token without its first glyph) makes that quite clear – if qo appears in the first token, it is practically always at the very start, never in the middle.
4. The members of the aiin family (ain, aiin, daiin) also behave as units.
Positionally, they tend to act like individual structures, not like compound sequences.
5. Fixed bigrams (ee, ed, dy) are morphologically relevant.
ee is slightly elevated in the opening zone (Pos I'–III),
ed peaks in the late middle (Pos VIII),
dy rises slowly up to Pos VIII.
These three are not distributed randomly, but follow positional rules.
6. The bench gallows seem to form a functional class.
ckh, cth, cph (and cfh, despite the small sample size) all peak together in positions II–IV. They behave in unison.
This supports the idea that they belong together as a class, not only graphically but also in their role within the line.
7. m is the strongest positional signal in the entire VMS.
Its ratio rises from 0.37 (Pos II) to 6.10 (Pos XII). The glyph "m" shows a very strong bias towards line ends (I know, this is already known, too). It is not just a word-final suffix, but something really tied to the end of the line.
What follows from this:
1. The line is a structural unit with internal grammar. This does not look like random text. There are rules governing where certain glyphs may appear.
2. This structure feels more rigid than what you'd expect from a free natural language. I haven't compared it directly against a Middle High German (MHG) corpus with the same method yet, so take this as an impression, not proof. The pattern looks very formulaic (I know, nothing new either).
3. The lines are not sentence units, so a word-for-word decipherment aimed at forming sentences will not work. It's a shame, really.
In short: lines are likely to be structured units. However, given the period in which this cipher was created, it shouldn't be too complicated.
Let's carry on
Jojo