@ Tavie
I'm a journalist—believe me, I don't believe anything or anyone, not even myself. I'll probably be the last person to be convinced that I have the solution, even if I pretend I do when I actually have it.
I have nothing against criticism; that's fine. It’s just that when it becomes clear the critic hasn't even looked into the subject they're criticizing, that annoys me because it wastes my time. Otherwise, I respond very politely to criticism, don't I? After all, I'm human and I make mistakes—quite a lot, in fact...
Those were good questions.
1. Words with "sa" and "so"
You’re right that words with "so" and "sa" make up the majority of words that start with "s".
But we should take a look at which words these mainly are.
Words with "sa": saiin (119x), sar (77x), sain (62x), sal (48x)
Words with "so": sol (59x), sor (47x)
Currently, I have aiin as "mit" = "with" (though this changes frequently). Then saiin would be "und mit" = "and with" fused into a single word.
"mit" is one of the more frequent words following "und" in my Middle High German reference corpus: 287 out of 10,542 "und" occurrences (about 2.7%). So this is not a problem for the hypothesis; if anything, it tends to confirm it.
As for the standalone "s" followed by a word: the next token most often begins with "ai" (25%, from aiin), "o" (20%), "ch" (16%), or "a" (15%). This could be a mix of possible content beginnings (ch, sh) and functional elements (o = article prefix, ai = part of "aiin", see above). Important: no single beginning dominates so strongly that it would stand out.
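For anyone who wants to check that distribution against their own transliteration, here is a minimal sketch of the counting. It is not my actual script: the function names, the prefix inventory, and the toy token list are all made up for illustration, and it assumes the text is already split into tokens.

```python
from collections import Counter

# Assumed inventory of token beginnings (illustrative, not a fixed list).
PREFIXES = ("ai", "ch", "sh", "o", "a")


def next_token_starts(tokens, marker="s", prefixes=PREFIXES):
    """Count how the token following each standalone `marker` begins."""
    # Try longer beginnings first so "ai" is not swallowed by "a".
    ordered = sorted(prefixes, key=len, reverse=True)
    counts = Counter()
    for cur, nxt in zip(tokens, tokens[1:]):
        if cur != marker or not nxt:
            continue
        # Fall back to the first character if no listed beginning matches.
        start = next((p for p in ordered if nxt.startswith(p)), nxt[0])
        counts[start] += 1
    return counts


def shares(counts):
    """Turn raw counts into fractions of the total."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


# Toy input only, NOT manuscript data.
toy_tokens = "s aiin chedy s okal s aiin s ar".split()
toy_counts = next_token_starts(toy_tokens)
toy_shares = shares(toy_counts)
```

Matching the longest beginning first matters here, because otherwise every aiin would be counted under "a".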
2. Why does "sq" hardly ever appear?
Yes, a really good question! "sq" appears exactly twice in the entire manuscript, whereas a standalone "s" followed by a "qo" word occurs eleven times (s qokedy 3x, etc.). So "and + preposition" does exist, but it is very rare.
Why so rare? Because "qo" itself is a prefix. The system does not stack prefixes. You cannot have "s+qo" as a double prefix on a single token, so "and + preposition" remains as two separate tokens: "s" + "qo-word."
The frequency check: In MHG, "und" is followed by a preposition in 9.1% of cases (964 out of 10,542). In the VMS, a standalone "s" followed by a "qo"-word accounts for 3.6% (11 out of 309). That seems low. However, if "saiin" absorbs "und mit" instead of forming "s qo...", we must count this separately. If we include "saiin" again, about 5.5% of all "s" contexts contain a following preposition. The remaining gap may be explained by other absorbed forms that I have not yet identified.
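The percentages in this post can be recomputed from the raw counts with a few lines. This is only a sketch of the arithmetic; the helper name `share` is my own. I leave out the 5.5% figure because it depends on how the "saiin" occurrences and the "s" contexts are pooled, which the counts given here do not pin down.

```python
def share(hits, total):
    """Return hits as a percentage of total."""
    return 100.0 * hits / total


# Counts quoted in this post.
mhg_und_mit = share(287, 10_542)   # "und" followed by "mit"
mhg_und_prep = share(964, 10_542)  # "und" followed by any preposition
vms_s_qo = share(11, 309)          # standalone "s" followed by a "qo" word

for label, value in [
    ("MHG und + mit", mhg_und_mit),
    ("MHG und + preposition", mhg_und_prep),
    ("VMS s + qo-word", vms_s_qo),
]:
    print(f"{label}: {value:.1f}%")
```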
I think that "sq" is virtually impossible, but this does not pose a real problem. It is a prediction. A notation system based on prefixes should not allow prefix stacking—but, in principle, I have too little information to say that with certainty.
3. I haven't yet looked closely enough at the LAAFU pattern in this context - I still need to do that. There are several VMS "anomalies" that I haven’t incorporated yet because there are simply too many of them.