Hi everyone,
New member here. I want to be upfront about a few things before presenting what I've been working on, because I know this community has seen a lot of "I've cracked it" posts, and I don't want to waste your time.
What this is not:
- A decipherment
- A translation
- A claim that I know what language the VM is written in
What this is:
- A morphological model that decomposes ~30% of the VM vocabulary cleanly using a prefix-root-suffix system
- A hypothesis (the "corrupted copy" hypothesis) that explains why the remaining 70% resists decomposition
- A set of testable structural predictions, some of which I've checked against known data and which seem to hold
How it started:
This came out of a completely unrelated discussion about the Phaistos Disc, where I was exploring structural-functional approaches to undeciphered texts with an AI language model (Claude). On a whim, I tried applying the same approach to the VM — building a synthetic grammar from scratch based purely on internal structure, with no prior assumption about what language family it belongs to. I expected it to fail. It didn't fail as completely as I expected.
The core idea in brief:
The VM text behaves like an agglutinative language with stable prefixes (qo-, ch-/che-, sh-, da-, ol-/o-), productive roots (ke-, te-, ka-), and grammatically meaningful suffixes (-dy, -y, -in, -ain). These decompose the highest-frequency words cleanly: qokedy, qokeedy, qokeey, chedy, shedy, daiin, dain, and their families.
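For anyone who wants to poke at the prefix-root-suffix idea directly, here is a minimal Python sketch. It is not the procedure from the paper: the function name `decompose`, the strict left-to-right matching, and the choice to treat prefix and root as optional but the suffix as required are all assumptions made for illustration. The morpheme lists contain only the forms named in this post.

```python
from typing import Optional, Tuple

# Morphemes named in the post (EVA transliteration). Longer forms are listed
# first so that "che" is tried before "ch" and "ol" before "o".
PREFIXES = ["qo", "che", "ch", "sh", "da", "ol", "o"]
ROOTS = ["ke", "te", "ka"]
SUFFIXES = ["ain", "in", "dy", "y"]


def decompose(word: str) -> Optional[Tuple[str, str, str]]:
    """Return (prefix, root, suffix) if the word splits exactly, else None.

    Assumption for this sketch: prefix and root may each be empty, but the
    remainder must be exactly one of the listed suffixes.
    """
    for prefix in PREFIXES + [""]:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for root in ROOTS + [""]:
            if not rest.startswith(root):
                continue
            suffix = rest[len(root):]
            if suffix in SUFFIXES:
                return prefix, root, suffix
    return None


if __name__ == "__main__":
    for w in ["qokedy", "chedy", "dain", "qokeedy", "shedy", "daiin"]:
        print(w, decompose(w))
```

With only the morphemes named in this post, strict matching splits qokedy, chedy and dain but returns None for qokeedy, shedy and daiin, so the lists would need to be replaced with the complete inventory before drawing any conclusions from the output.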
The model makes specific positional predictions that appear to hold:
- Words ending in -in/-ain avoid line-final position (they mark continuation)
- Words ending in -y can close lines (terminal)
- da- words (daiin, dair) are strongly line-initial (>20% of occurrences)
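The three predictions above can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the helper name `positional_rates` is hypothetical, each line is assumed to be already split into a list of EVA words, and the three sample lines are invented toy data, not taken from any transcription.

```python
from typing import Callable, Dict, List, Optional


def positional_rates(
    lines: List[List[str]], matches: Callable[[str], bool]
) -> Optional[Dict[str, float]]:
    """Share of matching words that sit in line-initial and line-final position."""
    total = first = last = 0
    for words in lines:
        for i, word in enumerate(words):
            if not matches(word):
                continue
            total += 1
            if i == 0:
                first += 1
            if i == len(words) - 1:
                last += 1
    if total == 0:
        return None  # no matching words, so no rate can be computed
    return {"count": total, "initial": first / total, "final": last / total}


# Invented toy lines, only to show the call pattern.
toy_lines = [
    ["daiin", "qokedy", "chedy"],
    ["dain", "shedy", "daiin", "qoky"],
    ["qokain", "dar", "chey"],
]

if __name__ == "__main__":
    print("da-  :", positional_rates(toy_lines, lambda w: w.startswith("da")))
    print("-in  :", positional_rates(toy_lines, lambda w: w.endswith("in")))
    print("-y   :", positional_rates(toy_lines, lambda w: w.endswith("y")))
```

To judge whether a rate is unusual, the same function can be called with `lambda w: True` to get the baseline share of all words that are line-initial or line-final.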
I then tested this against the full transcription (Takahashi version from the Stolfi interlinear) line by line. Results: roughly 30% clean decomposition, 38% partial, 32% fail. The failures concentrate on gallows characters, which the model doesn't address at all; that's the biggest gap.
The "corrupted copy" hypothesis:
The reason I think the remaining 70% is noisy rather than wrong: the VM may not be an original composition. If a 15th-century European scribe copied an older text in a language they couldn't read — character by character, purely as visual patterns — you'd expect exactly the pattern we see: high-frequency morphemes preserved (because the scribe's hand learned them as motor patterns), interior morpheme boundaries smeared, rare forms absorbed into common ones, and vowel distinctions (the e/ee/eee system) rendered inconsistently.
This would also explain why the VM has natural-language statistics but resists decipherment: the source was a natural language, but the copying process added a layer of systematic noise.
Typological direction (most speculative part):
The morphological profile (agglutinative, prefix-based, ergative-looking alignment with da- as a possible ergative marker, SOV-compatible word order, r/l alternation in the ol/or/al/ar system) doesn't match any European language. It does align typologically with Hurrian, Urartian, and Northeast Caucasian languages. I'm not claiming the VM is in Hurrian. I'm saying the type of grammar matches that corridor better than anything in Europe. The proposed Diakonoff-Starostin "Alarodian" connection would mean these structural features are shared across multiple families in the region.
What I'm looking for from this community:
1. Has anyone tested positional constraints of suffixes systematically? The -in/-ain line-avoidance and da- line-initial preference are the model's strongest testable predictions. If these are already known/published, I'd like to know.
2. Gallows integration. The model completely ignores gallows characters (~15-20% of the text). If anyone has ideas about how cth/ckh/cph/cfh might fit into an agglutinative prefix system, I'm very interested.
3. Currier A vs B. The model currently treats the text as uniform. If A and B have different morphological profiles, that's important: it could mean different source texts, different scribal hands, or dialectal variation.
4. Where am I reinventing the wheel? I'm new to VM research specifically. If someone has already proposed an agglutinative model, or tested prefix/suffix positional behavior, or explored Caucasian typological parallels, please point me to their work. I'd rather build on existing research than duplicate it.
5. Where am I obviously wrong? I can take it. That's why I'm here.
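On point 3, a first-pass comparison of Currier A and B could look like the Python sketch below. It is only an illustration: the names `suffix_of` and `suffix_profile` are hypothetical, the input is assumed to be (Currier label, word) pairs prepared from whatever folio attribution you trust, and the sample words are invented. Note that matching by word ending is cruder than a full decomposition; for example, it files dain under -ain.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Suffixes from the post; longer forms first so -ain wins over -in, -dy over -y.
SUFFIXES = ["ain", "in", "dy", "y"]


def suffix_of(word: str) -> str:
    """Return the first listed suffix the word ends with, or 'other'."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return suffix
    return "other"


def suffix_profile(
    tagged_words: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Relative frequency of each suffix class per Currier label."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for label, word in tagged_words:
        counts[label][suffix_of(word)] += 1
    return {
        label: {s: n / sum(c.values()) for s, n in c.items()}
        for label, c in counts.items()
    }


# Invented toy input, only to show the expected shape of the data.
toy = [
    ("A", "daiin"), ("A", "chol"), ("A", "chor"), ("A", "daiin"),
    ("B", "qokedy"), ("B", "shedy"), ("B", "qokain"), ("B", "chey"),
]

if __name__ == "__main__":
    for label, profile in sorted(suffix_profile(toy).items()):
        print(label, profile)
```

If the two profiles differ clearly on real data, that would be a reason to fit the morpheme inventory separately for A and B rather than treating the text as uniform.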
Full paper attached. It includes the complete morpheme inventory, decomposition test results with line-by-line analysis, the corrupted copy argument, typological comparison with Hurrian/NEC languages, historical transmission scenarios, and full references.
Transparency note: This was developed collaboratively with Claude (Anthropic's AI). The hypothesis and direction are mine; the systematic testing, frequency analysis, and typological comparison were done with AI assistance. The AI was also used to stress-test the model — the initial assessment was actually quite harsh, identifying major gaps (gallows, semantic unfalsifiability, cherry-picked examples) before we refined the framework. I mention this because I think the methodology is legitimate and worth being honest about.
Thanks for reading. Looking forward to being told why I'm wrong.