The IVTFF cleanly separates captions (next to drawings) from the rest of the text.
I thought that captions could be semantically rich, and maybe different from the rest of the text from a grammar perspective.
Looks like they are.
Captions overwhelmingly follow this pattern: o-K-V-F (o + stop + vowel + final consonant), repeated 1-2 times. The ch/sh and e slots are used sparingly. What's more, captions don't carry the A/B distinction signal.
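To make the pattern testable, here is a rough sketch of a matcher. The glyph classes are my assumption (EVA gallows t/k/p/f as stops, a/o as vowels, d/l/r/n/m/y/s as finals), and the names `matches_caption_pattern` and `pattern_share` are just illustrative, so adjust them to whatever classes you prefer.

```python
import re

# Assumed glyph classes in EVA transliteration; these are illustrative guesses.
STOPS = "tkpf"       # K: gallows glyphs treated as stops
VOWELS = "ao"        # V
FINALS = "dlrnmys"   # F

# One o-K-V-F unit, repeated once or twice, covering the whole word.
CAPTION_RE = re.compile(rf"(?:o[{STOPS}][{VOWELS}][{FINALS}]){{1,2}}")


def matches_caption_pattern(word: str) -> bool:
    """Return True if the whole word is one or two o-K-V-F units."""
    return CAPTION_RE.fullmatch(word) is not None


def pattern_share(words) -> float:
    """Fraction of words that fit the o-K-V-F pattern (0.0 for no words)."""
    words = list(words)
    if not words:
        return 0.0
    return sum(matches_caption_pattern(w) for w in words) / len(words)
```

Running `pattern_share` on caption words and on running-text words separately would give the two numbers to compare.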
So... what if captions were surfacing semantically meaningful words, whereas words not containing o-K-V-F were just "elaboration"? We would then have two channels in Voynich.
I ran some stats, and look at the results:
Semantic channel (o, a, t, k, p, f, d, l, r, n, m, y, s): stable across sections, preserved in captions, 73% of all glyphs
Elaboration channel (ch, sh, e, ee, q, i, ii, cXh): varies by line position, varies by A/B "language," largely absent from captions.
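For anyone who wants to reproduce the split, a minimal sketch of how a word could be divided into the two channels. It assumes EVA transliteration, reads "cXh" as c + gallows + h, and tokenizes greedily (multi-character glyphs first). The names `tokenize` and `split_channels` are hypothetical, and anything outside the semantic list falls into elaboration, which is a simplification.

```python
import re

# Semantic-channel glyphs as listed above.
SEMANTIC = set("oatkpfdlrnmys")

# Longest alternatives first so "cth" beats "ch", "ee" beats "e", "ii" beats "i".
TOKEN_RE = re.compile(r"c[tkpf]h|ch|sh|ee|ii|[a-z]")


def tokenize(word: str) -> list[str]:
    """Split an EVA word into glyph tokens, preferring multi-character glyphs."""
    return TOKEN_RE.findall(word)


def split_channels(word: str) -> tuple[list[str], list[str]]:
    """Return (semantic glyphs, elaboration glyphs) in their original order."""
    glyphs = tokenize(word)
    semantic = [g for g in glyphs if g in SEMANTIC]
    elaboration = [g for g in glyphs if g not in SEMANTIC]
    return semantic, elaboration
```

For example, `split_channels("qokeedy")` separates o, k, d, y from q, ee.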
Semantic words have 29% redundancy => reasonable, close to natural language.
Elaboration has only 3.2% redundancy => it's essentially memoryless. It barely depends on the previous elaboration. This is consistent with elaboration being either random padding, a simple positional marker, or an independent cipher layer.
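One way to read "redundancy" here is the share of a token's entropy that is explained by the previous token, i.e. 1 - H(next | prev) / H(next). That definition is an assumption on my side, and the sketch below (with the hypothetical name `sequential_redundancy`) only illustrates that reading on a sequence of per-word channel strings.

```python
from collections import Counter
from math import log2


def entropy(counts: Counter) -> float:
    """Shannon entropy in bits of the empirical distribution in `counts`."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def sequential_redundancy(seq) -> float:
    """1 - H(next | prev) / H(next), estimated from adjacent pairs.

    Assumed definition: 0.0 means memoryless, 1.0 means the next item is
    fully determined by the previous one.
    """
    seq = list(seq)
    pairs = list(zip(seq, seq[1:]))
    h_next = entropy(Counter(b for _, b in pairs))
    h_prev = entropy(Counter(a for a, _ in pairs))
    h_joint = entropy(Counter(pairs))
    h_cond = h_joint - h_prev  # chain rule: H(next | prev) = H(prev, next) - H(prev)
    return 0.0 if h_next == 0 else 1 - h_cond / h_next
```

Note that plain plug-in estimates like this are biased on small samples, so the figures are only comparable between channels of similar size.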
Entropy Comparison
Each Voynich word carries approximately:
- 7.73 bits of semantic core information (the message)
- 2.78 bits of elaboration information (position + dialect + some morphology)
- Total: 10.51 bits per token
The elaboration's 2.78 bits decompose further into:
- PREFIX (~1.76 bits): primarily encodes line position (ch/sh at start, ∅ at end, q in middle)
- INFIX (~3.38 bits): encodes section dialect and some core-specific morphology
These two sub-channels share only 0.147 bits of mutual information (8.4% of prefix entropy), so they are nearly independent.
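For reference, mutual information between the prefix and infix slots can be estimated from per-word (prefix, infix) pairs as I(P; I) = H(P) + H(I) - H(P, I). The sketch below is a plain plug-in estimate; `mutual_information` is an illustrative name, and the empty string stands in for ∅.

```python
from collections import Counter
from math import log2


def entropy(counts: Counter) -> float:
    """Shannon entropy in bits of the empirical distribution in `counts`."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def mutual_information(pairs) -> float:
    """Plug-in estimate of I(X; Y) = H(X) + H(Y) - H(X, Y) in bits."""
    pairs = list(pairs)
    h_x = entropy(Counter(x for x, _ in pairs))
    h_y = entropy(Counter(y for _, y in pairs))
    h_xy = entropy(Counter(pairs))
    return h_x + h_y - h_xy


# Share of prefix entropy, using the figures quoted above.
share_of_prefix = 0.147 / 1.76  # about 0.084, i.e. 8.4%
```

Independent slots give a value near 0 bits, while a prefix that fully determines the infix gives the full infix entropy.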
Let me make two conjectures:
1) the Voynich has two layers: a semantic core (73% of glyphs, 49% of vocabulary) and an elaboration layer (27% of glyphs, carrying almost no sequential information)
2) proposed word architecture: [ELAB_PREFIX] + [SEMANTICAL_PREFIX] + [SEMANTICAL_STEM] + [SEMANTICAL_SUFFIX] + [ELAB_INFIX] + ...
- ELAB_PREFIX: ch/sh/q
- SEMANTICAL_PREFIX: o/a/∅
- SEMANTICAL_STEM: k/da/ka/...
- SEMANTICAL_SUFFIX: y/n/l/r/m/∅
- ELAB_INFIX: ii/ee/e/i
Example decompositions:
Full word   Elabprefix   SemPrefix   Stem   SemSuffix   Elabinfix   ...
chol        ch           o           —      l           —
okaiin      —            o           ka     —           ii          ...
daiin       —            ∅           da     —           ii          ...
shedy       sh           ∅           —      —           e           + (d=stem, y=suffix)
qokeedy     q            o           k      —           ee          ...
Thoughts?