The Vowel Bridge Model fits the statistical peculiarities of the VMS remarkably well - an overview
I would like to take a step back and summarize why the Vowel Bridge Model (VBM) is, in my view, a serious candidate for a solution to the VMS.
First of all, an important point: the VBM does not invent everything from scratch. It builds on observations and work that are already well known in Voynich research: Stolfi's slot theory, Currier's A/B observations, the work on LAAFU, the studies on line starts and line endings, and observations by Elmar Vogt, tavie, Emma May Smith, and Patrick Feaster. Feaster's rules on word boundaries are especially important here.
The VBM takes these findings seriously and is based on these observations, but it places them in a different context.
First, here’s a brief overview of how the model works:
The decisive difference from many other solutions and statistical approaches is very simple:
EVA spaces are no longer treated as secure plaintext word boundaries.
Instead, the text is read as a continuous consonant/vowel stream:
V1 | C1 | C2 | V2
Here, "V" is not simply a single letter, but a bigram, a vowel bridge across the EVA space:
VL.VR
VL = the left part of the vowel bridge, that is, the final glyph of an EVA token.
VR = the right part of the vowel bridge, that is, the first glyph of the next EVA token.
So the visible EVA word boundary lies in the middle of the vowel bridge. The actual boundary of a word or syllable, however, often lies rather between:
C1 || C2
C1 = the final side, that is, the final consonant or final consonant cluster of an element.
C2 = the initial side, that is, the initial consonant or initial consonant cluster of the next element.
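To make the VL.VR notation concrete, here is a minimal Python sketch that reads the bridge off every EVA space in a line. The list of multi-character glyph units (MULTI_GLYPHS) and the function names are my own assumptions for illustration, not the working tables of the model.

```python
# Multi-character EVA units treated as one glyph each.
# ASSUMPTION: this list is illustrative only, not the model's own table.
MULTI_GLYPHS = ("cth", "ckh", "cph", "cfh", "ch", "sh", "qo")

def split_glyphs(token):
    """Split an EVA token into glyph units, trying multi-character units first."""
    glyphs, i = [], 0
    while i < len(token):
        for unit in MULTI_GLYPHS:
            if token.startswith(unit, i):
                glyphs.append(unit)
                i += len(unit)
                break
        else:
            # No multi-character unit matched: take a single character.
            glyphs.append(token[i])
            i += 1
    return glyphs

def vowel_bridges(line):
    """Return one 'VL.VR' string per EVA space in the line."""
    tokens = [split_glyphs(t) for t in line.split()]
    # VL = last glyph of the left token, VR = first glyph of the right token.
    return [f"{left[-1]}.{right[0]}" for left, right in zip(tokens, tokens[1:])]

print(vowel_bridges("qokedy qokedy chol"))
```

Whatever remains of a token between its VR and its VL would then be the C2/C1 material, but how that interior is split is not fixed by this sketch.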
This very simple change of perspective suddenly explains a whole series of Voynich peculiarities and statistical anomalies:
1. The repetitions are not necessarily word repetitions
One of the strongest arguments against language has been this: Voynichese contains seemingly absurd repetitions, and their frequency is far above the statistical rate of word repetitions in normal texts:
qokedy qokedy qokedy qokedy
chol chol chol
qokeedy qokeedy qokeedy
daiin daiin daiin
On the EVA word level, this does indeed look almost impossible for a natural language. But in the VBM, these do not necessarily have to be repeated words. They can be repeated stream segments, as I have shown in this thread with several examples.
Example:
qokedy qokedy qokedy qokedy
= e-nd-e-nd-e-nd-e...
This is not absurd in German or Middle High German. Such chains can arise through normal word formation and inflection, for example in structures such as:
"im Elend endenden, den ..." (who end in misery, for)
The point is: the VBM turns a seemingly nonsensical word repetition into a language-like stream. This is because repetitions of glyph sequences within sentences are completely normal.
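The claim that repeated sequences are normal inside a running stream can be checked on the German example itself. The following Python sketch removes spaces and punctuation and then looks for units that repeat back to back. The function names and the minimum unit length of 2 are assumptions of this sketch.

```python
import re

def to_stream(text):
    """Lowercase the text and keep letters only (drops spaces and punctuation)."""
    return "".join(ch for ch in text.lower() if ch.isalpha())

def periodic_runs(stream, min_unit=2):
    """Find units of at least min_unit characters that repeat back to back.

    Returns (unit, number_of_repeats) for each non-overlapping run found.
    """
    pattern = re.compile(r"(.{%d,}?)\1+" % min_unit)
    return [(m.group(1), len(m.group(0)) // len(m.group(1)))
            for m in pattern.finditer(stream)]

stream = to_stream("im Elend endenden, den ...")
print(stream)
print(periodic_runs(stream))
```

For this phrase the stream is "imelendendendenden", and the only run found is the unit "end" four times in a row, although no German word is repeated.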
2. Seven rules for word boundaries become one principle
With seven or eight simple rules, as described by Patrick Feaster, one can account for a very large part of the EVA word boundaries. That would hardly be expected in a normal European language. This is exactly where the VBM comes in.
The VBM reduces these rules, in essence, to a simpler principle: word boundaries are separations of vowel bigrams, VL.VR , where the left side consists of only a small number of glyphs.
And suddenly this fits the picture very well. Vowels are very frequent, so a frequent word-boundary bigram such as "y.qo" standing for "e" makes a lot of sense.
The actual word boundary often lies here:
C1 || C2 that is, between final consonant and initial consonant.
This also explains why EVA words look so stable and artificial, even though the underlying stream may be language-like.
3. Low entropy becomes less mysterious
Voynichese is unusually predictable on the glyph level. For a normal alphabetic text, that would be a problem.
In a slot system, however, this is exactly what one would expect.
If:
C1 constrains C2
C2 constrains VL
VL constrains VR
then the next glyph is often strongly constrained.
That means: the low entropy does not have to speak against language. It can be the result of a mechanical encoding of language.
If vowels are encoded by bigrams, for example:
"y.qo" = "e" then "qo" after y becomes extremely predictable.
If C2 constrains the left part of the vowel bridge, predictability increases further.
If C2 itself works homophonically or positionally, that is, if different consonants or consonant clusters are encoded with similar visible forms, predictability increases again, especially in the e-chains.
The VMS then does not look repetitive because there is no text underneath. It looks repetitive because a natural language stream runs through a very narrow, mechanical slot system.
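"Predictable on the glyph level" is usually measured as conditional entropy: how many bits of uncertainty remain about the next symbol once the previous one is known. A minimal Python sketch of that measure follows. It works on plain characters, so multi-character EVA glyphs are not merged here, which is a simplification.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def conditional_entropy(seq):
    """H(next | previous) in bits, computed as H(pair) - H(previous)."""
    pairs = Counter(zip(seq, seq[1:]))   # adjacent symbol pairs
    firsts = Counter(seq[:-1])           # every symbol that has a successor
    return entropy(pairs) - entropy(firsts)

# A strictly alternating sequence is fully predictable from the previous symbol.
print(conditional_entropy("abababab"))
# A less regular sequence leaves some uncertainty.
print(conditional_entropy("aabb"))
```

If one slot strongly constrains the next, as described above, this value drops, whether or not there is language underneath.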
4. C2 is not a simple letter - the "dy" phenomenon
A central point is the strong relationship between C2 and the final glyph of the token, that is, VL.
In the forum, this has often been described in various forms as the "dy" or "edy" phenomenon, and it has also been described in slot analyses.
Examples: (Only unique C2 clusters were counted)
C2 = eed -> VL = y in 730 out of 732 cases
C2 = ed -> VL = y in 1,332 out of 1,339 cases
Other C2 types behave very differently:
C2 = a -> mainly VL = r or l
C2 = o -> mainly VL = l or r
This is not a free distribution.
C2 and VL apparently belong to the same mechanical system.
This means: C2 cannot be read in isolation.
The unit is more likely:
C2 + VL = route for an initial consonant or initial consonant cluster
VR = vowel value, vowel variant, or selection within the route
This explains why C2 is so difficult to crack. It is probably not a simple substitution table, but an almost polyphonic encryption. This fits perfectly into the period in which the transition between monoalphabetic and polyalphabetic encryption took place.
So one should not only ask: What does eed mean?
but rather:
What does eed.y.qo mean?
What does eed.y.o mean?
What does eed.y.ch mean?
This is more complicated, but it fits the structure of the VMS very well.
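The C2 -> VL counts above can be expressed as shares, and the same tally can be run for any list of (C2, VL) pairs. The Python sketch below assumes the pairs have already been extracted. How C2 is cut out of a token is part of the model and is not reproduced here, and the toy pairs are invented purely to show the mechanics.

```python
from collections import Counter, defaultdict

def vl_table(pairs):
    """Build a table C2 -> Counter of the VL glyphs seen with it."""
    table = defaultdict(Counter)
    for c2, vl in pairs:
        table[c2][vl] += 1
    return table

def dominant_share(counts):
    """Return the most frequent VL glyph and its share of all cases."""
    glyph, n = counts.most_common(1)[0]
    return glyph, n / sum(counts.values())

# Shares implied by the counts quoted in the text.
print(round(730 / 732, 4))    # C2 = eed
print(round(1332 / 1339, 4))  # C2 = ed

# TOY pairs, invented for illustration only (not real manuscript counts).
toy = [("eed", "y"), ("eed", "y"), ("a", "r"), ("a", "l"), ("a", "r")]
table = vl_table(toy)
print(dominant_share(table["eed"]), dominant_share(table["a"]))
```

The two quoted counts correspond to roughly 99.7% and 99.5% of cases ending in y.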
5. The connection between C1 and C2 is expected
The connection between C1 and C2 is not a problem for the VBM either. It is exactly what one would expect.
If the real word or syllable boundary lies between C1 and C2,
... V | C1 || C2 | V ...
then the final consonant of one element stands on the left, and the initial consonant of the next element stands on the right.
In real language, such transitions are not completely free. Certain final consonants combine more often with certain initial classes. Other combinations are rare or almost impossible.
If, on top of that, there is also a cipher logic, this coupling becomes even more visible. The C1-C2 dependency is therefore not an argument against the VBM. It is a logical consequence of encoding language.
6. e, ee, and eee are not free repetitions
Another well-known finding concerns the e-chains.
A single "e", "ee", "eee", and longer chains do not behave like free repetitions of the same sign.
The preceding anchors change systematically:
single e -> often after ch/sh
ee / eee -> much more strongly after k/t, that is, after Gallows
This is not a free counter.
It looks more like anchor-bound composite forms.
So not:
e = 1
ee = 2
eee = 3
but rather:
che
kee
keee
she
tee
...
as bound slot signs or composite signs.
This fits the VBM because e-runs are almost always token-internal and hardly ever run across EVA spaces. They therefore do not belong to the vowel bridge, but to the internal C2/composite layer.
This separates two levels:
C2/composite layer: token-internal
vowel-bridge layer: across the EVA space
This too fits well into the model.
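The anchor dependence of the e-chains can be tallied with a few lines of Python. This is only a sketch: the regular expression treats ch, sh and the benched gallows as single anchors, which is my assumption, and e-runs at the very start of a token are ignored because they have no anchor. The sample tokens are illustrative, not a count from the manuscript.

```python
import re
from collections import Counter, defaultdict

# Anchor = glyph directly before a run of e.
# ASSUMPTION: ch, sh and the benched gallows count as one anchor each.
E_RUN = re.compile(r"(cth|ckh|cph|cfh|ch|sh|[^e])(e+)")

def e_run_anchors(tokens):
    """Map e-run length -> Counter of the anchors that precede runs of that length."""
    table = defaultdict(Counter)
    for token in tokens:
        for anchor, run in E_RUN.findall(token):
            table[len(run)][anchor] += 1
    return table

# Illustrative tokens only.
sample = ["chedy", "shedy", "qokeedy", "okeeey", "qoteedy"]
for length, anchors in sorted(e_run_anchors(sample).items()):
    print(length, dict(anchors))
```

Run over a full transliteration, the table would show directly whether single e and ee/eee prefer different anchors.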
7. Near-variants become expected
Voynichese is full of very similar forms:
qokedy
qokeedy
qokeey
qokey
okeedy
okeey
On the word level, this looks like an artificially dense vocabulary.
In the VBM, it is expected.
If EVA tokens are only visible cuts through a slot stream, many near-variants automatically arise. This is especially true because the variation mostly affects C2, while the connecting vowel bridges are simply the most frequent plaintext letters, such as "e", and the C1 slots are likewise frequent plaintext consonants, such as "n".
A different vowel-bridge element, a different C2 block, a shifted cut - and a new EVA token appears that looks very similar to another one.
So this is not an artificial lexicon. It is the result of the VMS using a false word segmentation as camouflage.
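How dense such a family of near-variants is can be measured with the edit distance between tokens. The Python sketch below uses the standard Levenshtein distance on plain characters and lists all pairs that differ by one insertion, deletion, or substitution. The threshold of 1 is my choice for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def near_variants(vocab, max_dist=1):
    """Return all pairs of distinct tokens within max_dist edits of each other."""
    vocab = sorted(set(vocab))
    return [(a, b)
            for i, a in enumerate(vocab)
            for b in vocab[i + 1:]
            if edit_distance(a, b) <= max_dist]

family = ["qokedy", "qokeedy", "qokeey", "qokey", "okeedy", "okeey"]
pairs = near_variants(family)
print(len(pairs), "of", len(family) * (len(family) - 1) // 2, "pairs")
```

For the six forms listed above, 8 of the 15 possible pairs are only one edit apart.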
8. Hapaxes become less problematic
The VMS has many hapaxes. And many similar words.
This is not surprising in the VBM either.
If EVA words are not plaintext words, but excerpts from a running syllable and morpheme stream, many visible tokens can occur only once, even though their building blocks are very frequent.
German and Middle High German constantly work with short elements such as:
-en
-er
-es / -ez
-end
-de
-den
-der
-se / -ze
etc.
This is where the frequent tokens lie in EVA. But of course there are also many words in which these frequent syllable parts do not occur. And since the system cuts between consonants, many hapaxes must inevitably arise. The distribution between hapaxes and the seemingly similar words of the known families therefore also appears to be a logical consequence of the VBM.
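For reference, this is how the hapax share of a token list can be computed. It is a minimal Python sketch with a made-up token list, and the function name is my own.

```python
from collections import Counter

def hapax_stats(tokens):
    """Return (number of hapaxes, number of distinct tokens, hapax share of types)."""
    counts = Counter(tokens)
    hapaxes = [t for t, n in counts.items() if n == 1]
    return len(hapaxes), len(counts), len(hapaxes) / len(counts)

# Made-up token list for illustration only.
tokens = "daiin daiin chol qokedy chol okeey".split()
print(hapax_stats(tokens))
```

The same function can be applied to EVA tokens and to any alternative segmentation of the stream, so that the two hapax shares can be compared.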
9. LAAFU, LSM, and LEM get a function, and their statistical peculiarities become at least theoretically logical
Line beginnings and line endings are among the most striking problems of the VMS.
In the VBM, line beginnings and line endings are not neutral. And this fits a problem the VMS has:
The running stream has to begin, continue, adapt, or end at a line. Depending on whether the following plaintext begins with a vowel, a consonant, or a cluster, the stream has to be started differently - while still remaining inside the stream.
LSM, meaning Line Start Marker, and LEM, meaning Line End Marker, therefore do not have to be normal words.
They could be phase markers, continuation markers, or boundary markers inside the encoding, indicating for each line how the stream is started in this particular case.
This raises the question why the stream has to be restarted in every line. The answer is simple: because the stream is essentially a forced description of natural language. It often fits, but not always. And unusual words, among other things, disturb it considerably. So it is easier to restart it in every line.
Depending on how it is started, the lines become longer or shorter. And this is one of the statistical peculiarities of LSMs: they lead to longer and shorter lines, depending on what insertions are needed to start the stream.
The change in word length over the course of a line may also be connected with this. If the stream begins at the start of a line in a relatively clean VCCV-like structure, it may become increasingly complicated later in the line through inflection, clusters, endings, and function words. This may explain why EVA word lengths change over the course of a line.
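The change of word length over the course of a line is easy to measure. The Python sketch below averages token length by position in the line. Length is counted in EVA characters rather than glyphs, which is a simplification, and the two sample lines are invented.

```python
from collections import defaultdict

def mean_length_by_position(lines):
    """Average EVA token length (in characters) for each position in the line."""
    sums = defaultdict(lambda: [0, 0])   # position -> [total length, token count]
    for line in lines:
        for pos, token in enumerate(line.split()):
            sums[pos][0] += len(token)
            sums[pos][1] += 1
    return {pos: total / n for pos, (total, n) in sorted(sums.items())}

# Invented sample lines, for illustration only.
print(mean_length_by_position(["qokedy chol", "daiin ol ar"]))
```

Positions could also be counted from the end of the line to look at line-end effects separately.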
10. Currier A/B becomes less mysterious
Currier A and Currier B differ clearly on the surface. But the underlying slot logic can still remain the same.
That is exactly what one would expect from a table system. Different sections, hands, or conventions may use different routes, while the basic mechanism remains stable. So the VBM does not have to explain Currier A/B away. It can treat Currier A/B as different manifestations of the same mechanical system.
11. The model produces language-like streams
On the EVA word level, the VMS looks alien.
Under VBM segmentation, however, the text begins to look like a consonant/vowel stream:
V - C - C - V - C - C - V ...
That is exactly what one would expect if a natural language were encoded through a position-dependent mechanism.
The perspective shifts:
Not:
EVA word = word
but:
EVA token = visible segment of a running stream
This turns stubborn Voynichese into a system that produces very clear language-like structures, as can be seen perfectly in the translations of the word repetitions.
12. The clue on f8v
On f8v, the plant shows a striking similarity to hazel or hazelnut. On the same page, there is the unusual chain:
okcholksh chol chol chol cthaiin
Under the current VBM working values, the strong inner part produces a stream like the following (probably the only option; I have not been able to find any others):
"nuzze zezzen"
This is remarkable because a German / Middle High German pharmaceutical text contains a very similar formulation:
"hasel-nuzze zezzen"
meaning roughly: to give hazelnuts to eat.
I do not claim that [link] is solved by this. But image, structure, and pharmaceutical parallel point in the same direction. This is a strong indication that the VBM works. It is not yet proof, I know.
13. What the VBM cannot yet do
The VBM does not yet provide a complete plaintext.
It does not yet finally clarify which language was encoded, although MHG / NHG are very likely because of their structure.
It does not yet completely clarify how C2 works.
It cannot yet explain why P-lines contain more Gallows than other lines.
But it offers a coherent explanatory framework for many central Voynich peculiarities:
- apparent word repetitions
- strange EVA spaces
- low entropy
- slot-like word structure
- C1-C2 dependency
- C2-VL dependency
- e/ee/eee as a composite layer
- near-variants
- hapaxes
- LAAFU effects
- line-start and line-end effects
- Currier variation
- language-like V/C streams beneath the surface
- possible botanical and textual cribs
The VBM should not simply be seen as just another reading attempt!
Rather, it is a clear structural model that, with very simple and few assumptions, explains a large part of the statistical peculiarities and gives them a meaning.
To my knowledge, it is one of the strongest structural models for capturing the VMS as a system and a possible language.
Jost