|
|
|
Voynich Manuscript Using the 80/20 Matrix https://doi.org/10.5281/zenodo.18715735 |
|
Posted by: Denny92 - 21-02-2026, 02:17 AM - Forum: Theories & Solutions
- Replies (3)
|
 |
The Balkan Codex: Structural Decoding of the Voynich Manuscript Using the 80/20 Matrix.
Authors/Creators
van Gulik, Denny
Description
REVELATION: The Balkan Codex
Title: Structural Decoding of Beinecke MS 408 (Voynich Manuscript) Using the 80/20 Matrix
Author: Denny van Gulik
Date: February 2026
Place: Gnarrenburg, Lower Saxony
1. Subject of the Research
This work presents the complete decipherment of the Voynich Manuscript (MS 408). In contrast to previous cryptographic attempts, this study approaches the codex from a structural-engineering perspective. The manuscript is treated not as a secret script but as a functional technical handbook whose language is subject to systematic deformation.
2. Methodology: The 80/20 Matrix
The key to the decoding lies in identifying a specific linguistic composition, defined here as the 80/20 Matrix:
80% phonetically deformed Latin: technical terms of medieval botany and medicine that were made unrecognizable through sound shifts and deliberate letter deformation (EVA character set).
20% Balkan regionalisms: use of terms from the Balkan and south-eastern European region (e.g. Amum for water, Otlar for herbs, Pala for court/palace), which function as bridge vocabulary.
3. Core Findings of the Study
The analysis of 246 pages proves beyond doubt that this is the internal handbook of a medical-pharmaceutical brotherhood.
The institutional framework: The recurring terms Palar (palace/court) and Frater (brothers) point to a highly organized research group that operated under the protection of a ruling house.
Thematic focus: Documentation of thermal extraction (Pokedum), methods of preservation in honey (Melle), and complex balneological facilities for hydrotherapeutic treatment (thermal baths).
Statistical validity: The consistency of the 80/20 Matrix across the entire corpus of 246 pages rules out the theory of a "meaningless text" (hoax) or of a pure cipher without an underlying grammar.
4. Significance for Scholarship
With the submission of this work, the enduring riddle of MS 408 is solved. The work offers philologists, medical historians, and botanists a complete basis for translation and opens new insights into pharmaceutical practice and the logistical structures of learned communities of the 15th century.
|
|
|
| Theory that the final section is a multilingual glossary |
|
Posted by: eggyk - 20-02-2026, 04:43 PM - Forum: Theories & Solutions
- Replies (10)
|
 |
Theory
My speculation is that the recipes section may be a multilingual glossary. Each subsection (assumed to start at a word beginning with p) would then open by listing the different ways the object's name is pronounced, covering both dialectal differences and language differences. The repeated words, such as "okedy okeedy", would be similar to each other because they are phonetic renderings of the different names that people use for the same object.
An example of what I mean in English is something like this (written with vowels in IPA to signify my attempt at some different dialects):
Bellis Perennis: Also Deɪzi, Deɪzɪ, Dɛɪzi, also Deese or Deezɛ, rarely called Deɪz aɪs, is found in...
(Bellis Perennis: Also "day-zee", "day-ziih", "dayy-zee", also "Deh-seh" or "Deh-ze", rarely called "Dayz eyes", is found in...)
Malus domestica: Pomme, Poma, also Apfel, Appel, sometimes Malum or mala, is found in....
How this fits with what's known about the VMS
Use Case
The use case for such a glossary is quite straightforward. The author wanted themself -or anyone who could understand the script- to be able to know the different names that people have for various things. If they lived in a fairly multicultural area, or an area with frequent through-traffic, having a knowledge of how certain things are called would be especially useful. I imagine there would be a LOT of overlapping names too (especially with plants and herbs), with one culture differentiating between two similar things where another doesn't.
Such a place could be somewhere along the trading routes that ran between Italy and western Europe, with frequent travellers of various tongues. If the author wanted to buy, sell, acquire or find a specific plant for use, knowing that some people say "day-zee" and some say "Deh-Seh", or that some people say "pomme" and others "Poma", is probably very useful.
Lower quality parchment/drawing/decoration
If the document was intended to be used, perhaps day to day, outside in the rain, during travels, during preparation of materials or other activities, as opposed to only being read in an academic context, it makes sense to use a slightly less expensive material for it. That could also be a reason for the lower quality drawings and colouring. Why waste time making a perfectly decorated manual if there's a good chance it will smudge or be ruined during the intended use?
I thought I would use this thread to discuss the merits of this theory (which I'm sure is not unique, of course) but also to post some things that I've noticed that led me to it. The first of these is a re-transcription of some of the first lines of subsections using a different alphabet, which I will post immediately under this post.
Looking for signs of this theory in the text
Effects of using a specific transliteration alphabet
When looking for words that are potentially similar to one another, the transliteration alphabet that you use has an effect. For example, EVA k and t look very similar to each other on paper, yet sounded out in EVA they are quite different. The choice of which letters to use is somewhat arbitrary, yet for this task it has a huge effect.
In order to make the transliterated alphabet easier to sound out, I'm adjusting the EVA and using that for these examples. As long as the transliteration is consistent, our choice of specific letter used to represent each symbol doesn't matter for these purposes. This is just to demonstrate the potential properties of the words.
For clarity, I will use BOTH the EVA and my adjusted version in any examples.
The adjustments to EVA and their reasoning
Adjusted EVA: k = tl, t = thl, l = th, y = -us / con-, m = ré, ch = er, sh = ér
The most important changes are to take similar looking Voynichese symbols and assign them letters that are closer in sound than in EVA.
l: based on it sort-of looking like a cursive Greek theta ϑ or the letter thorn Þ (which often resembled wynn ƿ and y).
I'm using "th" as it's easier to write.
m: looks like r with a flourish, similar to "re" or "te" in some manuscripts. I have chosen "ré" arbitrarily here, with an accented é only for clarity in examples.
k: splitting the gallows into two letters and assigning TL, simply based on it somewhat resembling a TL
t: again splitting the gallows, assigning L the same way as above but interpreting lL instead of TL, making THL
ch: assuming that c is actually e , and the crossbar is a property of h, so ch = eh . h looks like a small cursive r, so "ch" = "er"
sh: same assumptions as "ch", but s = é
q also gives plenty of issues, but for the purposes of this thread I am going to consider q to be a type of contraction, marker or punctuation instead of a plaintext letter. Something like "also, and, +". This is simply an experiment to see if grammar emerges if q is separated from its word and treated this way.
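The substitutions above can be applied mechanically. The following is a minimal Python sketch, not the author's own tool: the names `convert_word` and `convert_line` are hypothetical, and it assumes that y becomes "con" at the start of a word and "us" anywhere else, and that a leading q is split off as a separate marker. A single regex pass is used so that letters produced by one substitution (for example the "l" in "tl") are not substituted again.

```python
import re

# Substitution table taken from the adjusted-EVA list in the post.
SUBS = {"sh": "ér", "ch": "er", "k": "tl", "t": "thl", "l": "th", "m": "ré"}
# Digraphs are listed first so they win over single letters.
PATTERN = re.compile(r"sh|ch|[ktlmy]")


def convert_word(word):
    """Convert one EVA word to adjusted EVA (aEVA)."""
    prefix = ""
    # Assumption: a leading q is a marker and is written as its own token.
    if word.startswith("q") and len(word) > 1:
        prefix, word = "q ", word[1:]

    def repl(match):
        token = match.group(0)
        if token == "y":
            # Assumption: word-initial y reads "con", any other y reads "us".
            return "con" if match.start() == 0 else "us"
        return SUBS[token]

    return prefix + PATTERN.sub(repl, word)


def convert_line(line):
    """Convert a whole space-separated EVA line."""
    return " ".join(convert_word(w) for w in line.split())
```

Running `convert_line` on a transcribed line makes it easy to spot places where a hand conversion and the stated table disagree.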
Example (line 30):
EVA: Polshedaiin qokeoy keol chokeol qotedy qoteedy dar raiin shedy qotain oteedy
aEVA: Pothéredaiin q otleous tleoth erotleoth q othledus q othleedus dar raiin éredus q othlain othleedus
There are a few things of note here.
1) Words that did not obviously relate to each other in EVA suddenly seem far more alike. Compare "qokeoy / qoteedy" vs "otleous / othledus". It seems far more likely that some people may say "otleous" and some may say "othledus". It's less likely that someone may say "okeoy" and someone else says "oteedy".
2) Words similar to the constituent parts of "Pothéredaiin" are found after "dar raiin". P-oth-éred-aiin contains "éred" and "oth-aiin", and "éredus" and "othlain" are seen in the sentence. This is probably coincidence, but it's conceivable that someone could shorten "pothéredaiin" to "éredus".
3) The first words to not be part of a string of similar repeated words is "dar raiin". The structure is something like (repeated words),(repeated words), dar raiin (slightly different words). I will discuss this further on, but this structure matches other first lines of other subsections.
Comparing first lines of subsections - f108v
EVA
1) Pchedal qokeedar otedy qokeedy lky ltal aiin oteo pcheey otedar am ol
2) Polaiin okedain okal otchedy qokeedy raraiin okeedy qokar qokal dam
3) Pchedaiin okedy otedal lkedeed okedar okeey qoteol lkedy oteo raiin am
4) Pcheor okear sheey qokeey ykeealkey raraiin opsholal shedy oparam oty
5) Polkeedal sheokchey lotedaiin otedy opchedaiin otshedy qotey raiin ol
6) Polshedaiin qokeoy keol chokeol qotedy qoteedy dar raiin shedy qotain oteedy
aEVA
1) Peredath q otleedar othledus q otleedus thtlus thtlath aiin otleo pereeus othledar aré oth
2) Pothaiin otledain otlath othleredus q otleedus raraiin othleedus q otlar q othlath daré
3) Peredaiin otledus othledath thtledeed otledar otleeus q othleoth thtledus othleo raiin aré
4) Pereor otlear éreeus q otleeus contleeathtleus raraiin opérothath éredus opararé otus
5) Pothtleedath éreotlereus thothledaiin othledus operedaiin othléredus q othleus raiin oth
6) Pothéredaiin q otleous tleoth erotleoth q othledus q othleedus dar raiin éredus q othlain othleedus
aEVA with punctuation
1) Peredath: Also otleedar, othledus and otleedus, thtlus, thtlath, aiin otleo pereeus othledar aré oth
2) Pothaiin: otledain, otlath, othleredus and otleedus, raraiin othleedus and otlar and othlath daré
3) Peredaiin: otledus, othledath, thtledeed, otledar, otleeus, also othleoth, thtledus, othleo raiin aré
4) Pereor: otlear, éreeus, also otleeus contleeathtleus raraiin opérothath éredus opararé otus
5) Pothtleedath: éreotlereus, thothledaiin, othledus, operedaiin, othléredus, and othleus raiin oth
6) Pothéredaiin: Also otleous, tleoth, erotleoth, also othledus and othleedus dar raiin éredus and othlain othleedus
There is obviously a lot of work and analysis to go into this, but this is far enough for now.
Edit: it seems that posting a reply simply adds it to the OP, oh well
|
|
|
| New Phonetic-Pedagogical Theory: MS 408 as a Parental Survival Guide |
|
Posted by: Rodrigo - 20-02-2026, 09:15 AM - Forum: Voynich Talk
- Replies (2)
|
 |
Hi everyone,
I’ve been studying the manuscript from a human perspective. My theory is that this isn't a complex military code, but a phonetic guide for oral transmission from parents to children.
Key points of my hypothesis:
- The "8" symbol: I believe it represents a polarity marker (YES/NO), used as a quick visual cue for a child's behavior or safety.
- Repetitions: These are not errors. They represent the musical rhythm of a voice (like a lullaby or a mantra) to help the child memorize survival advice.
- Initial glyphs: They mark the beginning of a "lesson" or a song.
The manuscript is a tool to preserve a family’s voice and knowledge. I’d love to know if anyone else has looked at the "8" as a simple binary instruction for phonetic teaching.
Best regards!
|
|
|
| Scribes and elaborate gallows |
|
Posted by: Koen G - 19-02-2026, 06:39 PM - Forum: Analysis of the text
- Replies (9)
|
 |
I've been wondering for a while whether there is any connection between LFD's scribes and the use of ornate & special gallows. I just spent some time skimming each page of the manuscript, paying special attention to top lines. This is far from exhaustive, but should be sufficient to start a discussion.
Some preliminary notes:
- I did not pay too much attention to circular diagrams and left Scribe 4 out of the equation. The ornate gallow is more of a thing for paragraphs. There is a clear exception in f57v, with some very loopy gallows. This is the only diagram not done by Scribe 4. It is also part of weird Quire 8, and hence cursed.
- Long-P stretching over various glyphs is omnipresent and cannot be used as a discriminating factor.
- Some areas of the MS are more prone to get fancy gallows than others. (Relative scarcity in some pharma and Q20 pages). This may be due to layout, text type or scribal preference.
Observations
- Scribe 1 masters the "bridging gallow", connecting one word with another. I found only one modest instance by scribe 2 (f80r), though there may be more that I overlooked.
- Moving one tier down, we also notice that Scribe 1 likes to add extra loops. This is rare in other scribes. However, notice the top red arrow. Scribe 1 and scribe 3 have the exact same shape here. This cannot be a coincidence.
- One tier down again from the red arrow, scribe 2 comes in with the "twist" (f76v mark), moving the top of the gallows to the right of the legs. Scribe 1 also has twists, but Scribe 2 has his own recognizable varieties.
- One tier down again (f22r mark), I noted that all scribes have a few lanky, awkwardly looped gallows.
- At the bottom, there are some uniquely embellished ones. Scribe 1 likes dots. Scribe 2 has the "ribbon" and groups of 3 lines.
- The second red arrow shows another instance where scribe 3 does exactly the same thing as Scribe 1: scallops on top of the long horizontal, dots beneath. (These are only a few folios removed in the current binding! 102r-105r)
Conclusions
Abnormal/ornamental gallow usage has constants throughout the manuscript (long P, taller gallows on top lines), but some scribes leave their own marks. Scribe 1 likes the most variation, with bridging gallows and extra loops. Scribe 2 thinks about going there sometimes, but is much more careful and restrained. Despite having produced a lot of text, Scribe 3 does not use many ornamental gallows. When he does, he acts very much like Scribe 1.
Finally, there is also this page, but I'm not even sure which scribe this is supposed to be. The Rosettes foldout is one of those where several scribes were active. I see TWO instances of rare gallows on this one odd page: a modest bridge, and a Scribe 1-style extra loop.
|
|
|
| The ofchdady nymph of Gemini (the Twins) - February 2026 |
|
Posted by: pjburkshire - 19-02-2026, 04:37 PM - Forum: Imagery
- Replies (11)
|
 |
When I look at the nymph in the top-left of f72r2 - Gemini (the Twins), it looks to me like she is standing on grass. What do other people think?
The ofchdady nymph of Gemini (the Twins)
- It looks like she is standing on grass
- It looks like she is standing on a bed of nails
- It looks like she is standing on something else
- She is not standing on anything
- Not sure, can't tell, image too damaged to know
Also, there is something on her left ankle. I don't know if that is a smudge or part of the illustration. If it is part of the illustration, what is it? An ankle bracelet? A bandage?
..........................
Edit: I probably should have made the option as "unidentified" and not "something else". If you think it could be either grass or a bed of nails, I count that as "unidentified" and would mark it as "something else".
|
|
|
| The oddities of the bigram "ed" pt. 3 : It's not just "ed" |
|
Posted by: Dunsel - 19-02-2026, 03:41 AM - Forum: Analysis of the text
- Replies (18)
|
 |
My two previous posts in this series contain what leads up to this post.
"ho"mogenized
In my first post, I led off with the striking chart that everyone seemed to like. I'll start this post the same way.
What you're looking at there is the bigram "ho" compared to "ed" normalized with my 0ed pages on the left, ed+ pages on the right.
Ok, that chart is a bit jagged. Let me bring that into focus.
That is the count of ho and ed normalized per 1000 tokens. The vertical bar represents where my ed0 ends and ed+ begins. These two charts are essentially showing that the bigram ho and the bigram ed are swapping midfix dominance. All of those pages in the ed+ where they do swap are where ed has its lowest counts. So even on pages where ed occurs only a few times, ho is still the dominant midfix.
In this chart, I've sorted by ho/ed count, where ho's highest count is on the left and ed's highest count is on the right. Their counts are normalized by page word count. I've shaded the background to show my ed0 in grey, ed+ in blue. Now this chart is striking. And it begs the question, "Are Currier's A and B languages and my ed0 and ed+ correct???"
Here's ho compared to ok, which is also a midfix token and shares a similar word count. What's different about ho compared with other bigrams introduced in 0ed is that its count doesn't increase with the folio word count.
Its density per folio changes.
0ED: 0.264
ED+: 0.056
Here's ed's density for comparison
0ED: 0.00068
ED+: 0.182
ho isn't the only one that changes but it's by far the largest and that 0ed density is exactly what makes it seem completely opposed to ed. There are other changes in bigrams but not nearly this drastic.
In this chart, blue is the 0ed pages, orange the ed pages. What I did for this chart is chop off the unigram prefix and suffix of every word and look at what was left in the middle. I then took the top 100 cores by count and looked at letter count. H and O dominate the 0ed side. And they're very much present in the ed+ side. But e and d see a large change in count. Also notice that the counts of h and e are almost perfectly interchangeable between 0ed and ed+.
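The core-extraction step described above might look like the following in Python. This is only a sketch under assumptions, not the script behind the chart: the function names are hypothetical, words of one or two letters are skipped because nothing is left after trimming, and letter counts are weighted by how often each core occurs.

```python
from collections import Counter


def core(word):
    """Drop the first and last letter of a word, keeping the middle."""
    return word[1:-1]


def top_core_letters(words, top_n=100):
    """Count letters inside the top_n most frequent word cores."""
    # Assumption: words too short to have a middle are ignored.
    cores = Counter(core(w) for w in words if len(w) > 2)
    letters = Counter()
    for c, count in cores.most_common(top_n):
        for ch in c:
            # Assumption: each letter is weighted by the core's frequency.
            letters[ch] += count
    return letters
```

Calling `top_core_letters` once on the 0ed pages and once on the ed+ pages would give the two letter profiles to compare.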
Now, looking at the top 100 bigrams across both 0ed and ed+ you can easily see where ho got demoted, ed got promoted, along with a lot of bigrams. None of those other bigrams work like ho. None have a doppelganger in the 0ed side. (that I've found)
What Currier Saw & Theory
So, I hope this makes you ask a few questions. I can tell you that I have lots more charts that try to prove why this happened. I've looked into novelty, bigram usage, bigram depletion and just a gob more aspects of these two regimes and I can't come up with one solid answer. All of the tests that I've performed say that nothing stood out. Well, except for one thing. Currier's eyes.
This is going to delve into theory. And I'm not saying I'm right, I'm saying this is what it suggests to me. Please don't be too brutal, my theories change on a daily basis.
I think I've already shown that Currier spotted these differences some 50 years ago and that my ed0/ed+ pretty much aligns with his language A and language B. He had the time, desire and I believe some punch cards, to make spotting this difference easier. I think I've also shown that this is not section specific. There are plenty of Herbal pages with the bigram ed in prominence. So, the question is, why has nobody until Currier spotted this? And why has nobody been able to show this distinction to this degree until now?
Compare these two charts. The first one is split on my ed0 and ed+ The second, is in original folio order.
Had the Voynich been in my 0ed/ed+ order, I "believe" that spotting this regime shift would have happened much earlier. I "believe" it would have been blatantly obvious. 100+ pages you have gobs of ho in the middle of the word and then, it's mostly gone and ed is in the middle of lots of words.
In my first post, I demonstrated how ed pages are on specific sheets (with one low count exception) in the herbal section. If you look at the 2 charts above, had they been in 0ed/ed+ order there would have been over 100 pages (2/5) of the book with no ed and then, it would suddenly appear as a midfix and become dominant. And I demonstrated how these sheets are wrapped around sheets with 0ed (first post). By interleaving those sheets, that 100 page gap was cut roughly in half (f26r is where ed begins in folio order).
Now, I'm going to have to freely admit, I did not have the mathematical skills to come up with this. I knew what I was looking for but the math was a bit beyond me. I asked ChatGPT to come up with a formula that would allow me to detect visual differences in pages by their text composition. And here's what it came up with and what I plugged into a Python script.
So, if you have the mathematical skills, PLEASE confirm this is a valid method. I've looked it over, it makes sense to me but I'm not a mathematician.
GPT:
For each page, we compute five very simple surface features:
- HO density per word
- ED density per word
- Gallows density per word
- Mean token length
- Top-5 bigram concentration (how dominant the most common bigrams are)
Then we build a vector like this for each page:
[1, HO_per_word, ED_per_word, Gallows_per_word, Mean_token_length, Top5_bigram_share]
That leading 1 is just the intercept term.
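The feature vector described above could be computed along these lines. This is a minimal sketch, not the script used for the charts in this post: the function name `page_features` is hypothetical, "density per word" is assumed to mean bigram occurrences divided by the number of words on the page, and the gallows are assumed to be the four EVA letters k, t, p and f.

```python
from collections import Counter

# Assumption: the four basic EVA gallows glyphs.
GALLOWS = set("ktpf")


def page_features(words):
    """Return [1, HO, ED, gallows, mean length, top-5 bigram share]."""
    if not words:
        raise ValueError("page has no words")
    n = len(words)
    # Count every character bigram inside each word.
    bigrams = Counter(w[i:i + 2] for w in words for i in range(len(w) - 1))
    total = sum(bigrams.values())
    top5 = sum(count for _, count in bigrams.most_common(5))
    return [
        1.0,                                                 # intercept term
        bigrams["ho"] / n,                                   # HO per word
        bigrams["ed"] / n,                                   # ED per word
        sum(ch in GALLOWS for w in words for ch in w) / n,   # gallows per word
        sum(len(w) for w in words) / n,                      # mean token length
        top5 / total if total else 0.0,                      # top-5 bigram share
    ]
```

Each page then becomes one row of the matrix that the regression is fitted on.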
Then we fit a simple linear regression:
score = w₀ + w₁·HO + w₂·ED + w₃·Gallows + w₄·Length + w₅·Top5
The weights (w’s) are solved with ordinary least squares to best separate:
- 0ED pages → target = 0
- ED+ pages → target = 1
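For anyone who wants to check the method, here is a small pure-Python sketch of the fit described above. It is an illustration under assumptions, not the original script: the function names are hypothetical, the weights are obtained from the normal equations (XᵀX)w = Xᵀy by Gaussian elimination, and a page is called ED+ when its score is at least 0.5 (the threshold is an assumption, since the post does not state one). A library routine such as `numpy.linalg.lstsq` would be the usual choice in practice.

```python
def solve(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Ordinary least squares weights via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, row, threshold=0.5):
    """Return 1 (ED+) if the linear score reaches the threshold, else 0."""
    score = sum(w * x for w, x in zip(weights, row))
    return 1 if score >= threshold else 0


def accuracy(weights, rows, targets):
    """Fraction of pages whose predicted class matches the target."""
    hits = sum(predict(weights, r) == t for r, t in zip(rows, targets))
    return hits / len(rows)
```

Note that an accuracy measured on the same pages the weights were fitted on tends to be optimistic; holding some pages out of the fit gives a fairer estimate.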
And here's the chart sorted in ed0 - ed+ order
So, it is claiming that, just by measuring those surface textures with that formula, it can predict whether a page is 0ed or ed+ with an 89.8% accuracy rate.
Now, here's what the same chart looks like sorted in folio order.
So, here's my theory. The reason this regime shift wasn't detected until Currier was that it was intentionally obscured by shuffling pages around and, by skipping over to other sections (pharma and zodiac) and then, coming back to finish the herbal section. I believe, that if they hadn't shuffled those pages, and if this were written in the 15th century, then anyone with linguistic skills, like an adept scribe or cryptographer, would have spotted it back then.
Now, this shuffling of pages in no way excludes an intentional production process in which the folios are mostly in chronological order, or the possibility that it was the result of the manuscript being rebound once or twice. Right now, my Occam's Razor detector is saying obfuscation.
Conclusion
This concludes my 3 part series on the oddities of "ed". I'm hoping I've given everyone a lot to think about. I have one more series planned and I hope it's going to be no less... erm... informative. It's going to turn the focus back on repair with an attempt to reverse engineer the Voynich.
Thanks for all the great replies and good hunting.
|
|
|
Comprehensive Decipherment of the Voynich Manuscript: A New Linguistic Approach |
|
Posted by: kentalbix - 18-02-2026, 07:27 PM - Forum: The Slop Bucket
- Replies (1)
|
 |
The Albi Tirado Method: A Definitive Solution to the Voynich Script Mystery
Dear Voynich Community,
My name is Ronald Alexander Albi Tirado. Today, February 18, 2026, I am formally presenting the foundational results of my research regarding the decipherment of the Voynich Manuscript (Beinecke MS 408).
I. Methodology
The Albi Tirado Method is based on a systematic linguistic reconstruction that bridges the gap between the "Voynichese" glyphs and their botanical/semantic referents. Unlike previous statistical approaches, my work utilizes a morpho-linguistic anchor.
By identifying the precise botanical species in the illustrations, I have successfully isolated recurring phonetic patterns in the accompanying text. This cross-referencing has allowed for the identification of a stable and consistent syntax throughout the manuscript.
II. Evidence: Analysis of Folio 9v
To demonstrate the functional application of this method, I submit for your review the analysis of Folio 9v.
Taxonomic Correlation: The illustration in Folio 9v has been identified as Viola tricolor (commonly known as the Wild Pansy).
Morphological Evidence: The serrated leaf structure, the tri-colored petal arrangement, and the specific striated primary root depicted in the folio align perfectly with the linguistic labels decoded through my method.
Decipherment Result: The labels surrounding the plant do not represent random characters but precise descriptors of the plant’s properties and classification in a Medieval Romance/Latin dialect.
III. Call for Peer Review
I am officially recording my authorship of this research today. While many have claimed to solve this mystery, the internal consistency of the Albi Tirado Method provides a replicable framework that holds across multiple sections of the codex—including the pharmaceutical and astronomical folios.
I look forward to a rigorous technical discussion with the members of this community. I will be posting further translations and grammatical breakdowns in the coming days.
Respectfully,
Ronald Alexander Albi Tirado
Lead Researcher
February 18, 2026
|
|
|
Amendment to Voynich MS 408: The Syntaxis Volvella |
|
Posted by: PandaRosa - 18-02-2026, 07:14 PM - Forum: The Slop Bucket
- Replies (1)
|
 |
Hi everyone,
I’ve been following the discussions here for a while regarding the rigid morphology of Voynich "words". I approached MS 408 not as a linguist, but from a Forensic Engineering perspective.
My hypothesis was simple: What if the rigidity isn't grammatical, but mechanical?
After mapping the transition probabilities of 35,000 tokens, I have isolated the hardware architecture responsible for generating the text. I call it The Syntaxis Volvella.
![[Image: default.png]](https://zenodo.org/api/iiif/record:18684047:stator_rotor.png/full/!800,800/0/default.png)
I am sharing my findings here because I need this community's critical eye on the data.
1. The Architecture: A 17:13 Differential. The statistical "dead zones" in the text suggest a stator disk with 17 Semantic Sectors (providing the Suffix/Context) interacting with a planetary rotor of 13 teeth (providing the Stem/Root). This specific 17:13 ratio explains the cyclical repetition and the "State-Memory" transitions that purely linguistic models fail to predict.
Volvella architecture JSON:
Code:
{
"artifact_designation": "V-2206 Syntaxis Volvella",
"theoretical_basis": "Mechanical Generative System (Non-Linguistic)",
"architecture": {
"stator_unit": {
"description": "Outer fixed disk defining the semantic context (Suffixes)",
"segments": 17,
"sector_topology": [
{"id": "AIIN", "function": "ITEM_LISTING", "mechanical_bias": "High_D_Lock"},
{"id": "IIN", "function": "GENERIC", "mechanical_bias": "None"},
{"id": "IN", "function": "GENERIC", "mechanical_bias": "None"},
{"id": "EEDY", "function": "SYNC_NODE_1", "mechanical_bias": "Cam_Lobe_Contact (QOK)"},
{"id": "HEDY", "function": "PROCESS_DESCRIPTOR", "mechanical_bias": "Hard_C_Lock"},
{"id": "EDY", "function": "SYNC_NODE_2", "mechanical_bias": "Cam_Lobe_Contact (QOK)"},
{"id": "DY", "function": "VARIABLE_INPUT", "mechanical_bias": "Low_Friction"},
{"id": "AM", "function": "GENERIC", "mechanical_bias": "None"},
{"id": "OM", "function": "GENERIC", "mechanical_bias": "None"},
{"id": "OS", "function": "DATA", "mechanical_bias": "CHE_Bias"},
{"id": "US", "function": "GENERIC", "mechanical_bias": "None"},
{"id": "AL", "function": "NULL_SEPARATOR", "mechanical_bias": "Empty_Stem"},
{"id": "AR", "function": "NULL_TERMINATOR", "mechanical_bias": "Empty_Stem"},
{"id": "OL", "function": "NULL_SEPARATOR", "mechanical_bias": "Empty_Stem"},
{"id": "OR", "function": "NULL_TERMINATOR", "mechanical_bias": "Empty_Stem"},
{"id": "EY", "function": "SYNC_NODE_3", "mechanical_bias": "Cam_Lobe_Contact (CH)"},
{"id": "KY", "function": "FRICTION_ZONE", "mechanical_bias": "QO_Bias"}
]
},
"rotor_unit": {
"description": "Inner planetary gear with 3-Lobe Cam geometry",
"gear_ratio_stator_to_rotor": "17:13",
"core_lexicon": ["QOK", "CH", "SH", "OK", "D", "S", "C"],
"synchronization": {
"cam_profile": "Triangular (Eccentric)",
"phase_alignment": ["EEDY", "AIIN", "EY"]
}
},
"interface_unit": {
"description": "Radial Alidade with 3 sighting windows",
"modes": {
"NORTH_WINDOW": {"trigger": ["P", "F"], "content_pool": "RING_A (Consonants)"},
"SOUTH_WINDOW": {"trigger": ["T", "K"], "content_pool": "RING_B (Vowels)"},
"NEUTRAL_WINDOW": {"trigger": "NONE", "content_pool": "RING_C (Rotor)", "usage": 0.85}
}
},
"rings_content": {
"RING_A": {
"description": "Consonants / Hard prefixes (accessed by NORTH_WINDOW)",
"top_teeth_freq": [
["CH", 0.52],
["C", 0.20],
["CHE", 0.12],
["SH", 0.08],
["OL", 0.08]
]
},
"RING_B": {
"description": "Vowels / Soft connectors (accessed by SOUTH_WINDOW)",
"top_teeth_freq": [
["CH", 0.30],
["E", 0.22],
["C", 0.19],
["A", 0.08],
["EE", 0.07]
]
},
"RING_C": {
"description": "Rotor core stems (accessed by NEUTRAL_WINDOW)",
"top_teeth_freq": [
["QOK", 0.24],
["D", 0.23],
["CH", 0.21],
["S", 0.14],
["OK", 0.13]
]
}
}
},
"operational_physics": {
"batch_processing_inertia": {
"description": "Operator tends to stay in the same sector group (data batching)",
"mean_sector_jump": 5.71,
"median_sector_jump": 5.0,
"percentage_repeat_sector": 13.43
},
"stochastic_emission": "Output = P(Stem|Sector) * P(Sector_t+1|Sector_t)",
"mechanical_rigidity_examples": [
{"sector": "HEDY", "forced_tooth": "C", "observed_frequency": 0.30},
{"sector": "AIIN", "forced_tooth": "D", "observed_frequency": 0.30},
{"sector": "EY", "preferred_tooth": "CH", "observed_frequency": 0.16},
{"sector": "EEDY", "preferred_tooth": "QOK", "observed_frequency": 0.17}
]
}
}
2. The QOK Anomaly. This is the strongest physical evidence. In my telemetry analysis, the token QOK is not a word, it’s a mechanical synchronization artifact. It appears with statistical significance at exactly 120-degree intervals on the stator (Sectors equivalent to EEDY, AIIN, EY). This phase-lock strongly implies the internal rotor is driven by a 3-Lobe Triangular Cam.
![[Image: default.png]](https://zenodo.org/api/iiif/record:18684047:Fig2_QOK_Synchronization.png/full/!800,800/0/default.png)
3. The Turing Test (Simulation Results). I wrote a Python script to simulate this hardware. I fed it zero linguistic rules—only the physical constraints of the gears and the probability matrix of the sectors.
Result: The synthetic text matches the real MS 408 with a 99.6% Zipf Law correlation.
![[Image: default.png]](https://zenodo.org/api/iiif/record:18684047:Fig1_Zipf_Comparison.png/full/!800,800/0/default.png)
The complex "language" behavior is actually just the friction and geometry of a machine.
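For readers who want to see what the "stochastic_emission" rule in the JSON amounts to, here is a minimal Python sketch of a generator of that general shape. It is not the code uploaded to Zenodo: the function name and the two probability tables are hypothetical inputs, and a token is assumed to be the sampled stem followed by the current sector's suffix.

```python
import random


def generate_tokens(stem_given_sector, sector_transition, start, n, seed=0):
    """Emit n tokens by sampling P(Stem|Sector), then P(next Sector|Sector)."""
    rng = random.Random(seed)
    sector = start
    tokens = []
    for _ in range(n):
        # Sample a stem conditioned on the current sector.
        stems, weights = zip(*stem_given_sector[sector].items())
        stem = rng.choices(stems, weights=weights)[0]
        # Assumption: token = stem + sector suffix.
        tokens.append(stem + sector)
        # Move to the next sector according to the transition table.
        nxt, nweights = zip(*sector_transition[sector].items())
        sector = rng.choices(nxt, weights=nweights)[0]
    return tokens
```

Any first-order model with these two tables can be written this way, with or without a physical device behind it.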
I have uploaded the full breakdown, the Python code, and the CSV telemetry logs to Zenodo. I invite you to audit my code and the "Hard Lock" tables.
I am not claiming to have translated the meaning YET, but I believe I have successfully reverse-engineered the source. I would appreciate your thoughts!!
Regards
Steven Quevedo
|
|
|
| Measuring Long-Range Structure in the Voynich Manuscript |
|
Posted by: quimqu - 18-02-2026, 05:29 PM - Forum: Analysis of the text
- Replies (49)
|
 |
Hello again!
I've taken a break from studying the Voynich for a few months. It's such a complex subject that I think you need to mentally disconnect from it every now and then.
Lately, I've been doing a simple statistical test on the manuscript, which in principle doesn't depend on any linguistic interpretation. The idea is to measure to what extent the identity of a character at position t gives us information about the character at position t+d. In more technical terms, I'm measuring mutual information, which can be calculated for different distances d. If the text has only very local structure, this dependence, however small, should quickly disappear. If there is a deeper structure, some of the dependence should persist even at larger distances.
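The measurement described above can be sketched in a few lines of Python. This is a plain plug-in estimate written for illustration, not the exact code behind the results in this post; the function name is hypothetical. It counts character pairs at distance d and compares their joint frequencies with the product of the marginals.

```python
from collections import Counter
from math import log2


def mutual_information(text, d):
    """Plug-in mutual information (bits) between text[t] and text[t + d]."""
    pairs = Counter(zip(text, text[d:]))
    n = sum(pairs.values())
    if n == 0:
        return 0.0  # text shorter than the requested distance
    left = Counter()
    right = Counter()
    for (a, b), count in pairs.items():
        left[a] += count
        right[b] += count
    # Sum p(a,b) * log2( p(a,b) / (p(a) * p(b)) ) over observed pairs.
    return sum(
        (count / n) * log2(count * n / (left[a] * right[b]))
        for (a, b), count in pairs.items()
    )
```

Evaluating it for d = 1, 2, ..., 100 gives a curve of the kind discussed here. The plug-in estimate is biased upward on finite samples, which is one reason the shuffled controls are needed as a baseline.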
In the case of the Voynich, mutual information is maintained at a certain level even at distances of 50 to 100 characters (very similar to natural languages). When the same text is globally shuffled, the signal collapses. This seems to confirm that the effect depends on the actual order of the characters and not just their frequencies.
I have also tried a control that preserves local patterns but destroys the global order by shuffling entire lines. In this case, the short-range dependence is maintained, but the behavior at longer distances is lost. This suggests that the signal is not limited to regularities within each line.
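The two controls described so far, a global character shuffle and a shuffle of whole lines, might be implemented as below. This is a sketch with hypothetical function names; the fixed seed is only there to make runs repeatable.

```python
import random


def shuffle_globally(text, seed=0):
    """Shuffle all characters: keeps frequencies, destroys all order."""
    chars = list(text)
    random.Random(seed).shuffle(chars)
    return "".join(chars)


def shuffle_lines(lines, seed=0):
    """Reorder whole lines: keeps each line intact, destroys global order."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```

The same measurement is then run on the shuffled output and compared with the curve for the original text.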
To make sure that the result is not just due to the fact that the manuscript has different parts with different letter styles or frequencies, I did a very simple test. I created artificial texts divided into blocks. In each block, the letters appear in the same proportions as in the original text of that part, but they are placed randomly, with no real order.
So the artificial text preserves the slow changes in frequencies between sections, but removes any real structure in the sequence. When I apply the same measurement to these artificial texts, the signal disappears almost completely. This means that the pattern we see in the Voynich cannot be explained simply by the fact that different parts of the manuscript have different letter frequencies. There is more than just variation between sections.
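One simple way to build such a block control is to shuffle characters inside fixed-size blocks, which keeps each block's letter counts exactly while removing the order within it. The sketch below assumes equal-sized blocks and a hypothetical function name; the blocks used for the actual test may have been defined differently (for example by manuscript section).

```python
import random


def shuffle_within_blocks(text, block_size, seed=0):
    """Shuffle characters inside each block, preserving per-block counts."""
    rng = random.Random(seed)
    out = []
    for start in range(0, len(text), block_size):
        block = list(text[start:start + block_size])
        rng.shuffle(block)
        out.append("".join(block))
    return "".join(out)
```

Because slow frequency drift between blocks survives this shuffle, any signal that disappears under it cannot be explained by that drift alone.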
I also trained simple generative models on the Voynich text itself. A 1st-order Markov model captures local transitions but fails to reproduce the structure over longer distances. Moderate-order character n-gram models reproduce short-range effects, but they do not match the persistence observed in the original text.
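As an illustration of the simplest of these models, a first-order character Markov generator can be written as follows. This is a sketch with hypothetical function names, not the code used for the comparison: each character is sampled using only the previous character's observed successor counts.

```python
import random
from collections import Counter, defaultdict


def train_markov(text):
    """Count, for each character, which characters follow it."""
    table = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        table[a][b] += 1
    return table


def generate(table, start, length, seed=0):
    """Generate text where each character depends only on the previous one."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        successors = table.get(out[-1])
        if not successors:
            break  # no observed successor: stop early
        chars, weights = zip(*successors.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)
```

Text generated this way reproduces the training text's local transitions by construction, so it serves as a baseline for how much dependence at larger distances comes from local structure alone.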
Importantly, the pattern is robust both to removing spaces and to switching from EVA to an alternative transliteration (CUVA). The overall behavior remains qualitatively the same.
For comparison, I have applied the same analysis to several natural language corpora. The Voynich curves fall within the same general range as those of these texts: they do not behave like mixed noise or like sequences generated by simple local models. On a purely statistical level, the Voynich character sequence shows a structured long-range dependence comparable to that of natural texts.
This does not prove that the manuscript encodes a natural language. But I do think it shows that its character sequence behaves like a structured system with persistent long-range dependencies, and not like a mixed or purely local construct.
|
|
|
|