The Voynich Ninja

Full Version: Brainstorming: could VMS words have an underlying 3-part structure?
(07-12-2025, 09:35 AM)ThomasCoon Wrote: Hello again all!

I don't have a solution to the VMS and probably won't be offering anything of value here, but I wonder if words might rely on a 3-part structure, and was curious if anyone else has looked into the idea?

We know that some letter-clusters are almost always word-initial, like qok. We also know that some are almost always word-final, like aiin or dy. And others like ee very often appear in the middle.

Could all this point towards a three-slot structure, with different clusters assigned to each slot?
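
One way to make the three-slot question testable is sketched below. The slot inventories are illustrative assumptions, seeded with the clusters named above (qok, ee, aiin, dy) plus a few placeholders; the names PREFIXES, MIDDLES, SUFFIXES, split_slots and coverage are hypothetical. A real test would need inventories derived from a full transcription.

```python
import re

# Hypothetical slot inventories, seeded with the clusters named in the post.
# These are placeholders for illustration, not an established grammar.
PREFIXES = ["qok", "ok", "ch", "d"]
MIDDLES = ["ee", "e"]
SUFFIXES = ["aiin", "dy", "y"]


def build_slot_pattern(prefixes, middles, suffixes):
    """Compile a regex with one optional named group per slot."""
    def alt(items):
        # Try longer clusters first so that 'qok' is preferred over 'ok'.
        ordered = sorted(items, key=len, reverse=True)
        return "|".join(re.escape(s) for s in ordered)

    return re.compile(
        rf"^(?P<prefix>{alt(prefixes)})?"
        rf"(?P<middle>{alt(middles)})?"
        rf"(?P<suffix>{alt(suffixes)})?$"
    )


SLOT_PATTERN = build_slot_pattern(PREFIXES, MIDDLES, SUFFIXES)


def split_slots(word):
    """Return (prefix, middle, suffix), or None if the word does not fit."""
    match = SLOT_PATTERN.match(word)
    if not word or match is None:
        return None
    return tuple(match.group(name) or "" for name in ("prefix", "middle", "suffix"))


def coverage(words):
    """Share of words that the three-slot pattern can segment."""
    if not words:
        return 0.0
    return sum(split_slots(w) is not None for w in words) / len(words)


if __name__ == "__main__":
    for w in ["qokeedy", "qokedy", "okedy", "daiin"]:
        print(w, split_slots(w))
```

If coverage on a real word list stayed high with small inventories, that would support the slot idea; if the inventories had to grow to match nearly every word individually, it would not.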

The Network Perspective on Voynich Word Structure

Your observation about a three-part word structure aligns with documented positional constraints (prefixes like <qok->, middles like <ee>, and endings like <aiin> or <dy>). However, there's a complementary perspective that reveals deeper organizational principles: viewing Voynich words as nodes in a highly interconnected similarity network.

Rather than words being constructed from strict positional slots, the Voynich manuscript exhibits a continuous web of similarity relationships with remarkable properties:

Exceptional Connectivity: Network analysis shows that 84.67% of all Voynich words (6,796 out of 8,026) connect through single-glyph differences, what I call an "edit distance of 1." The longest path through this network spans 21 steps, demonstrating how thoroughly interconnected the vocabulary is.
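
A minimal sketch of how such a network could be built from a word list is shown here. It assumes each transcription character is one glyph (which is not true of every transcription alphabet) and uses a toy vocabulary; the function names are hypothetical. It reports both the share of words with at least one neighbour and the connected components, since the figure quoted above could be read either way.

```python
from collections import defaultdict, deque


def build_network(vocabulary):
    """Link word types that differ by one substitution, insertion, or deletion."""
    words = set(vocabulary)
    buckets = defaultdict(set)
    for w in words:
        # Key for "w with glyph i masked": shared by words that differ
        # only at position i (a substitution).
        for i in range(len(w)):
            buckets[(w[:i], w[i + 1:])].add(w)
        # Key for "w with a gap opened at i": matches the masked key of
        # any word that is w plus one inserted glyph.
        for i in range(len(w) + 1):
            buckets[(w[:i], w[i:])].add(w)

    neighbours = {w: set() for w in words}
    for group in buckets.values():
        for w in group:
            neighbours[w] |= group - {w}
    return neighbours


def connected_share(neighbours):
    """Fraction of words with at least one edit-distance-1 neighbour."""
    return sum(1 for n in neighbours.values() if n) / len(neighbours)


def components(neighbours):
    """Connected components, largest first, found by breadth-first search."""
    seen, result = set(), []
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), {start}
        while queue:
            for nxt in neighbours[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    comp.add(nxt)
                    queue.append(nxt)
        result.append(comp)
    return sorted(result, key=len, reverse=True)


if __name__ == "__main__":
    toy = ["daiin", "dain", "qokeedy", "qokedy", "okedy", "xyz"]  # toy data
    net = build_network(toy)
    print(connected_share(net), [len(c) for c in components(net)])
```

The bucket keys avoid comparing every pair of words, which matters for a vocabulary of several thousand types.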

Spatial Clustering: Words that are structurally similar tend to appear close together in the manuscript. As Figure 4 in our paper demonstrates, the average string distance between tokens increases with line distance—meaning similar words cluster on the same pages and even within nearby lines, creating localized families of related forms.
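
The kind of measurement described here can be sketched as follows. This is an assumption about the general approach, not a reproduction of the method behind Figure 4: it takes a text as a list of lines (each a list of tokens), and averages the Levenshtein distance between token pairs that sit a given number of lines apart. The sample lines are toy data.

```python
from collections import defaultdict


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def mean_distance_by_line_gap(lines, max_gap=3):
    """Average edit distance between tokens that are `gap` lines apart."""
    totals = defaultdict(lambda: [0, 0])  # gap -> [sum of distances, pair count]
    for i, line_a in enumerate(lines):
        for gap in range(max_gap + 1):
            j = i + gap
            if j >= len(lines):
                break
            for x, tok_a in enumerate(line_a):
                for y, tok_b in enumerate(lines[j]):
                    if gap == 0 and y <= x:
                        continue  # same line: count each pair once, skip self-pairs
                    totals[gap][0] += levenshtein(tok_a, tok_b)
                    totals[gap][1] += 1
    return {gap: s / n for gap, (s, n) in sorted(totals.items()) if n}


if __name__ == "__main__":
    toy_lines = [["daiin", "dain"], ["qokeedy", "qokedy"]]  # toy data
    print(mean_distance_by_line_gap(toy_lines, max_gap=1))
```

On real text, a mean that rises with the line gap would correspond to the clustering described above.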

The network perspective explains several features that a rigid 3-slot framework doesn't fully capture:
  • Frequency-similarity correlation: High-frequency words possess more similar variants. For instance, <daiin> (863 occurrences) has 36 counterparts differing by just one glyph, while isolated words typically appear only once (Figure 3).
  • Page-level word families: On individual folios, the most frequent tokens often differ by single glyphs. For example, on f108r, the top three words are <qokeedy>, <qokedy>, and <okedy>—each appearing 16 times.
  • Language evolution: The gradual shift from "Currier A" to "Currier B" reflects accumulating variants rather than a sudden switch between systems. Table 2 shows how reordering sections by <chedy> frequency reveals smooth evolution rather than binary division.
  • Self-citation mechanism: My paper proposes a generation algorithm (copying nearby words and applying small modifications) that naturally produces this network structure without requiring explicit slot-based rules.
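
The copy-and-modify idea in the last bullet can be illustrated with a toy generator. This is only a sketch under stated assumptions, not the algorithm from the paper: the alphabet GLYPHS, the window size, and the modification probability p_modify are all hypothetical parameters chosen for illustration.

```python
import random

GLYPHS = "acdehiklnoqrsty"  # assumption: a small EVA-like alphabet


def mutate(word, rng):
    """Apply one random single-glyph edit: substitute, insert, or delete."""
    op = rng.choice(["substitute", "insert", "delete"])
    if op == "delete" and len(word) > 1:
        i = rng.randrange(len(word))
        return word[:i] + word[i + 1:]
    if op == "insert":
        i = rng.randrange(len(word) + 1)
        return word[:i] + rng.choice(GLYPHS) + word[i:]
    # Substitution (also the fallback for deleting from a 1-glyph word).
    # It may pick the same glyph again, leaving the word unchanged.
    i = rng.randrange(len(word))
    return word[:i] + rng.choice(GLYPHS) + word[i + 1:]


def generate(seed_words, n_tokens, window=10, p_modify=0.5, rng=None):
    """Grow a text by copying a recent token and sometimes modifying it."""
    rng = rng or random.Random()
    text = list(seed_words)
    while len(text) < n_tokens:
        source = rng.choice(text[-window:])  # copy a nearby word
        if rng.random() < p_modify:
            text.append(mutate(source, rng))
        else:
            text.append(source)
    return text


if __name__ == "__main__":
    print(" ".join(generate(["daiin"], 30, rng=random.Random(0))))
```

Every token produced this way is at most one edit away from a token in the preceding window, so local families of near-identical words arise by construction, without any slot rules being specified.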

Positional constraints certainly exist (certain glyphs strongly prefer word-initial or word-final positions), but the network view reveals the generative process: variants spawn from variants through incremental modifications, creating a similarity web where words are related neighbors rather than independent slot combinations. The network approach describes Voynich words less as "slot-filling" and more as organic growth through variation—each new word emerges from existing forms through small transformations, maintaining family resemblances while exploring the allowable glyph-sequence space.

A Gephi project for the complete Voynich network is available; the link is not shown in this stripped-down view.
(07-12-2025, 10:30 PM)Torsten Wrote: The Network Perspective on Voynich Word Structure

I’m new to this, and not really that great with linguistics either, but I decided to read through the VMS today out of curiosity, just looking for whatever stands out, such as any symbolic relations between what’s depicted and the writing. My assumption was that even if the words themselves were nonsense, there could still be a symbolic link between, say, the plant being shown and the writing surrounding it.

I first noticed that 8a,,n was coming up a ton, along with other short ‘words’ beginning with 8 (apologies, I don’t know how to write the correct glyph forms or notation yet). I kinda assumed they might be something like ‘and’, ‘to’, ‘but’ and other connecting words like that.

But secondly, what really stood out to me was exactly the whole family-based glyph stuff you mentioned! I first noticed it on the sunflower page, where there were all those, I think you call them \n, but I just called them hooks/sickles. It’s probably just a coincidence, but the sunflower’s leaves were sorta hook-shaped like the glyph. I didn’t really notice any other connections except for the way words tended to repeat with a little bit of change, with a word in between them, but I was suspicious of how suddenly the ‘family’ would change and become the most prevalent writing form. And there were a lot of repeats that were roughly in the same column too.

I guess I was thinking that if it was a psychosis/schizophrenia type of thing where they thought they could speak a divine language, then it wouldn’t be impossible for the things being written about to also affect the writing visually in a way, and kinda explain the local fixations on repetition and the families of glyphs. Though I am just now learning it goes a bit deeper than that, since it’s pretty complex.

I got the impression in the zodiac and nymph parts that it was maybe trying to describe the heavenly realm where those nymphs, I guess, live, perhaps as an allegory to Eve and/or the Garden of Eden. If it was, then the plant thing would tie into it, and if those plants were not from Earth but from heaven/the Garden of Eden, then it would shed some light on their artwork being so... otherworldly.

Anyway sorry for the yap! Just happy that I wasn’t going crazy with noticing local patterns of words that would show up and become the dominant family in the writing for a bit. This isn’t really my forte, I’m really just a biochemist with not much of a background in this but feel free to correct me if I understood wrong.
(07-12-2025, 10:30 PM)Torsten Wrote: The Network Perspective on Voynich Word Structure

If that’s true, it means, if I understand correctly, that such a mechanism is structurally incompatible with:
  • a coded Latin text,
  • a phonetic transcription,
  • a natural language,
  • a magical litany,
  • or any text that carries information at the word level.

Am I right?

But the whole thing can’t be a hoax either – that’s just as nonsensical. How incredibly ingenious would a forger have to be to maintain such a system over 200 pages? No one in the 15th century could have planned or controlled such a complex pattern. The effort would be completely disproportionate for a simple fraud – it makes no sense at all.

(In principle, we’d have to postulate something like an “extraterrestrial” text, and honestly, that’s even crazier than anything else – especially since I’d hope extraterrestrials could draw better :D )

There has to be a mistake somewhere, because this is too far removed from any plausible reality of the 15th century. Maybe I'm just misunderstanding something, and there's no question that's entirely possible.