The Voynich Ninja
A One-Page Ledger Method for Generating Voynich-Like Text - Printable Version



A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 16-05-2026

Evidence for Local Copy–Mutation in the Scribe 1 Corpus - note: not peer reviewed.


Github Repository

Related paper

Beyond Currier A and B: ED-Defined Folio Regimes and Lexical Continuity in the Voynich Manuscript - note: not peer reviewed.

That paper relates to my earlier posts, "The oddities of the bigram ED".


So, there you have my work to date. My helmet is on, prepared to duck.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 17-05-2026

Ok, so the lack of comments so far tells me that my first post was either way too heavy on research-paper mode and too light on explanation, or I have stunned everyone into silence with my brilliance. I strongly suspect it’s the former.

So here’s the simpler version.

I think I’ve found strong evidence for a copy/mutate system inside the Voynich. This overlaps with a lot of Torsten Timm’s work, but I think it goes further. Most copy/mutate theories look only at nearby previous pages. There is definitely evidence for that. But in a real production environment, that would mean the scribe constantly flipping pages around to use as references. What I’m seeing instead is evidence that the copying and mutation operated at the SHEET level. In other words, the scribe could have had one or two sheets propped up nearby and repeatedly pulled words from them while writing a new page. The current page itself then becomes an additional local source. So the workflow becomes:
  • copy a word from a source sheet
  • slightly modify it
  • copy it again
  • mutate it again
  • reuse words already written on the current page
  • repeat hundreds of times
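The loop above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the code from the repo: the function names, the `reuse_prob` and `mutate_prob` parameters, the toy glyph alphabet, and the accept-everything default for `is_legal` are all hypothetical. The three source words are the f1r words that appear in the analyzer output further down.

```python
import random

GLYPHS = "acdehiklnoqrsty"  # assumed toy alphabet, not the full EVA set

def mutate(word, rng):
    """Apply at most one single-glyph edit: substitute, insert, or delete."""
    op = rng.choice(["sub", "ins", "del"])
    i = rng.randrange(len(word))
    g = rng.choice(GLYPHS)
    if op == "sub":
        return word[:i] + g + word[i + 1:]
    if op == "ins":
        return word[:i] + g + word[i:]
    # Never delete the last remaining glyph.
    return word[:i] + word[i + 1:] if len(word) > 1 else word

def write_page(source_words, n_words, reuse_prob=0.5, mutate_prob=0.5,
               is_legal=lambda w: True, seed=0):
    """Fill a page by copying from the source sheet or from the page itself."""
    rng = random.Random(seed)
    page = []
    while len(page) < n_words:
        # The page being written becomes an additional local source.
        pool = page if page and rng.random() < reuse_prob else source_words
        parent = rng.choice(pool)
        child = mutate(parent, rng) if rng.random() < mutate_prob else parent
        if is_legal(child):  # hypothetical hook for a ledger check
            page.append(child)
    return page

page = write_page(["sholdy", "shaiin", "cthoary"], n_words=10)
print(" ".join(page))
```

Passing a real validator as `is_legal` turns this into the constrained version; note that a validator that rejects everything would make the loop run forever.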

The interesting part is what happens when you analyze the results.
Here’s f20v from the Zandbergen/Landini transcription.

FOLIO: f20v
QUIRE(S): 3 (82)
SHEET(S): 4 (82)
SCRIBE(S): 1 (82)
HAND(S): A1 (82)
CURRIER LANGUAGE(S): A (82)
TOTAL WORD INSTANCES LEN>=3: 75
UNIQUE TOKENS LEN>=3: 60
CORE TOKENS TESTED: 24
SAME-FOLIO ED1 DERIVED TOKENS: 51
PREEXISTING SOURCE CANDIDATES SEARCHED: 2153
ALL PRIOR-FOLIO MATCHES FOUND: 251

SOURCE-SHEET COVER
CORE TOKENS WITH AT LEAST ONE PRIOR-FOLIO MATCH: 24
EXACT SMALLEST SHEET COUNT FOUND: 4
COVER METHOD: exact full cover within max_sheets=6
SELECTED SHEETS:
  quire 1, sheet 1: covers 17 core tokens; adds 17
  quire 2, sheet 2: covers 17 core tokens; adds 4
  quire 1, sheet 4: covers 15 core tokens; adds 2
  quire 1, sheet 3: covers 14 core tokens; adds 1
COVERED CORE TOKENS / COVERABLE CORE TOKENS: 24 / 24
COVERED CORE TOKENS / TOTAL CORE TOKENS: 24 / 24
UNCOVERED COVERABLE CORE TOKENS: 0

RETAINED SOURCE SHEETS AFTER CORE PRUNING
  q1s1
  q2s2

SHEET CLASSIFICATION
  q1s1: +17  [CORE]
  q2s2: +4  [SECONDARY]
  q1s4: +2  [RESIDUE]
  q1s3: +1  [RESIDUE]

RESIDUE TOKENS (NEWLY ADDED BY RESIDUE SHEETS)
  choldy | stripped choldy | f20v:1:8 -> cpholdy | stripped choldy | f4r:8:5 | q1s4 | ED0
  shain | stripped shain | f20v:8:1 -> shain | stripped shain | f4r:3:7 | q1s4 | ED0
  choraly | stripped choraly | f20v:8:2 -> chodaly | stripped chodaly | f3v:6:4 | q1s3 | ED1

RESIDUE CORE-RECHECK
  RESOLVED ED1: choldy | stripped choldy | f20v:1:8 -> sholdy | stripped sholdy | f1r:1:9 | q1s1
  RESOLVED ED1: shain | stripped shain | f20v:8:1 -> shaiin | stripped shaiin | f1r:22:2 | q1s1
  RESOLVED ED2: choraly | stripped choraly | f20v:8:2 -> cthoary | stripped choary | f1r:3:6 | q1s1

RESIDUE SUMMARY
  total: 3
  ED0: 0
  ED1: 2
  ED2: 1
  unresolved: 0

UNRESOLVED RESIDUE DIAGNOSTICS
  none


The page contains:
  • 75 word instances of length 3 or more
  • 60 unique words
  • 51 same-page ED1 derivations (ED = edit distance, so ED1 means one glyph changed, added, or removed)
The analyzer first removes the obvious same-page ED1 mutations to isolate the “core” vocabulary of the page.
It then searches the earlier manuscript for possible source matches and tries reducing those matches down into the smallest possible set of source sheets.
Result:
  • 17 core tokens trace back to quire 1, sheet 1
  • a few others come from small secondary sheets
  • and even the leftover “residue” words eventually collapse back to q1s1 through ED1 or ED2
So despite the page looking diverse on the surface, most of the vocabulary ecology reduces back into a very small packet of source material centered around q1s1/f1r.
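The two analyzer steps described above (strip same-page ED1 derivations, then find the smallest set of source sheets) might look roughly like this. It is a sketch under assumptions: `core_tokens` uses a simple "not within ED1 of an earlier kept word" rule, and `smallest_sheet_cover` does a brute-force search up to `max_sheets`; the real analyzer may define both differently. The toy words are taken from the f20v output above, in their unstripped forms.

```python
from itertools import combinations

def edit_distance(a, b):
    """Levenshtein distance: insert, delete, and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def core_tokens(page_words):
    """Keep a word only if no earlier kept word is within ED1 of it (assumed rule)."""
    core = []
    for w in page_words:
        if all(edit_distance(w, c) > 1 for c in core):
            core.append(w)
    return core

def smallest_sheet_cover(core, sheet_words, max_ed=1, max_sheets=6):
    """Return the smallest sheet set that matches every coverable core token."""
    covers = {
        sheet: {t for t in core if any(edit_distance(t, w) <= max_ed for w in words)}
        for sheet, words in sheet_words.items()
    }
    coverable = set().union(*covers.values())
    if not coverable:
        return [], coverable
    for k in range(1, max_sheets + 1):
        for combo in combinations(sorted(covers), k):
            if set().union(*(covers[s] for s in combo)) == coverable:
                return list(combo), coverable
    return None, coverable  # no exact cover within max_sheets

page = ["choldy", "choldy", "shain", "shaiin", "choraly"]
sheets = {
    "q1s1": ["sholdy", "shaiin", "cthoary"],
    "q1s4": ["cpholdy", "shain"],
    "q1s3": ["chodaly"],
}
core = core_tokens(page)
chosen, coverable = smallest_sheet_cover(core, sheets)
print(core, chosen)
```

Brute force is fine for a handful of sheets; with dozens of candidate sheets a greedy fallback (as the "capped greedy result" line in the analyzer output suggests) would be the practical choice.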


The manuscript may not have been generated from a giant hidden plaintext or complex cipher system at all. It may instead have been built recursively from a small rolling ecology of existing words, copied and slightly mutated over time while constrained by a simple glyph-adjacency ledger.

And that brings up part 2: the ledger.

It is basically a Voynich word validator.
Very simplified, it has 4 columns:
  • the Voynich glyph
  • allowed prefix followers
  • allowed midfix followers
  • allowed suffix followers
The ledger is built by looking at all the words on Scribe 1 pages and recording where glyphs are allowed to occur and what tends to follow them.
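As a rough illustration of how such a ledger could be built, the sketch below counts followers by position. It assumes each transcription character is one glyph, that the first transition in a word is the "prefix" slot, the last is the "suffix" slot, and everything in between is "midfix". The `build_ledger` name and the dictionary layout are hypothetical, not the format of the JSON in the repo.

```python
from collections import Counter, defaultdict

def build_ledger(words):
    """Count which glyph follows which, split by position in the word."""
    ledger = defaultdict(
        lambda: {"prefix": Counter(), "midfix": Counter(), "suffix": Counter()}
    )
    for w in words:
        last = len(w) - 2  # index of the final transition in the word
        for i in range(len(w) - 1):
            if i == 0:
                slot = "prefix"   # what may follow the opening glyph
            elif i == last:
                slot = "suffix"   # what may end the word
            else:
                slot = "midfix"   # everything in between
            ledger[w[i]][slot][w[i + 1]] += 1
    return ledger

ledger = build_ledger(["fachys", "shaiin", "sholdy"])
print(dict(ledger["a"]["midfix"]))
print(dict(ledger["y"]["suffix"]))
```

The counts double as the weights: a follower seen often gets a high count, a rare one a low count.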

   

So if you wanted to create or validate a word:
Start with F.
The ledger says A is a valid prefix follower.
Then A allows C as a midfix.
C allows H.
H allows Y.
Y allows S as a suffix.
F → a → c → h → y → s
You just created a legal Voynich-style word.

For a mutation, you follow the ledger the same way. Both
F → a → c → h → y → a → s
and
F → a → c → h → a → s
are valid words.
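The walk above can be written as a small validator. This is a sketch with a toy ledger: the entries for F, A, C, and the Y suffixes follow the simplified table discussed in this thread, while the midfix entries for H and Y are inferred from the two mutation examples and should be treated as assumptions. `is_legal` and `LEDGER` are hypothetical names.

```python
# Toy ledger: glyph -> allowed followers per slot (partly assumed, see above).
LEDGER = {
    "f": {"prefix": {"a"}, "midfix": set(), "suffix": set()},
    "a": {"prefix": {"r", "t"}, "midfix": {"c", "i", "l", "n", "r"}, "suffix": {"s", "t"}},
    "c": {"prefix": set(), "midfix": {"h", "k", "t"}, "suffix": set()},
    "h": {"prefix": set(), "midfix": {"a", "y"}, "suffix": set()},
    "y": {"prefix": set(), "midfix": {"a"},
          "suffix": {"d", "e", "h", "k", "l", "o", "r", "s", "t"}},
}

def is_legal(word, ledger=LEDGER):
    """Walk the word glyph by glyph and check each step against the ledger."""
    if len(word) < 2:
        return False
    last = len(word) - 2  # index of the final transition
    for i in range(len(word) - 1):
        slot = "prefix" if i == 0 else "suffix" if i == last else "midfix"
        allowed = ledger.get(word[i], {}).get(slot, set())
        if word[i + 1] not in allowed:
            return False
    return True

for w in ["fachys", "fachyas", "fachas", "fachvas"]:
    print(w, is_legal(w))
```

A word passing this check is only "legal" in the ledger's sense; it says nothing about whether the word actually occurs in the manuscript.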

The real ledger is more complicated because each follower also has a weight attached to it. Some transitions are common, some are rare. That is how my generator validates mutations when it creates them: copy a word from a source sheet, mutate one glyph, check whether the result is still legal according to the ledger, and if it is, the new word survives. There is obviously more going on than this, but that is the basic mechanism. From what I’m seeing so far, most Scribe 1 pages appear reducible to copy/mutate behavior from a single dominant source sheet, occasionally two, and only rarely three.
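One way the weights could enter the survival check is to score a word by the relative frequency of each transition and reject anything unseen or too improbable. The post does not say how the generator actually uses the weights, so everything here is an assumption: the weight values are made up for illustration, and `log_score`, `survives`, and the `min_score` threshold are hypothetical.

```python
import math

# Made-up follower weights keyed by (glyph, slot); not taken from the real ledger.
WEIGHTS = {
    ("f", "prefix"): {"a": 5},
    ("a", "midfix"): {"c": 3, "i": 9, "l": 4},
    ("c", "midfix"): {"h": 20, "k": 2},
    ("h", "midfix"): {"y": 6, "a": 1},
    ("y", "midfix"): {"a": 1},
    ("y", "suffix"): {"s": 4, "d": 2},
    ("a", "suffix"): {"s": 2, "t": 1},
}

def log_score(word, weights=WEIGHTS):
    """Sum of log relative frequencies over transitions; None if any is unseen."""
    if len(word) < 2:
        return None
    total = 0.0
    last = len(word) - 2  # index of the final transition
    for i in range(len(word) - 1):
        slot = "prefix" if i == 0 else "suffix" if i == last else "midfix"
        followers = weights.get((word[i], slot), {})
        count = followers.get(word[i + 1], 0)
        if count == 0:
            return None  # transition never observed: illegal
        total += math.log(count / sum(followers.values()))
    return total

def survives(word, min_score=-6.0):
    """A mutated word survives if it is legal and not too improbable (assumed rule)."""
    score = log_score(word)
    return score is not None and score >= min_score

for w in ["fachys", "fachas", "fachvas"]:
    print(w, log_score(w), survives(w))
```

With these toy numbers the rarer H → A step makes "fachas" score lower than "fachys", while "fachvas" is rejected outright because H → V has no weight at all.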

So that’s the theory in a nutshell:

Copy/mutate using sheets as the source while constrained by a glyph-adjacency ledger.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 17-05-2026

I can create this post by taking the right words and word parts from your post, but this doesn't mean this is what I did. Does your research show evidence that the Voynich MS can be created using copy and mutate or does it show evidence that the manuscript must have been created this way?


RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 17-05-2026

Maybe you can show on a simple example. There is this word ddssShx on the Rosettes folio. Which other words is it based on and which words are copy and mutate derivations of it?


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 17-05-2026

(17-05-2026, 05:10 PM)oshfdk Wrote: I can create this post by taking the right words and word parts from your post, but this doesn't mean this is what I did. Does your research show evidence that the Voynich MS can be created using copy and mutate or does it show evidence that the manuscript must have been created this way?

Not that it MUST have been created this way. No. I’m making no claims that this definitively explains the Voynich. What I’m saying is that the manuscript can be reverse engineered into this kind of process, and the results are strong enough that I think it deserves serious consideration as a production method.

I’m not just visually imitating Voynich words. I built an analyzer first, then used the behavior it discovered to build a generator. That generator now reproduces a fair number of Scribe 1 statistical and structural behaviors. Not all of them by any means, but enough that the model is at least plausible. And honestly, if this IS close to the real method, I suspect reproducing the exact Voynich would be impossible anyway. Once you start trying to model human production behavior in Python, you quickly discover how messy humans actually are.

Also, this currently works best for Scribe 1. Scribe 2 is much harder.
The same general approach still partially works on Scribe 2, but Scribe 2 does not collapse neatly into the same compact 1–2 sheet packet structure that Scribe 1 does. That tells me one of two things:
  • either Scribe 2 was produced differently,
  • or the same basic process evolved/drifted into a more complex regime.

I suspect the latter, but I’m not prepared to make a strong claim on Scribe 2 yet because the reduction behavior is nowhere near as clean as Scribe 1.

But I will say this. I believe the Voynich has LIKELY managed to remain a mystery because a LOT of research has focused on specific sections and not specific scribes. If you look at Currier and Davis and my work on the bigram ED, they all say that the scribes had very different vocabularies and likely different methods. If you mash all that together (Herbal in particular), you're shoving 2-5 methods of production together and trying to extract 1 set of results.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Mauro - 17-05-2026

(17-05-2026, 04:27 PM)Dunsel Wrote: And that brings up part 2: the ledger.

It is basically a Voynich word validator.
Very simplified, it has 4 columns:
  • the Voynich glyph
  • allowed prefix followers
  • allowed midfix followers
  • allowed suffix followers
The ledger is built by looking at all the words on Scribe 1 pages and recording where glyphs are allowed to occur and what tends to follow them.

I don't follow you. What do you mean by 'prefix follower', 'midfix follower', 'suffix follower'? I.e., using the 'f-ledger table' shown in your previous post: if I have 'fa', what can follow the 'a'? 'r,t' (prefix followers), 'c,i,l,r,n' (midfix followers), or 's,t' (suffix followers)? And why?

(17-05-2026, 04:27 PM)Dunsel Wrote: So if you wanted to create or validate a word:
Start with F.
The ledger says A is a valid prefix follower.
Then A allows C as a midfix.
C allows H.
H allows Y.
Y allows S as a suffix.
F → a → c → h → y → s
You just created a legal Voynich-style word.

Still using your ledger, and the same columns I think you used, what forbids me from creating/validating 'fandao', which does not look much like a Voynich-style word?


(17-05-2026, 04:27 PM)Dunsel Wrote: For a mutation, you follow the ledger and
F → a → c → h → y → a → s
and
F → a → c → h → a → s
is a valid word.

Yet again I don't understand. If you find 'fachyas', why does it matter that 'fachas', without the 'y', is a valid word? And are 'fachyas'/'fachas' really two valid Voynichese words? There's only one word in the full text that starts with 'fach': fachys.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 17-05-2026

(17-05-2026, 05:17 PM)oshfdk Wrote: Maybe you can show on a simple example. There is this word ddssShx on the Rosettes folio. Which other words is it based on and which words are copy and mutate derivations of it?

That specific word, no. The reason is that the transcriptions I used identify it as a splat. I specifically strip any splat word out of my working corpus because there are at least as many ways to define splats as there are splats. So. That word. No. And that is Scribe 2. I have focused on Scribe 1 and have yet to figure out how Scribe 2 switched regimes and created what they did. However, if you go to my repo, the analyzer will tell you where it THINKS those sources are. Run it with core assignments and cluster details on.

f86v6

Code:
FOLIO: f86v6
QUIRE(S): 14 (482)
SHEET(S): 1 (482)
SCRIBE(S): 3 (482)
HAND(S): B3 (482)
CURRIER LANGUAGE(S): unknown
TOTAL WORD INSTANCES LEN>=3: 422
UNIQUE TOKENS LEN>=3: 266
CORE TOKENS TESTED: 84
SAME-FOLIO ED1 DERIVED TOKENS: 338
PREEXISTING SOURCE CANDIDATES SEARCHED: 13916
ALL PRIOR-FOLIO MATCHES FOUND: 1404

SOURCE CLUSTER SUMMARY
  matched source sheets: 40
    quire 14, sheet 1: 95 matches
    quire 9, sheet 1: 78 matches
    quire 13, sheet 5: 78 matches
    quire 11, sheet 1: 77 matches
    quire 13, sheet 2: 68 matches
    quire 13, sheet 1: 63 matches
    quire 13, sheet 3: 60 matches
    quire 10, sheet 1: 59 matches
    quire 13, sheet 4: 53 matches
    quire 6, sheet 3: 48 matches
  matched source folios: 168
    f76r (quire 13, sheet 2): 28 matches
    f58r (quire 8, sheet 2): 25 matches
    f85r1 (quire 14, sheet 1): 24 matches
    f66r (quire 8, sheet 1): 21 matches
    f79r (quire 13, sheet 5): 21 matches
    f80v (quire 13, sheet 5): 20 matches
    f84r (quire 13, sheet 1): 20 matches
    f86v5 (quire 14, sheet 1): 20 matches
    f58v (quire 8, sheet 2): 19 matches
    f79v (quire 13, sheet 5): 19 matches
    f82v (quire 13, sheet 3): 19 matches
    f70r2 (quire 10, sheet 1): 18 matches
    f80r (quire 13, sheet 5): 18 matches
    f85r2 (quire 14, sheet 1): 18 matches
    f76v (quire 13, sheet 2): 17 matches

SOURCE-SHEET COVER
CORE TOKENS WITH AT LEAST ONE PRIOR-FOLIO MATCH: 82
BEST-EFFORT SHEET SET FOUND: 3
COVER METHOD: no exact full cover within max_sheets=6; capped greedy result shown
SELECTED SHEETS:
  quire 14, sheet 1: covers 40 core tokens; adds 40
  quire 8, sheet 2: covers 33 core tokens; adds 13
  quire 13, sheet 2: covers 36 core tokens; adds 8
COVERED CORE TOKENS / COVERABLE CORE TOKENS: 61 / 82
COVERED CORE TOKENS / TOTAL CORE TOKENS: 61 / 84
UNCOVERED COVERABLE CORE TOKENS: 21
  alshdr, chcphar, chokain, cholkar, dairal, dairody, dytshy, lshechy, okeeeykeey, okeockhey, olkshed, opoly, orchcthy, orom, otolkshy, otyteeodaiin, qotardam, qotchdaiin, sarar, shckhor, shoifhy

RETAINED SOURCE SHEETS AFTER CORE PRUNING
  q14s1
  q8s2
  q13s2

SHEET CLASSIFICATION
  q14s1: +40  [CORE]
  q8s2: +13  [CORE]
  q13s2: +8  [SECONDARY]

RESIDUE TOKENS (NEWLY ADDED BY RESIDUE SHEETS)
  none

RESIDUE CORE-RECHECK
  none

RESIDUE SUMMARY
  total: 0
  ED0: 0
  ED1: 0
  ED2: 0
  unresolved: 0

UNRESOLVED RESIDUE DIAGNOSTICS
  none

PARENT DISTANCE HISTOGRAM (ASSIGNED PARENTS ONLY)
  dist  3:    5 (  8.9%)
  dist  6:    1 (  1.8%)
  dist  7:    3 (  5.4%)
  dist  20:    3 (  5.4%)
  dist  21:    11 ( 19.6%)
  dist  42:    1 (  1.8%)
  dist  56:    7 ( 12.5%)
  dist  57:    25 ( 44.6%)



RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 17-05-2026

(17-05-2026, 05:40 PM)Mauro Wrote: I don't follow you. What do you mean by 'prefix follower', 'midfix follower', 'suffix follower'? I.e., using the 'f-ledger table' shown in your previous post: if I have 'fa', what can follow the 'a'? 'r,t' (prefix followers), 'c,i,l,r,n' (midfix followers), or 's,t' (suffix followers)? And why?

Still using your ledger, and the same columns I think you used, what forbids me from creating/validating 'fandao', which does not look much like a Voynich-style word?

"Follower" may be a bad word choice, except for the prefix follower. So it's like this.

Your alphabet is in the left column. That's what a word can start with: any letter in the Voynich alphabet. Then the prefix follower: only certain letters can come after that specific first letter. For example, if you start your word with the letter A, the only two options you have for the next letter are R and T. If you start with the letter F, only A can be the next letter; it follows the prefix. Once you have FA, you can look at the midfix column. A can have C, L, I, N, or R as a letter that follows it as a midfix. If you select C, you have FAC. Next, look at which midfix you can add after the letter C: H, K, or T. Keep adding midfixes until you're ready to end the word. When you are ready, you'll see that the letter Y has D, E, H, K, L, O, R, S, and T as possible suffixes.

So yes, 
Fachys
Fachyd
Fachye
Fachyh... etc... are all valid words.

The ledger I showed a picture of is not complete. It was designed to be short and simple. Whether you could create the word fandao depends on whether you could follow those letters in the ledger. If you could, it would be a 'legal' word. Does that mean it's a Voynich word? No. Not one you're familiar with. The purpose of the ledger is to validate mutations against known letter combinations the Voynich actually uses.

(17-05-2026, 05:40 PM)Mauro Wrote: Yet again I don't understand. If you find 'fachyas', why does it matter that 'fachas', without the 'y', is a valid word? And are 'fachyas'/'fachas' really two valid Voynichese words? There's only one word in the full text that starts with 'fach': fachys.

The ledger is merely saying that if you want to mutate the word fachyas into fachas, then there are examples in the Voynich where Y follows the letter H and, if you remove the Y, there are examples in the Voynich where A follows the letter H. If you tried to create Fachvas, the ledger would not allow that, which means that nowhere in the Voynich does V follow the letter H. So, when MUTATING words, or creating random new ones, it constrains the choices to what it knows the Voynich does, and doesn't allow really weird words to appear.

And the ledger will work for any language. I could load it up with emojis. It's constraining the choices made when deciding which letter CAN come after a specific letter. That does not mean it can reproduce a language.

I have a fully built Scribe 1 ledger as a JSON file on the repo. Have a look.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 17-05-2026

(17-05-2026, 05:52 PM)Dunsel Wrote: Can you give me the EVA for that word. Searching for ddss shows nothing in either Takahashi or ZL.

In ZL it's:

<fRos.14,@L0>    <!2:11>[d:?]dsschx

However looking at it I'd say it's more like ddssShx. In the copy+mutate theory how did this word come to be?

   


RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 17-05-2026

(17-05-2026, 05:52 PM)Dunsel Wrote: Reason being is in the transcriptions I used it's identified as a splat. I specifically strip any splat word out of my working corpus because there are at least as many ways to define splats as there are splats.

What is the definition of a splat here?