The Voynich Ninja
A One-Page Ledger Method for Generating Voynich-Like Text - Printable Version

+--- Thread: A One-Page Ledger Method for Generating Voynich-Like Text (/thread-5752.html)



RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 22-05-2026

(22-05-2026, 05:07 PM)Dunsel Wrote: No, I meeeeeeannnnnnnnnnnn youu youu youu youu youu need to youu need to make surrreeee it doesn't look stupid.

Does "qokeedy.qokeedy.qokedy.qokedy.qokeedy.ldy" look smart? What about "qokeedy.qotedy.qokeedy.qokeedy.qokeey.s,aiin.al"? Or "pShdy.ofchdy.qokedy.qoteedy.qokedy.qoltedy.qotedy.oky"?


RE: A One-Page Ledger Method for Generating Voynich-Like Text - DG97EEB - 22-05-2026

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo


RE: A One-Page Ledger Method for Generating Voynich-Like Text - ReneZ - 22-05-2026

(22-05-2026, 01:31 PM)Dunsel Wrote: our estimate for the theoretical space pretty accurate. I calculated 203 possible variants.  But your ~10 alternatives is underestimating.

Fair enough. When oversimplifying things, they become oversimplistic...

How many variants will cover for the most frequent 95%?
I guess the top ten won't do it, but it will be relatively close.

(22-05-2026, 01:31 PM)Dunsel Wrote: And they're not evenly distributed.

Very distinctly uneven. This is also important, though it will be hard to say more than that this indicates the existence of rules.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(22-05-2026, 10:40 PM)oshfdk Wrote:
(22-05-2026, 05:07 PM)Dunsel Wrote: No, I meeeeeeannnnnnnnnnnn youu youu youu youu youu need to youu need to make surrreeee it doesn't look stupid.

Does "qokeedy.qokeedy.qokedy.qokedy.qokeedy.ldy" look smart? What about "qokeedy.qotedy.qokeedy.qokeedy.qokeey.s,aiin.al"? Or "pShdy.ofchdy.qokedy.qoteedy.qokedy.qoltedy.qotedy.oky"?

Alcohol has this way of making you forget the rules, doesn't it?  Having a Moosehead and some Jager so I speak from experience.

I incorporated don't-look-stupid rules based on Scribe 1.  I didn't say that all the Voynich scribes subscribed to the same rules.  You copy and paste 30,000 words and you might get a tad thirsty.  Plus, those are <ed>, so that has to be Scribe 2+, and they did not play by Scribe 1 rules.  Scribe 1 seems to have been much more alcohol tolerant.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(22-05-2026, 10:47 PM)DG97EEB Wrote: Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

You cheated Ed. You used capitals!  That's a bunch of *cough* bull!


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(22-05-2026, 11:56 PM)ReneZ Wrote:
(22-05-2026, 01:31 PM)Dunsel Wrote: our estimate for the theoretical space pretty accurate. I calculated 203 possible variants.  But your ~10 alternatives is underestimating.

Fair enough. When oversimplifying things, they become oversimplistic...

How many variants will cover for the most frequent 95%?
I guess the top ten won't do it, but it will be relatively close.

(22-05-2026, 01:31 PM)Dunsel Wrote: And they're not evenly distributed.

Very distinctly uneven. This is also important, though it will be hard to say more than that this indicates the existence of rules.

For just chedy and its variants?

Transcription        ED1 variants   Total ED1-neighbor tokens   Variants needed for 95%
Takahashi            54             1835                        19
Zandbergen/Landini   53             1884                        19

Top 10 variants of chedy coverage.

Transcription        Top 10 coverage
Takahashi            83.8%
Zandbergen/Landini   83.9%
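
For anyone who wants to reproduce figures like these, here is a minimal sketch of how the two coverage numbers could be computed. It is not the script used above. The names `is_ed1` and `coverage` are made up for illustration, the counts are toy values rather than transcription data, and it assumes the target word itself is excluded from its own neighbor list.

```python
from collections import Counter

def is_ed1(a: str, b: str) -> bool:
    """True if a and b differ by exactly one insertion, deletion, or substitution."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # exactly one substitution
    return a[i:] == b[i + 1:]          # exactly one extra character in b

def coverage(counts, target, top_n=10, threshold=0.95):
    """Return (n_variants, neighbor_tokens, variants_needed, top_n_share)."""
    # Token counts of every ED1 neighbor of target, most frequent first.
    variants = sorted(
        (c for w, c in counts.items() if is_ed1(w, target)), reverse=True
    )
    total = sum(variants)
    if total == 0:
        return 0, 0, 0, 0.0
    running, needed = 0, 0
    for needed, c in enumerate(variants, start=1):
        running += c
        if running >= threshold * total:
            break  # smallest number of variants reaching the threshold
    return len(variants), total, needed, sum(variants[:top_n]) / total

# Toy counts, invented for illustration only (not taken from any transcription).
toy = Counter({"chedy": 50, "shedy": 40, "chey": 30,
               "cheedy": 20, "chdy": 10, "daiin": 80})
print(coverage(toy, "chedy", top_n=2))
```

Pointing `coverage` at a real word-count table with `top_n=10` would give the "Top 10 coverage" column, and the third return value would give the "Variants needed for 95%" column, under the assumptions stated above.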

After running the test to answer your question, I decided to run a much bigger test on the entire corpus without specifying a family.  I haven't run this test before so this kinda amazes me.

Transcription        Vocabulary   ED1 components   Largest ED1 component   % vocabulary in largest   % tokens in largest
Takahashi            6813         700              6077                    89.2%                     97.8%
Zandbergen/Landini   7604         976              6589                    86.7%                     97.3%

Almost the entire Voynich running text belongs to one enormous ED1 mutation network. The manuscript does not explore all theoretical mutations equally. But the vocabulary remains mutation-connected almost everywhere.
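
A rough sketch of how a whole-corpus ED1 component count like this could be computed, for readers who want to try it on their own word lists. This is an assumption about one workable approach (union-find over deletion and wildcard patterns), not the actual test script. The function names and the toy word list are hypothetical.

```python
from collections import Counter, defaultdict

def ed1_components(counts):
    """Group words into connected components of the ED1 graph."""
    words = list(counts)
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_with_pattern = {}
    for w in words:
        for i in range(len(w)):
            # Deletion/insertion: w minus one character is itself a word.
            shorter = w[:i] + w[i + 1:]
            if shorter in parent:
                union(w, shorter)
            # Substitution: two words share a pattern with one wildcard slot.
            pattern = w[:i] + "\0" + w[i + 1:]
            if pattern in first_with_pattern:
                union(w, first_with_pattern[pattern])
            else:
                first_with_pattern[pattern] = w

    groups = defaultdict(list)
    for w in words:
        groups[find(w)].append(w)
    return sorted(groups.values(), key=len, reverse=True)

def summarize(counts):
    """Compute the same five quantities as the table columns."""
    comps = ed1_components(counts)
    largest = comps[0]
    return {
        "vocabulary": len(counts),
        "components": len(comps),
        "largest": len(largest),
        "pct_vocab": 100 * len(largest) / len(counts),
        "pct_tokens": 100 * sum(counts[w] for w in largest) / sum(counts.values()),
    }

# Toy word list, invented for illustration only.
toy = Counter("chedy shedy chey chedy daiin dain qokeedy qokedy otolpchy".split())
print(summarize(toy))
```

The pattern trick avoids comparing every pair of words, which matters once the vocabulary reaches several thousand forms.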


RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 23-05-2026

(23-05-2026, 01:49 AM)Dunsel Wrote: Scribe 1 seems to have been much more alcohol tolerant.

I'm not sure hand 1 is much better at not looking stupid:

<f15v.5,+P0>      otchor.chor.chor.ytchor.cthy.s
<f42r.13,+P0>    qopor.shol.shot.shol.shol.daiin.dain.s.<->cheam
<f42r.21,+P0>    shol.chol.chol.shol.{ct}oiin.{c'o}s.odan
<f44r.9,+P0>      otchol.ol,dchckhy.qoky.qotchy.qokchy.qokyd
<f47r.7,+P0>      schesy.kchor.cthaiin.chol.chol.chol.chor.{ck@191;h}ey
<f54r.10,+P0>    tor.ol.dol.or.chol.chol.ckhol.okol.oky.<->ytchor.ol,koldy


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(22-05-2026, 11:56 PM)ReneZ Wrote: How many variants will cover for the most frequent 95%?

And I had Codex put together a Python file, based on my previous test, that presents all of this visually.

Here's the chedy network of ED1.

   

Here's the daiin ED1 network.

   

And here's the big network.  I believe the group off to the right is a chckhy group that has only a weak link to the big network; I'm not sure yet what the isolated one on the left is.

   

And here's the report the file produced.  I ran it using Zandbergen/Landini.

Total vocabulary: 7582
Total tokens: 36105
Number of ED1 components: 976
Largest component size: 6567
Percent vocabulary in largest component: 86.61%
Percent tokens in largest component: 97.16%
Top 20 largest components by forms and token coverage:
    1:  6567 forms ( 86.61%),  35080 tokens ( 97.16%), top: daiin, ol, chedy, aiin, shedy, chol, ar, or, chey, dar
    2:    8 forms (  0.11%),      8 tokens (  0.02%), top: ckheckhy, ckhockhy, ckhocthy, cpheckhy, cphecthy, cpheocthy, cphocthy, cphoithy
    3:    3 forms (  0.04%),      3 tokens (  0.01%), top: aiinod, aiinos, aiios
    4:    3 forms (  0.04%),      3 tokens (  0.01%), top: oraiinam, otaiikam, otaiinam
    5:    3 forms (  0.04%),      3 tokens (  0.01%), top: otolpchy, stolpchy, tolpchy
    6:    3 forms (  0.04%),      3 tokens (  0.01%), top: pdsairy, polairy, posairy
    7:    3 forms (  0.04%),      3 tokens (  0.01%), top: qotokody, qotomody, shotokody
    8:    2 forms (  0.03%),      2 tokens (  0.01%), top: chedaiphy, chekaiphy
    9:    2 forms (  0.03%),      2 tokens (  0.01%), top: cheedals, cheedls
    10:    2 forms (  0.03%),      2 tokens (  0.01%), top: cheoeees, cheoiees
    11:    2 forms (  0.03%),      2 tokens (  0.01%), top: choteosam, qoteosam
    12:    2 forms (  0.03%),      2 tokens (  0.01%), top: ctharad, ctharal
    13:    2 forms (  0.03%),      2 tokens (  0.01%), top: dchodees, fchodees
    14:    2 forms (  0.03%),      2 tokens (  0.01%), top: deeaiir, seeaiir
    15:    2 forms (  0.03%),      2 tokens (  0.01%), top: eeesal, eeesaly
    16:    2 forms (  0.03%),      2 tokens (  0.01%), top: lshodair, tshodair
    17:    2 forms (  0.03%),      2 tokens (  0.01%), top: ockhdar, ockhydar
    18:    2 forms (  0.03%),      2 tokens (  0.01%), top: okaifhhy, okaifhy
    19:    2 forms (  0.03%),      2 tokens (  0.01%), top: okairody, orairody
    20:    2 forms (  0.03%),      2 tokens (  0.01%), top: okeearam, okeedaram
Outputs written with prefix: C:\Users\rod\Documents\Voynich\New Generator 2\mappings_ZLZB_ed1_network

And, I ran the same test on my generator output.

Total vocabulary: 5477
Total tokens: 23717
Number of ED1 components: 31
Largest component size: 5356
Percent vocabulary in largest component: 97.79%
Percent tokens in largest component: 98.72%
Top 20 largest components by forms and token coverage:
    1:  5356 forms ( 97.79%),  23414 tokens ( 98.72%), top: chol, or, cthol, shol, chey, shey, cthey, shoy, chor, cfhol
    2:    45 forms (  0.82%),    122 tokens (  0.51%), top: cphodaiils, cfhodaiils, ckhodaiils, chodaiils, cphadoiil, cphadaiil, cphdaiils, cphodaiil, cphodails, cthodaiils
    3:    25 forms (  0.46%),      90 tokens (  0.38%), top: dlocta, dlocka, dlocty, dlocha, locta, ddlqocta, ddocka, dlcta, dloctas, dlqocta
    4:    7 forms (  0.13%),      16 tokens (  0.07%), top: fsholrcho, sholrcho, fsholecho, fsolrcho, psholrcho, psholrco, solrcho
    5:    4 forms (  0.07%),      8 tokens (  0.03%), top: foeochdor, doeochdor, foeochdon, foeochhor
    6:    4 forms (  0.07%),      6 tokens (  0.03%), top: ssoepy, sesoepy, sseeky, ssoeky
    7:    3 forms (  0.05%),      11 tokens (  0.05%), top: csteiin, csteion, csteiein
    8:    3 forms (  0.05%),      8 tokens (  0.03%), top: ksheoldas, psheoldas, fsheoldas
    9:    3 forms (  0.05%),      4 tokens (  0.02%), top: steddr, steddd, stedor
    10:    2 forms (  0.04%),      4 tokens (  0.02%), top: cchadoiil, cchardoiil
    11:    2 forms (  0.04%),      3 tokens (  0.01%), top: koldam, poldam
    12:    2 forms (  0.04%),      2 tokens (  0.01%), top: kshoche, tshoche
    13:    2 forms (  0.04%),      5 tokens (  0.02%), top: psdol, pasdol
    14:    2 forms (  0.04%),      2 tokens (  0.01%), top: pshaiiram, rshaiiram
    15:    1 forms (  0.02%),      1 tokens (  0.00%), top: ckochy
    16:    1 forms (  0.02%),      2 tokens (  0.01%), top: cshofainy
    17:    1 forms (  0.02%),      1 tokens (  0.00%), top: csokey
    18:    1 forms (  0.02%),      1 tokens (  0.00%), top: dcpyr
    19:    1 forms (  0.02%),      1 tokens (  0.00%), top: kolsheeo
    20:    1 forms (  0.02%),      2 tokens (  0.01%), top: ohcthaiin

And Bram Stoker's Dracula for comparison. Note that the largest ED1 component holds only 33% of the vocabulary, compared with about 87% for Voynich and 98% for the generated text (by tokens: about 82%, versus 97% and 99%).

Total vocabulary: 9246
Total tokens: 154418
Number of ED1 components: 4980
Largest component size: 3058
Percent vocabulary in largest component: 33.07%
Percent tokens in largest component: 81.62%
Top 20 largest components by forms and token coverage:
    1:  3058 forms ( 33.07%),  126042 tokens ( 81.62%), top: the, and, to, of, he, in, that, it, was, as
    2:    20 forms (  0.22%),    725 tokens (  0.47%), top: though, through, thought, brought, thoughts, caught, rough, ought, sought, wrought
    3:    16 forms (  0.17%),      91 tokens (  0.06%), top: stopped, stepped, happen, lapped, slipped, happed, happens, mapped, napped, slapped
    4:    11 forms (  0.12%),      35 tokens (  0.02%), top: bending, winding, bidding, finding, sending, winning, ending, binding, minding, blinding
    5:    10 forms (  0.11%),      25 tokens (  0.02%), top: bringing, sinking, singing, cringing, bringin, clanging, clanking, clinging, ringing, wringing
    6:    10 forms (  0.11%),      88 tokens (  0.06%), top: getting, sitting, setting, ittin, letting, settling, fitting, gettin, sittin, spitting
    7:    10 forms (  0.11%),    366 tokens (  0.24%), top: helsing, telling, helping, rolling, tellin, lolling, tolling, yelling, yelpin, yelping
    8:    9 forms (  0.10%),      31 tokens (  0.02%), top: rising, raising, hiding, riding, aiding, adding, aiming, padding, praising
    9:    9 forms (  0.10%),      70 tokens (  0.05%), top: castle, bottle, battle, castles, cattle, battles, bottles, rattle, rattled
    10:    9 forms (  0.10%),      42 tokens (  0.03%), top: breath, wreath, breathe, wrath, wreaths, breathes, wreathed, breadth, breathed
    11:    7 forms (  0.08%),      36 tokens (  0.02%), top: fierce, piece, pieces, pierced, apiece, fiercer, pierce
    12:    7 forms (  0.08%),    109 tokens (  0.07%), top: looking, lookin, licking, booming, mocking, locking, looming
    13:    7 forms (  0.08%),      42 tokens (  0.03%), top: falling, willing, calling, killing, callin, chilling, filling
    14:    7 forms (  0.08%),      38 tokens (  0.02%), top: handed, landed, candle, handle, candles, handled, handles
    15:    7 forms (  0.08%),      30 tokens (  0.02%), top: putting, cutting, shutting, cuttin, cuttings, jotting, jutting
    16:    7 forms (  0.08%),      9 tokens (  0.01%), top: depite, despite, deity, depity, deputy, despise, despises
    17:    7 forms (  0.08%),      11 tokens (  0.01%), top: shipping, drooping, dropping, dipping, dripping, shopping, tripping
    18:    7 forms (  0.08%),      41 tokens (  0.03%), top: edge, edges, pledged, pledge, ledge, ledger, sledge
    19:    7 forms (  0.08%),      11 tokens (  0.01%), top: humble, fumbled, humbly, mumbled, stumbled, tumble, tumbled
    20:    6 forms (  0.06%),      21 tokens (  0.01%), top: slightly, brightly, lightly, rightly, nightly, tightly
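
The percentage lines in the three reports above can be re-derived from the printed counts, which makes a quick sanity check when comparing corpora. The snippet below only redoes that arithmetic; the dictionary layout and the `shares` helper are illustrative, and the numbers are copied from the reports.

```python
# (total vocabulary, forms in largest component, total tokens, tokens in largest component)
reports = {
    "Voynich (Zandbergen/Landini)": (7582, 6567, 36105, 35080),
    "Generator output": (5477, 5356, 23717, 23414),
    "Dracula": (9246, 3058, 154418, 126042),
}

def shares(vocab, largest_forms, tokens, largest_tokens):
    """Percent of vocabulary and percent of tokens inside the largest component."""
    return (round(100 * largest_forms / vocab, 2),
            round(100 * largest_tokens / tokens, 2))

for name, row in reports.items():
    pct_vocab, pct_tokens = shares(*row)
    print(f"{name}: {pct_vocab:.2f}% of vocabulary, {pct_tokens:.2f}% of tokens")
```

The gap between the corpora is much wider by vocabulary than by tokens, because in all three texts the most frequent words sit inside the largest component.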

Now, to be completely honest, here's where my generator is not getting it right.  It may be over-regularizing morphological patterns.  Plus the low ED1 component count in my generated text is kinda expected.  The generator tries to not create new words out of thin air.  As a result, the combinatorial rules are so regular that almost any token can be nudged into any other through small edits, which the real Voynich doesn't quite do.  Again, I didn't create the generator to pass this test but, with a bit of refinement, I think it can.

My generator

   

Dracula

   

I have csv files to go with this if anyone wants to dig deeper, just let me know.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Jorge_Stolfi - 23-05-2026

(23-05-2026, 01:37 PM)Dunsel Wrote: And I had codex put together a python file using my previous test that gives this all in visual representation.

Nice! 

Could you please explain those graphs a bit more?  How should we interpret the lengths of the lines and the sizes of the nodes?

All the best, --stolfi


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(23-05-2026, 02:25 PM)Jorge_Stolfi Wrote:
(23-05-2026, 01:37 PM)Dunsel Wrote: And I had codex put together a python file using my previous test that gives this all in visual representation.

Nice! 

Could you please explain those graphs a bit more?  How should we interpret the lengths of the lines and the sizes of the nodes?

All the best, --stolfi

The node sizes are proportional to word frequency. Larger nodes are words that occur more often.  The lines represent ED1 relationships. Two nodes are connected if one form can be transformed into the other by a single insertion, deletion, or substitution.  The actual physical lengths of the lines are not meaningful by themselves. The graph layout uses a spring-force algorithm that tries to pull highly connected regions together while pushing weakly connected regions apart. So clusters that appear close together are generally more densely interconnected through ED1 relationships, while detached clusters have relatively few connections to the rest of the network.

And I'll admit, that's an AI description of the chart, and I worked with Codex to come up with a matplotlib chart that looked readable. The first chart it tried looked like a giant cat hairball with everything in a circle and lines going everywhere.  But the data is coming from one of my mappings files or a Gutenberg text.

So, English fractures into many disconnected morphological islands. My generated text is more center clustered than English but still has the outlying islands.  Voynich instead forms one overwhelmingly dominant connected mutation network.  Essentially, my generator is getting some of it right without trying, but not all of it.  I suspect I haven't figured out enough scribal 'habits' yet.  But, I think this shows the basic method the generator is using is 'plausible', which was my main goal.
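
For anyone wanting to rebuild charts like these, the graph data described above (one node per word form, sized in proportion to frequency, and one edge per ED1 pair) can be sketched with the standard library alone. This is a guess at the general shape, not the actual plotting file. `build_graph`, the `scale` parameter, and the toy text are hypothetical, and the layout step is left out on purpose: the node sizes and edge list would be handed to a spring-force layout routine (networkx's `spring_layout` is one such) and then drawn.

```python
from collections import Counter
from itertools import combinations

def is_ed1(a: str, b: str) -> bool:
    """True if a and b differ by exactly one insertion, deletion, or substitution."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # exactly one substitution
    return a[i:] == b[i + 1:]          # exactly one extra character in b

def build_graph(counts, scale=200.0):
    """Return (node_sizes, edges) for the ED1 network of a word-count table."""
    max_count = max(counts.values())
    # Node size is proportional to frequency; the most frequent word gets `scale`.
    sizes = {w: scale * c / max_count for w, c in counts.items()}
    # Brute-force pair check: fine for one word family, slow for a full vocabulary.
    edges = [(a, b) for a, b in combinations(sorted(counts), 2) if is_ed1(a, b)]
    return sizes, edges

# Toy text, invented for illustration only.
toy = Counter("daiin dain daiin aiin chol chor daiin chol".split())
sizes, edges = build_graph(toy)
print(sizes)
print(edges)
```

Because edge lengths come out of the layout step rather than the data, only the presence or absence of an edge carries meaning, which matches the description of the charts above.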

Screenshot of the Python file so you can tell it's not completely AI slop.