A One-Page Ledger Method for Generating Voynich-Like Text - Printable Version
+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Analysis of the text (https://www.voynich.ninja/forum-41.html)
+--- Thread: A One-Page Ledger Method for Generating Voynich-Like Text (/thread-5752.html)

RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 22-05-2026

(22-05-2026, 05:07 PM)Dunsel Wrote: No, I meeeeeeannnnnnnnnnnn youu youu youu youu youu need to youu need to make surrreeee it doesn't look stupid.

Does "qokeedy.qokeedy.qokedy.qokedy.qokeedy.ldy" look smart? What about "qokeedy.qotedy.qokeedy.qokeedy.qokeey.s,aiin.al"? Or "pShdy.ofchdy.qokedy.qoteedy.qokedy.qoltedy.qotedy.oky"?

RE: A One-Page Ledger Method for Generating Voynich-Like Text - DG97EEB - 22-05-2026

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

RE: A One-Page Ledger Method for Generating Voynich-Like Text - ReneZ - 22-05-2026

(22-05-2026, 01:31 PM)Dunsel Wrote: our estimate for the theoretical space pretty accurate. I calculated 203 possible variants. But your ~10 alternatives is underestimating.

Fair enough. When oversimplifying things, they become oversimplistic... How many variants will cover for the most frequent 95%? I guess the top ten won't do it, but it will be relatively close.

(22-05-2026, 01:31 PM)Dunsel Wrote: And they're not evenly distributed. Very distinctly uneven.

This is also important, though it will be hard to say more than that this indicates the existence of rules.

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(22-05-2026, 10:40 PM)oshfdk Wrote: (22-05-2026, 05:07 PM)Dunsel Wrote: No, I meeeeeeannnnnnnnnnnn youu youu youu youu youu need to youu need to make surrreeee it doesn't look stupid.

Alcohol has this way of making you forget the rules, doesn't it? Having a Moosehead and some Jager so I speak from experience. I incorporated "don't look stupid" rules based on Scribe 1.
I didn't say that all the Voynich scribes subscribed to the same rules. You copy and paste 30,000 words and you might get a tad thirsty. Plus, those are <ed> so that has to be Scribe 2+. And they did not play by Scribe 1 rules. Scribe 1 seems to have been much more alcohol tolerant.

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(22-05-2026, 10:47 PM)DG97EEB Wrote: Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

You cheated, Ed. You used capitals! That's a bunch of *cough* bull!

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(22-05-2026, 11:56 PM)ReneZ Wrote: (22-05-2026, 01:31 PM)Dunsel Wrote: our estimate for the theoretical space pretty accurate. I calculated 203 possible variants. But your ~10 alternatives is underestimating.

For just chedy and its variants?
Top 10 variants of chedy coverage.
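
For anyone who wants to redo the coverage count on their own word list, here is a minimal sketch of the arithmetic: sort the variants of a family by frequency and count how many are needed to reach a target share of the family's tokens. The function name `variants_needed` and the toy counts are made up for illustration. They are not the chedy numbers from this thread, and the sketch does not say how a "family" should be defined.

```python
from collections import Counter

def variants_needed(counts, target=0.95):
    """Return (k, share): the smallest number k of most-frequent variants
    whose combined token count reaches `target` of the family total,
    together with the share those k variants actually cover."""
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    covered = 0
    for k, (_form, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return k, covered / total
    return len(counts), 1.0

# Toy counts only (hypothetical placeholder forms, not real corpus data).
toy = Counter({"v1": 50, "v2": 30, "v3": 10, "v4": 6, "v5": 3, "v6": 1})
print(variants_needed(toy))  # (4, 0.96)
```

With real data, `counts` would be a `Counter` of the tokens belonging to one family, and the same call with `target=0.95` answers the "how many variants cover 95%" question directly.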
After running the test to answer your question, I decided to run a much bigger test on the entire corpus without specifying a family. I haven't run this test before so this kinda amazes me.
Almost the entire Voynich running text belongs to one enormous ED1 mutation network. The manuscript does not explore all theoretical mutations equally. But the vocabulary remains mutation-connected almost everywhere.

RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 23-05-2026

(23-05-2026, 01:49 AM)Dunsel Wrote: Scribe 1 seems to have been much more alcohol tolerant.

I'm not sure hand 1 is much better at not looking stupid:

<f15v.5,+P0> otchor.chor.chor.ytchor.cthy.s
<f42r.13,+P0> qopor.shol.shot.shol.shol.daiin.dain.s.<->cheam
<f42r.21,+P0> shol.chol.chol.shol.{ct}oiin.{c'o}s.odan
<f44r.9,+P0> otchol.ol,dchckhy.qoky.qotchy.qokchy.qokyd
<f47r.7,+P0> schesy.kchor.cthaiin.chol.chol.chol.chor.{ck@191;h}ey
<f54r.10,+P0> tor.ol.dol.or.chol.chol.ckhol.okol.oky.<->ytchor.ol,koldy

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(22-05-2026, 11:56 PM)ReneZ Wrote: How many variants will cover for the most frequent 95%?

And I had Codex put together a Python file using my previous test that gives this all in visual representation.

Here's the chedy network of ED1.

Here's the daiin ED1 network.

And here's the big network. The group off to the right is, I believe, a chckhy group that I'm thinking has only a weak link to the big network. I'm not sure just yet what the isolated one on the left is.

And here's the report the file produced. I ran it using Zandbergen/Landini.

Total vocabulary: 7582
Total tokens: 36105
Number of ED1 components: 976
Largest component size: 6567
Percent vocabulary in largest component: 86.61%
Percent tokens in largest component: 97.16%
Top 20 largest components by forms and token coverage:
1: 6567 forms ( 86.61%), 35080 tokens ( 97.16%), top: daiin, ol, chedy, aiin, shedy, chol, ar, or, chey, dar
2: 8 forms ( 0.11%), 8 tokens ( 0.02%), top: ckheckhy, ckhockhy, ckhocthy, cpheckhy, cphecthy, cpheocthy, cphocthy, cphoithy
3: 3 forms ( 0.04%), 3 tokens ( 0.01%), top: aiinod, aiinos, aiios
4: 3 forms ( 0.04%), 3 tokens ( 0.01%), top: oraiinam, otaiikam, otaiinam
5: 3 forms ( 0.04%), 3 tokens ( 0.01%), top: otolpchy, stolpchy, tolpchy
6: 3 forms ( 0.04%), 3 tokens ( 0.01%), top: pdsairy, polairy, posairy
7: 3 forms ( 0.04%), 3 tokens ( 0.01%), top: qotokody, qotomody, shotokody
8: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: chedaiphy, chekaiphy
9: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: cheedals, cheedls
10: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: cheoeees, cheoiees
11: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: choteosam, qoteosam
12: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: ctharad, ctharal
13: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: dchodees, fchodees
14: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: deeaiir, seeaiir
15: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: eeesal, eeesaly
16: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: lshodair, tshodair
17: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: ockhdar, ockhydar
18: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: okaifhhy, okaifhy
19: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: okairody, orairody
20: 2 forms ( 0.03%), 2 tokens ( 0.01%), top: okeearam, okeedaram
Outputs written with prefix: C:\Users\rod\Documents\Voynich\New Generator 2\mappings_ZLZB_ed1_network

And, I ran the same test on my generator output.

Total vocabulary: 5477
Total tokens: 23717
Number of ED1 components: 31
Largest component size: 5356
Percent vocabulary in largest component: 97.79%
Percent tokens in largest component: 98.72%
Top 20 largest components by forms and token coverage:
1: 5356 forms ( 97.79%), 23414 tokens ( 98.72%), top: chol, or, cthol, shol, chey, shey, cthey, shoy, chor, cfhol
2: 45 forms ( 0.82%), 122 tokens ( 0.51%), top: cphodaiils, cfhodaiils, ckhodaiils, chodaiils, cphadoiil, cphadaiil, cphdaiils, cphodaiil, cphodails, cthodaiils
3: 25 forms ( 0.46%), 90 tokens ( 0.38%), top: dlocta, dlocka, dlocty, dlocha, locta, ddlqocta, ddocka, dlcta, dloctas, dlqocta
4: 7 forms ( 0.13%), 16 tokens ( 0.07%), top: fsholrcho, sholrcho, fsholecho, fsolrcho, psholrcho, psholrco, solrcho
5: 4 forms ( 0.07%), 8 tokens ( 0.03%), top: foeochdor, doeochdor, foeochdon, foeochhor
6: 4 forms ( 0.07%), 6 tokens ( 0.03%), top: ssoepy, sesoepy, sseeky, ssoeky
7: 3 forms ( 0.05%), 11 tokens ( 0.05%), top: csteiin, csteion, csteiein
8: 3 forms ( 0.05%), 8 tokens ( 0.03%), top: ksheoldas, psheoldas, fsheoldas
9: 3 forms ( 0.05%), 4 tokens ( 0.02%), top: steddr, steddd, stedor
10: 2 forms ( 0.04%), 4 tokens ( 0.02%), top: cchadoiil, cchardoiil
11: 2 forms ( 0.04%), 3 tokens ( 0.01%), top: koldam, poldam
12: 2 forms ( 0.04%), 2 tokens ( 0.01%), top: kshoche, tshoche
13: 2 forms ( 0.04%), 5 tokens ( 0.02%), top: psdol, pasdol
14: 2 forms ( 0.04%), 2 tokens ( 0.01%), top: pshaiiram, rshaiiram
15: 1 forms ( 0.02%), 1 tokens ( 0.00%), top: ckochy
16: 1 forms ( 0.02%), 2 tokens ( 0.01%), top: cshofainy
17: 1 forms ( 0.02%), 1 tokens ( 0.00%), top: csokey
18: 1 forms ( 0.02%), 1 tokens ( 0.00%), top: dcpyr
19: 1 forms ( 0.02%), 1 tokens ( 0.00%), top: kolsheeo
20: 1 forms ( 0.02%), 2 tokens ( 0.01%), top: ohcthaiin

And, Bram Stoker's Dracula for comparison.
Note the largest ED1 component holds only 33% of the vocabulary, compared to 87% for Voynich and 98% for the generated text.

Total vocabulary: 9246
Total tokens: 154418
Number of ED1 components: 4980
Largest component size: 3058
Percent vocabulary in largest component: 33.07%
Percent tokens in largest component: 81.62%
Top 20 largest components by forms and token coverage:
1: 3058 forms ( 33.07%), 126042 tokens ( 81.62%), top: the, and, to, of, he, in, that, it, was, as
2: 20 forms ( 0.22%), 725 tokens ( 0.47%), top: though, through, thought, brought, thoughts, caught, rough, ought, sought, wrought
3: 16 forms ( 0.17%), 91 tokens ( 0.06%), top: stopped, stepped, happen, lapped, slipped, happed, happens, mapped, napped, slapped
4: 11 forms ( 0.12%), 35 tokens ( 0.02%), top: bending, winding, bidding, finding, sending, winning, ending, binding, minding, blinding
5: 10 forms ( 0.11%), 25 tokens ( 0.02%), top: bringing, sinking, singing, cringing, bringin, clanging, clanking, clinging, ringing, wringing
6: 10 forms ( 0.11%), 88 tokens ( 0.06%), top: getting, sitting, setting, ittin, letting, settling, fitting, gettin, sittin, spitting
7: 10 forms ( 0.11%), 366 tokens ( 0.24%), top: helsing, telling, helping, rolling, tellin, lolling, tolling, yelling, yelpin, yelping
8: 9 forms ( 0.10%), 31 tokens ( 0.02%), top: rising, raising, hiding, riding, aiding, adding, aiming, padding, praising
9: 9 forms ( 0.10%), 70 tokens ( 0.05%), top: castle, bottle, battle, castles, cattle, battles, bottles, rattle, rattled
10: 9 forms ( 0.10%), 42 tokens ( 0.03%), top: breath, wreath, breathe, wrath, wreaths, breathes, wreathed, breadth, breathed
11: 7 forms ( 0.08%), 36 tokens ( 0.02%), top: fierce, piece, pieces, pierced, apiece, fiercer, pierce
12: 7 forms ( 0.08%), 109 tokens ( 0.07%), top: looking, lookin, licking, booming, mocking, locking, looming
13: 7 forms ( 0.08%), 42 tokens ( 0.03%), top: falling, willing, calling, killing, callin, chilling, filling
14: 7 forms ( 0.08%), 38 tokens ( 0.02%), top: handed, landed, candle, handle, candles, handled, handles
15: 7 forms ( 0.08%), 30 tokens ( 0.02%), top: putting, cutting, shutting, cuttin, cuttings, jotting, jutting
16: 7 forms ( 0.08%), 9 tokens ( 0.01%), top: depite, despite, deity, depity, deputy, despise, despises
17: 7 forms ( 0.08%), 11 tokens ( 0.01%), top: shipping, drooping, dropping, dipping, dripping, shopping, tripping
18: 7 forms ( 0.08%), 41 tokens ( 0.03%), top: edge, edges, pledged, pledge, ledge, ledger, sledge
19: 7 forms ( 0.08%), 11 tokens ( 0.01%), top: humble, fumbled, humbly, mumbled, stumbled, tumble, tumbled
20: 6 forms ( 0.06%), 21 tokens ( 0.01%), top: slightly, brightly, lightly, rightly, nightly, tightly

Now, to be completely honest, here's where my generator is not getting it right. It may be over-regularizing morphological patterns. Plus the low ED1 component count in my generated text is kinda expected. The generator tries to not create new words out of thin air. As a result, the combinatorial rules are so regular that almost any token can be nudged into any other through small edits, which the real Voynich doesn't quite do. Again, I didn't create the generator to pass this test but, with a bit of refinement, I think it can.

(Images: My generator; Dracula)

I have CSV files to go with this if anyone wants to dig deeper, just let me know.

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Jorge_Stolfi - 23-05-2026

(23-05-2026, 01:37 PM)Dunsel Wrote: And I had Codex put together a Python file using my previous test that gives this all in visual representation.

Nice! Could you please explain those graphs a bit more? How should we interpret the lengths of the lines and the sizes of the nodes?

All the best, --stolfi

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 23-05-2026

(23-05-2026, 02:25 PM)Jorge_Stolfi Wrote: (23-05-2026, 01:37 PM)Dunsel Wrote: And I had Codex put together a Python file using my previous test that gives this all in visual representation.

The node sizes are proportional to word frequency. Larger nodes are words that occur more often.

The lines represent ED1 relationships. Two nodes are connected if one form can be transformed into the other by a single insertion, deletion, or substitution.

The actual physical lengths of the lines are not meaningful by themselves. The graph layout uses a spring-force algorithm that tries to pull highly connected regions together while pushing weakly connected regions apart. So clusters that appear close together are generally more densely interconnected through ED1 relationships, while detached clusters have relatively few connections to the rest of the network.

And I'll admit, that's an AI description of the chart, and I worked with Codex to come up with a matplotlib chart that made it look readable. The first chart it tried looked like a giant cat hairball with everything in a circle and lines going everywhere. But the data is coming from one of my mappings files or a Gutenberg text.

So, English fractures into many disconnected morphological islands. My generated text is more center clustered than English but still has the outlying islands. Voynich instead forms one overwhelmingly dominant connected mutation network.

Essentially, my generator is getting some of it right without trying, but not all of it. I suspect I haven't figured out enough scribal 'habits' yet. But I think this shows the basic method the generator is using is 'plausible', which was my main goal.

Screenshot of the Python file so you can tell it's not completely AI slop.
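
For readers who want to try the ED1 component count on their own text, here is a minimal sketch of the idea described above: link two word forms when a single insertion, deletion, or substitution turns one into the other, then group the forms into connected components. This is not the script used in this thread (that one is only shown as a screenshot). The names `ed1_components` and `report` and the toy token list are made up for illustration, and the sketch assumes the input is already a flat list of tokens.

```python
from collections import Counter, defaultdict

def ed1_components(counts):
    """Group word forms into connected components of the ED1 graph.
    `counts` maps each form to its token count."""
    parent = {w: w for w in counts}

    def find(w):
        # Union-find lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Substitution: forms that match once one position is masked
    # differ by exactly one character. "\0" is assumed absent from the text.
    buckets = defaultdict(list)
    for w in counts:
        for i in range(len(w)):
            buckets[w[:i] + "\0" + w[i + 1:]].append(w)
    for group in buckets.values():
        for other in group[1:]:
            union(group[0], other)

    # Insertion/deletion: dropping one character yields another known form.
    for w in counts:
        for i in range(len(w)):
            shorter = w[:i] + w[i + 1:]
            if shorter in counts:
                union(w, shorter)

    comps = defaultdict(list)
    for w in counts:
        comps[find(w)].append(w)
    # Largest first; ties broken by token coverage.
    return sorted(comps.values(),
                  key=lambda c: (-len(c), -sum(counts[w] for w in c)))

def report(tokens):
    """Summary figures of the same kind as the reports in this thread."""
    counts = Counter(tokens)
    comps = ed1_components(counts)
    largest = comps[0]
    total = sum(counts.values())
    return {
        "vocabulary": len(counts),
        "tokens": total,
        "components": len(comps),
        "largest_size": len(largest),
        "pct_vocab": 100 * len(largest) / len(counts),
        "pct_tokens": 100 * sum(counts[w] for w in largest) / total,
    }

# Toy token list for illustration only, not a real transliteration sample.
toy_tokens = "chedy shedy chey chedy daiin dain aiin qokeedy".split()
print(report(toy_tokens))
```

On the toy list this finds three components: chedy/shedy/chey, daiin/dain/aiin, and the isolated qokeedy. Masking and deletion lookups are used instead of comparing every pair of forms, which keeps the work roughly proportional to the total number of characters in the vocabulary. Drawing the network with frequency-sized nodes and a spring layout would be a separate step on top of these components.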