The Voynich Ninja

Full Version: A One-Page Ledger Method for Generating Voynich-Like Text
I always welcome constructive criticism. It is indeed a weakness of my rule-based combinatorics that unique words and hapax legomena are left out. One could probably add additional rules to counteract this, but the more rules you implement, the less of a practical method for the author can be derived from them. However, it does look somewhat “synthetic” (“too good”)—it lacks the “human factor.” Unfortunately, that cannot be replicated (afaik).
(31-05-2026, 01:21 AM)bi3mw Wrote: Unfortunately, that cannot be replicated (afaik).

Well, here's one thing we know for certain. 1 or more humans created this. How and why is up for debate. But if one human was able to create it, then we should be able to replicate it.
(31-05-2026, 04:02 AM)Dunsel Wrote: But if one human was able to create it, then we should be able to replicate it.

Depending on what you mean by "replicate", if the process was not entirely algorithmic and also meaningless, there is a good chance that we will never be able to say with certainty we have the method.
(31-05-2026, 05:22 AM)rikforto Wrote: Depending on what you mean by "replicate", if the process was not entirely algorithmic and also meaningless, there is a good chance that we will never be able to say with certainty we have the method

Here's what I mean by replicate. This is output from my analyzer, which is in the repo in the OP. I had GPT convert the output to BBCode to format it for this forum. I'll do 3 more posts with other pages and scribes to show this isn't a one-off occurrence.

This shows the source for every word on f1v. Only 1 word doesn't come from f1r, and only 2 words are edit distance 2. Note: it only looks at words of length >= 3, as words of length 1 and 2 can be very easily replicated.

Interpretation: This table shows that with the exception of 1 token, all other tokens are a product of copy/mutate from f1r or a copy/mutate from a previously written token on f1v.
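
The check described above can be sketched roughly as follows. This is a minimal illustration, not the analyzer from the repo: the gallows set `ktpf`, the function names, and the tie-breaking (lowest distance wins, first candidate seen on ties) are all assumptions. The length filter is applied to the raw token, which is also an assumption.

```python
GALLOWS = set("ktpf")  # assumed EVA gallows characters


def strip_gallows(token: str) -> str:
    """Remove every gallows character, so 'cphoy' becomes 'choy'."""
    return "".join(ch for ch in token if ch not in GALLOWS)


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def classify(token, prior_tokens, min_len=3, max_ed=2):
    """Return (label, parent) for one token against previously written tokens."""
    if len(token) < min_len:
        return ("SKIPPED", None)  # short tokens are not tested
    stripped = strip_gallows(token)
    best = None
    for parent in prior_tokens:
        d = edit_distance(stripped, strip_gallows(parent))
        if d <= max_ed and (best is None or d < best[0]):
            best = (d, parent)
    if best is None:
        return ("NO MATCH", None)
    return (f"ED{best[0]}", best[1])


# Two rows from the f1v listing, used only as a sanity check.
print(classify("kchsy", ["cphoy"]))           # chsy vs choy
print(classify("char", ["cphoy", "ckhar"]))   # char vs char
```

In this sketch a token is tested against whatever list of earlier tokens is passed in, so the same function covers both the source-sheet check and the same-page check.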

If the "algorithmic method" can be figured out for how they did this, then replication (not duplication) will be possible.  And that's what I'm trying to accomplish with my generator.  So far, I haven't found an exact method and, as Torsten Timm has pointed out, it's likely a very human method which is not going to be duplicated in code easily. 

I've stated this before but I'll keep clarifying: I'm not saying this was the method for creating the Voynich. I'm simply asking if it could have been the method. And if copy/mutate + ledger becomes plausible, then the objections that such a process is impossible or even unlikely become much harder to maintain.

f1v provenance test

Green = ED0 / exact copy after gallows stripping
Blue = ED1 after gallows stripping
Orange = ED2 after gallows stripping
Red = no prior match

Core tokens requiring prior source


f1v token | stripped | f1v loc | parent | parent stripped | source | method
kchsy | chsy | 1:1 | cphoy | choy | f1r:7:4 | ED1
chadaiin | chadaiin | 1:2 | chodaiin | chodaiin | f1r:16:6 | ED1
oltchey | olchey | 1:4 | okchey | ochey | f1r:25:4 | ED1
char | char | 1:5 | ckhar | char | f1r:2:2 | ED0
yteeay | yeeay | 2:1 | NO PRIOR MATCH | NO MATCH
ochy | ochy | 2:4 | cphy | chy | f1r:13:4 | ED1
dcho | dcho | 2:5 | ycho | ycho | f1r:17:1 | ED1
lkody | lody | 2:6 | teody | eody | f1r:10:3 | ED1
okodar | oodar | 2:7 | odar | odar | f1r:7:1 | ED1
chody | chody | 2:8 | cphoy | choy | f1r:7:4 | ED1
dksheey | dsheey | 3:5 | sheky | shey | f1r:3:2 | ED2
dal | dal | 3:8 | dal | dal | f1r:25:5 | ED0
chokeo | choeo | 4:2 | chotey | choey | f1r:27:4 | ED1
dair | dair | 4:3 | dair | dair | f1r:5:1 | ED0
sochey | sochey | 4:5 | sckhey | schey | f1r:12:7 | ED1
potoy | ooy | 5:1 | otol | ool | f1r:21:1 | ED1
shol | shol | 5:2 | shol | shol | f1r:1:5 | ED0
cphoal | choal | 5:4 | cfhol | chol | f1r:9:3 | ED1
otoaiin | ooaiin | 5:8 | ooiin | ooiin | f1r:4:1 | ED1
shoshy | shoshy | 5:9 | shoshy | shoshy | f1r:11:7 | ED0
choeee | choeee | 7:3 | chotey | choey | f1r:27:4 | ED2
daiin | daiin | 8:11 | daiin | daiin | f1r:4:5 | ED0
taor | aor | 10:1 | ar | ar | f1r:1:3 | ED1
chotchey | chochey | 10:2 | chocthy | chochy | f1r:8:4 | ED1
chodar | chodar | 10:7 | ctholdar | choldar | f1r:24:8 | ED1

Same-page local derivations

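f1v token | stripped | loc | local parent | parent stripped | parent loc | method
cfhar | char | 1:6 | char | char | 1:5 | ED0
char | char | 2:2 | char | char | 1:5 | ED0
ckhy | chy | 3:1 | kchsy | chsy | 1:1 | ED1
ckho | cho | 3:2 | dcho | dcho | 2:5 | ED1
ckhy | chy | 3:3 | ckhy | chy | 3:1 | ED0
shy | shy | 3:4 | ckhy | chy | 3:1 | ED1
cthy | chy | 3:6 | ckhy | chy | 3:1 | ED0
kotchody | ochody | 3:7 | chody | chody | 2:8 | ED1
dol | dol | 4:1 | dal | dal | 3:8 | ED1
dam | dam | 4:4 | dal | dal | 3:8 | ED1
chokody | choody | 4:6 | chody | chody | 2:8 | ED1
dair | dair | 5:3 | dair | dair | 4:3 | ED0
dar | dar | 5:5 | dal | dal | 3:8 | ED1
chey | chey | 5:6 | kchsy | chsy | 1:1 | ED1
tody | ody | 5:7 | lkody | lody | 2:6 | ED1
choky | choy | 6:1 | kchsy | chsy | 1:1 | ED0
chol | chol | 6:2 | ckho | cho | 3:2 | ED1
cthol | chol | 6:3 | chol | chol | 6:2 | ED0
shol | shol | 6:4 | shol | shol | 5:2 | ED0
okal | oal | 6:5 | dal | dal | 3:8 | ED1
dolchey | dolchey | 6:6 | oltchey | olchey | 1:4 | ED1
chodo | chodo | 6:7 | chody | chody | 2:8 | ED1
lol | lol | 6:8 | dol | dol | 4:1 | ED1
chy | chy | 6:9 | ckhy | chy | 3:1 | ED0
cthy | chy | 6:10 | ckhy | chy | 3:1 | ED0
cheol | cheol | 7:4 | chol | chol | 6:2 | ED1
dol | dol | 7:5 | dol | dol | 4:1 | ED0
cthey | chey | 7:6 | chey | chey | 5:6 | ED0
ykol | yol | 7:7 | dol | dol | 4:1 | ED1
dol | dol | 7:8 | dol | dol | 4:1 | ED0
dolo | dolo | 7:9 | dol | dol | 4:1 | ED1
ykol | yol | 7:10 | ykol | yol | 7:7 | ED0
okol | ool | 8:1 | dol | dol | 4:1 | ED1
shol | shol | 8:2 | shol | shol | 5:2 | ED0
kol | ol | 8:3 | dol | dol | 4:1 | ED1
kechy | echy | 8:4 | ochy | ochy | 2:4 | ED1
chol | chol | 8:5 | chol | chol | 6:2 | ED0
chol | chol | 8:7 | chol | chol | 6:2 | ED0
cthol | chol | 8:8 | chol | chol | 6:2 | ED0
chody | chody | 8:9 | chody | chody | 2:8 | ED0
chol | chol | 8:10 | chol | chol | 6:2 | ED0
shor | shor | 9:1 | shol | shol | 5:2 | ED1
okol | ool | 9:2 | okol | ool | 8:1 | ED0
chol | chol | 9:3 | chol | chol | 6:2 | ED0
dol | dol | 9:4 | dol | dol | 4:1 | ED0
dar | dar | 9:6 | dar | dar | 5:5 | ED0
shol | shol | 9:7 | shol | shol | 5:2 | ED0
dchor | dchor | 9:8 | dcho | dcho | 2:5 | ED1
otcho | ocho | 9:9 | ochy | ochy | 2:4 | ED1
dar | dar | 9:10 | dar | dar | 5:5 | ED0
shody | shody | 9:11 | chody | chody | 2:8 | ED1
dal | dal | 10:3 | dal | dal | 3:8 | ED0
chody | chody | 10:4 | chody | chody | 2:8 | ED0
schody | schody | 10:5 | chody | chody | 2:8 | ED1
pol | ol | 10:6 | kol | ol | 8:3 | ED0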
This is for f20v.  It's showing that the sources for the words on this page come from these sheets:

  q1s1: +17  [CORE]
  q1s3: +1  [RESIDUE]
  q1s4: +2  [RESIDUE]
  q2s2: +1  [RESIDUE]
  q3s1: +1  [RESIDUE]

Residue tokens are those that don't come from the core sheet on the initial check, and that's how they're listed in the table.

RESIDUE TOKENS (NEWLY ADDED BY RESIDUE SHEETS)
  choraly | stripped choraly | f20v:8:2 -> chodaly | stripped chodaly | f3v:6:4 | q1s3 | ED1
  choldy | stripped choldy | f20v:1:7 -> cpholdy | stripped choldy | f4r:8:5 | q1s4 | ED0
  shain | stripped shain | f20v:8:1 -> shain | stripped shain | f4r:3:7 | q1s4 | ED0
  choiin | stripped choiin | f20v:2:5 -> choiin | stripped choiin | f10v:2:5 | q2s2 | ED0
  opydy | stripped oydy | f20v:1:8 -> opydy | stripped oydy | f17r:1:3 | q3s1 | ED0

After a recheck is run, all of those residue tokens have parents on quire 1, sheet 1.

RESIDUE CORE-RECHECK
  RESOLVED ED1: choldy | stripped choldy | f20v:1:7 -> chody | stripped chody | f1v:2:8 | q1s1
  RESOLVED ED1: opydy | stripped oydy | f20v:1:8 -> tody | stripped ody | f1v:5:7 | q1s1
  RESOLVED ED1: choiin | stripped choiin | f20v:2:5 -> chtaiin | stripped chaiin | f1r:2:5 | q1s1
  RESOLVED ED1: shain | stripped shain | f20v:8:1 -> shaiin | stripped shaiin | f1r:22:2 | q1s1
  RESOLVED ED2: choraly | stripped choraly | f20v:8:2 -> cthoary | stripped choary | f1r:3:6 | q1s1

Which reduces the retained core source to 1 sheet: quire 1, sheet 1.

Again, only tokens of length >= 3 were checked.
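
One way to read the sheet classification and recheck steps is as a greedy cover followed by a relaxed second pass. The sketch below is only that reading, not the analyzer's actual logic: the greedy selection, the rule that the first pass only accepts a token's minimum-distance sheets, and the names `initial_cover` and `core_recheck` are assumptions. The input is a small subset of the f20v listing.

```python
from collections import Counter


def initial_cover(best_ed):
    """Greedy pass: keep picking the sheet that explains the most uncovered tokens.

    best_ed maps token -> {sheet: smallest edit distance to a parent on that sheet}.
    A token only accepts sheets that reach its overall minimum distance (assumption).
    """
    options = {
        tok: {s for s, d in eds.items() if d == min(eds.values())}
        for tok, eds in best_ed.items() if eds
    }
    uncovered = set(options)
    picks = []
    while uncovered:
        gain = Counter(s for tok in uncovered for s in options[tok])
        # sorted() makes ties deterministic: the alphabetically first sheet wins
        sheet, n = max(sorted(gain.items()), key=lambda kv: kv[1])
        picks.append((sheet, n))
        uncovered = {tok for tok in uncovered if sheet not in options[tok]}
    return picks


def core_recheck(best_ed, core, max_ed=2):
    """Second pass: can tokens first assigned elsewhere be explained by the core sheet?"""
    resolved, unresolved = {}, []
    for tok, eds in best_ed.items():
        if not eds or eds.get(core) == min(eds.values()):
            continue  # nothing to check, or already explained by the core sheet
        if core in eds and eds[core] <= max_ed:
            resolved[tok] = eds[core]
        else:
            unresolved.append(tok)
    return resolved, unresolved


# Subset of the f20v results: three core tokens plus the five residue tokens.
best_ed = {
    "char": {"q1s1": 0}, "dar": {"q1s1": 0}, "okoy": {"q1s1": 0},
    "choldy": {"q1s4": 0, "q1s1": 1},
    "shain": {"q1s4": 0, "q1s1": 1},
    "choraly": {"q1s3": 1, "q1s1": 2},
    "choiin": {"q2s2": 0, "q1s1": 1},
    "opydy": {"q3s1": 0, "q1s1": 1},
}
picks = initial_cover(best_ed)
resolved, unresolved = core_recheck(best_ed, core=picks[0][0])
print(picks)
print(resolved, unresolved)
```

The point of separating the two passes is that the first one prefers the closest parent anywhere, while the second asks the weaker question of whether the core sheet alone would have been enough.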

f20v provenance test

Green = ED0 / exact copy after gallows stripping
Blue = ED1 after gallows stripping
Orange = ED2 after gallows stripping
Red = no prior match

Core tokens requiring prior source

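f20v token | stripped | f20v loc | parent | parent stripped | source | method
faiis | aiis | 1:1 | kaiin | aiin | f1r:23:6 | ED1
okoy | ooy | 1:3 | potoy | ooy | f1v:5:1 | ED0
shy | shy | 1:4 | kshy | shy | f1r:16:3 | ED0
opchy | ochy | 1:5 | ochy | ochy | f1v:2:4 | ED0
choldy | choldy | 1:7 | cpholdy | choldy | f4r:8:5 | ED0
opydy | oydy | 1:8 | opydy | oydy | f17r:1:3 | ED0
sos | sos | 2:1 | kos | os | f1r:8:7 | ED1
ykaiin | yaiin | 2:2 | ykaiin | yaiin | f1r:3:4 | ED0
cheol | cheol | 2:3 | cheol | cheol | f1v:7:4 | ED0
choiin | choiin | 2:5 | choiin | choiin | f10v:2:5 | ED0
checthy | chechy | 2:6 | chocthy | chochy | f1r:8:4 | ED1
chodaiin | chodaiin | 2:9 | chodaiin | chodaiin | f1r:16:6 | ED0
fshodchy | shodchy | 5:5 | shokchy | shochy | f4v:6:2 | ED1
doiiin | doiiin | 6:1 | daiiin | daiiin | f1r:21:2 | ED1
dar | dar | 6:3 | dar | dar | f1r:23:4 | ED0
char | char | 6:7 | ckhar | char | f1r:2:2 | ED0
sheoy | sheoy | 7:5 | sheky | shey | f1r:3:2 | ED1
soaiin | soaiin | 7:8 | otaiin | oaiin | f1r:4:6 | ED1
shain | shain | 8:1 | shain | shain | f4r:3:7 | ED0
choraly | choraly | 8:2 | chodaly | chodaly | f3v:6:4 | ED1
keody | eody | 9:2 | teody | eody | f1r:10:3 | ED0
okoiin | ooiin | 11:1 | ooiin | ooiin | f1r:4:1 | ED0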

Same-page local derivations

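f20v token | stripped | loc | local parent | parent stripped | parent loc | method
qopy | qoy | 1:6 | okoy | ooy | 1:3 | ED1
cphy | chy | 1:9 | shy | shy | 1:4 | ED1
chol | chol | 2:4 | cheol | cheol | 2:3 | ED1
otol | ool | 2:7 | okoy | ooy | 1:3 | ED1
chol | chol | 2:8 | chol | chol | 2:4 | ED0
oty | oy | 2:10 | okoy | ooy | 1:3 | ED1
okchy | ochy | 3:1 | opchy | ochy | 1:5 | ED0
sho | sho | 3:2 | shy | shy | 1:4 | ED1
kchol | chol | 3:3 | chol | chol | 2:4 | ED0
shol | shol | 3:4 | chol | chol | 2:4 | ED1
chcthy | chchy | 3:5 | checthy | chechy | 2:6 | ED1
qoty | qoy | 3:6 | qopy | qoy | 1:6 | ED0
chy | chy | 3:7 | cphy | chy | 1:9 | ED0
tol | ol | 3:8 | otol | ool | 2:7 | ED1
shy | shy | 3:9 | shy | shy | 1:4 | ED0
qotchy | qochy | 3:10 | opchy | ochy | 1:5 | ED1
sho | sho | 4:1 | sho | sho | 3:2 | ED0
aiin | aiin | 4:3 | faiis | aiis | 1:1 | ED1
shol | shol | 4:4 | shol | shol | 3:4 | ED0
daiin | daiin | 4:5 | ykaiin | yaiin | 2:2 | ED1
tshol | shol | 5:1 | shol | shol | 3:4 | ED0
otor | oor | 5:2 | okoy | ooy | 1:3 | ED1
shol | shol | 5:3 | shol | shol | 3:4 | ED0
shos | shos | 5:4 | sos | sos | 2:1 | ED1
otchy | ochy | 5:6 | opchy | ochy | 1:5 | ED0
chcphy | chchy | 5:7 | chcthy | chchy | 3:5 | ED0
chockhy | chochy | 6:2 | checthy | chechy | 2:6 | ED1
cheockhy | cheochy | 6:4 | checthy | chechy | 2:6 | ED1
shos | shos | 6:5 | shos | shos | 5:4 | ED0
cheos | cheos | 6:6 | cheol | cheol | 2:3 | ED1
cthaiin | chaiin | 6:8 | choiin | choiin | 2:5 | ED1
shocthy | shochy | 7:1 | fshodchy | shodchy | 5:5 | ED1
sho | sho | 7:2 | sho | sho | 3:2 | ED0
cthy | chy | 7:3 | cphy | chy | 1:9 | ED0
daiin | daiin | 7:4 | daiin | daiin | 4:5 | ED0
tey | ey | 7:6 | oty | oy | 2:10 | ED1
sho | sho | 8:3 | sho | sho | 3:2 | ED0
chy | chy | 8:5 | cphy | chy | 1:9 | ED0
daiin | daiin | 8:6 | daiin | daiin | 4:5 | ED0
ykchy | ychy | 9:1 | opchy | ochy | 1:5 | ED1
cho | cho | 9:3 | cphy | chy | 1:9 | ED1
cthy | chy | 9:4 | cphy | chy | 1:9 | ED0
chol | chol | 9:5 | chol | chol | 2:4 | ED0
shd | shd | 9:6 | shy | shy | 1:4 | ED1
qoty | qoy | 9:7 | qopy | qoy | 1:6 | ED0
shokaiin | shoaiin | 10:1 | soaiin | soaiin | 7:8 | ED1
chocthy | chochy | 10:2 | chockhy | chochy | 6:2 | ED0
chol | chol | 10:3 | chol | chol | 2:4 | ED0
daiin | daiin | 10:4 | daiin | daiin | 4:5 | ED0
chy | chy | 10:5 | cphy | chy | 1:9 | ED0
chor | chor | 10:6 | chol | chol | 2:4 | ED1
ety | ey | 10:7 | tey | ey | 7:6 | ED0
chey | chey | 11:2 | cphy | chy | 1:9 | ED1
cphol | chol | 11:3 | chol | chol | 2:4 | ED0
chor | chor | 11:4 | chor | chor | 10:6 | ED0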

Summary
  • Core source assignments: 22
  • Same-page local derivations: 55
  • Same-page ED0/copy: 29
  • Same-page ED1: 26
f44r provenance test

Initially, this page reduces to 1 core sheet + 4 more sheets supplying a few extra words.
 
SHEET CLASSIFICATION
  q1s3: +15  [CORE]
  q1s2: +4  [SECONDARY]
  q4s1: +2  [RESIDUE]
  q6s2: +2  [RESIDUE]
  q2s4: +2  [RESIDUE]

RESIDUE TOKENS (NEWLY ADDED BY RESIDUE SHEETS)
  dshor | stripped dshor | f44r:6:1 -> dshor | stripped dshor | f25v:5:3 | q4s1 | ED0
  dchckhy | stripped dchchy | f44r:9:3 -> dchckhy | stripped dchchy | f25r:4:1 | q4s1 | ED0
  ypsholy | stripped ysholy | f44r:1:4 -> yshol | stripped yshol | f42r:8:1 | q6s2 | ED1
  kshotol | stripped shool | f44r:3:3 -> shotol | stripped shool | f42r:17:1 | q6s2 | ED0
  oracphy | stripped orachy | f44r:1:2 -> torchy | stripped orchy | f13r:8:3 | q2s4 | ED1
  oair | stripped oair | f44r:3:1 -> koair | stripped oair | f13v:1:1 | q2s4 | ED0

RESIDUE CORE-RECHECK
  RESOLVED ED2: oracphy | stripped orachy | f44r:1:2 -> octhy | stripped ochy | f3r:12:4 | q1s3
  RESOLVED ED2: ypsholy | stripped ysholy | f44r:1:4 -> shol | stripped shol | f3r:11:2 | q1s3
  RESOLVED ED1: oair | stripped oair | f44r:3:1 -> dair | stripped dair | f3r:19:5 | q1s3
  RESOLVED ED1: kshotol | stripped shool | f44r:3:3 -> shol | stripped shol | f3r:11:2 | q1s3
  RESOLVED ED1: dshor | stripped dshor | f44r:6:1 -> shor | stripped shor | f3r:15:2 | q1s3
  RESOLVED ED1: dchckhy | stripped dchchy | f44r:9:3 -> chckhy | stripped chchy | f6r:2:2 | q1s3

After the residue recheck, this page reduces down to 2 sheets with 0 ED2 or unexplained words.
  q1s3
  q1s2

Green = ED0 / exact copy after gallows stripping
Blue = ED1 after gallows stripping
Orange = ED2 after gallows stripping
Red = no prior match / match not in selected minimum set

Core tokens requiring prior source

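f44r token | stripped | f44r loc | parent | parent stripped | source | method
tshodpy | shody | 1:1 | shody | shody | f7r:7:6 | ED0
oracphy | orachy | 1:2 | torchy | orchy | f13r:8:3 | ED1
koees | oees | 1:3 | oees | oees | f6v:2:1 | ED0
ypsholy | ysholy | 1:4 | yshol | yshol | f42r:8:1 | ED1
shy | shy | 1:5 | shty | shy | f2v:1:9 | ED0
ydsh | ydsh | 2:1 | yksh | ysh | f2r:8:7 | ED1
dyeees | dyeees | 2:2 | deees | deees | f7v:4:8 | ED1
yty | yy | 2:3 | yky | yy | f2r:9:8 | ED0
okchey | ochey | 2:5 | okchey | ochey | f6v:8:3 | ED0
qykchey | qychey | 2:6 | qokchey | qochey | f3r:13:5 | ED1
dchy | dchy | 2:7 | dchy | dchy | f2r:6:6 | ED0
oair | oair | 3:1 | koair | oair | f13v:1:1 | ED0
ekokeey | eoeey | 3:2 | qokeey | qoeey | f3r:13:1 | ED1
kshotol | shool | 3:3 | shotol | shool | f42r:17:1 | ED0
otol | ool | 3:4 | okol | ool | f2r:12:1 | ED0
daiin | daiin | 3:5 | daiin | daiin | f2r:1:3 | ED0
ychor | ychor | 4:1 | ychor | ychor | f6v:8:1 | ED0
dam | dam | 4:5 | dam | dam | f3r:2:3 | ED0
cheeky | cheey | 5:4 | chkeey | cheey | f7v:3:2 | ED0
dshor | dshor | 6:1 | dshor | dshor | f25v:5:3 | ED0
qotchy | qochy | 6:5 | qokchy | qochy | f3v:13:3 | ED0
qotsh | qosh | 7:1 | osh | osh | f3v:12:1 | ED1
dchckhy | dchchy | 9:3 | dchckhy | dchchy | f25r:4:1 | ED0
qokchor | qochor | 10:1 | qokchor | qochor | f3r:12:2 | ED0
ytsho | ysho | 11:1 | ysho | ysho | f6r:14:1 | ED0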

Same-page local derivations

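f44r token | stripped | loc | local parent | parent stripped | parent loc | method
oky | oy | 2:4 | yty | yy | 2:3 | ED1
otota | ooa | 3:6 | otol | ool | 3:4 | ED1
ykchey | ychey | 4:2 | okchey | ochey | 2:5 | ED1
ykchy | ychy | 4:3 | dchy | dchy | 2:7 | ED1
chody | chody | 4:4 | tshodpy | shody | 1:1 | ED1
toy | oy | 5:1 | oky | oy | 2:4 | ED0
qoteey | qoeey | 5:2 | ekokeey | eoeey | 3:2 | ED1
chor | chor | 5:3 | ychor | ychor | 4:1 | ED1
sheey | sheey | 5:5 | cheeky | cheey | 5:4 | ED1
ytey | yey | 5:6 | yty | yy | 2:3 | ED1
daiin | daiin | 5:7 | daiin | daiin | 3:5 | ED0
ytchol | ychol | 6:2 | ychor | ychor | 4:1 | ED1
shy | shy | 6:3 | shy | shy | 1:5 | ED0
otchy | ochy | 6:6 | okchey | ochey | 2:5 | ED1
dar | dar | 6:7 | dam | dam | 4:5 | ED1
oyd | oyd | 7:2 | oky | oy | 2:4 | ED1
shol | shol | 7:3 | kshotol | shool | 3:3 | ED1
qotshy | qoshy | 7:4 | qotchy | qochy | 6:5 | ED1
oyty | oyy | 7:5 | yty | yy | 2:3 | ED1
chom | chom | 7:6 | chor | chor | 5:3 | ED1
pshy | shy | 8:1 | shy | shy | 1:5 | ED0
opchey | ochey | 8:2 | okchey | ochey | 2:5 | ED0
qopchy | qochy | 8:3 | qotchy | qochy | 6:5 | ED0
ofchey | ochey | 8:4 | okchey | ochey | 2:5 | ED0
shol | shol | 8:5 | shol | shol | 7:3 | ED0
ykchy | ychy | 8:6 | ykchy | ychy | 4:3 | ED0
otchol | ochol | 9:1 | ytchol | ychol | 6:2 | ED1
qoky | qoy | 9:4 | oky | oy | 2:4 | ED1
qotchy | qochy | 9:5 | qotchy | qochy | 6:5 | ED0
qokchy | qochy | 9:6 | qotchy | qochy | 6:5 | ED0
qoky | qoy | 9:7 | qoky | qoy | 9:4 | ED0
okchy | ochy | 10:2 | otchy | ochy | 6:6 | ED0
qoto | qoo | 10:3 | qoky | qoy | 9:4 | ED1
ykol | yol | 10:4 | otol | ool | 3:4 | ED1
choky | choy | 10:5 | chody | chody | 4:4 | ED1
choky | choy | 10:6 | choky | choy | 10:5 | ED0
chol | chol | 10:7 | chor | chor | 5:3 | ED1
dam | dam | 10:8 | dam | dam | 4:5 | ED0
qockhy | qochy | 11:2 | qotchy | qochy | 6:5 | ED0
okchody | ochody | 11:3 | chody | chody | 4:4 | ED1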

Summary
  • Core source assignments: 25
  • Same-page local derivations: 40
  • Same-page ED0/copy: 16
  • Same-page ED1: 24
And here's what happens when you try the same test on a Scribe 2 page.

SHEET CLASSIFICATION
  q5s2: +27  [CORE]
  q5s1: +3  [RESIDUE]
  q1s3: +2  [RESIDUE]

RESIDUE TOKENS (NEWLY ADDED BY RESIDUE SHEETS)
  okaldy | stripped oaldy | f43r:2:8 -> okaldy | stripped oaldy | f33r:2:4 | q5s1 | ED0
  pshdar | stripped shdar | f43r:10:1 -> tshdar | stripped shdar | f33r:1:1 | q5s1 | ED0
  otary | stripped oary | f43r:11:12 -> okary | stripped oary | f33v:8:12 | q5s1 | ED0
  otolol | stripped oolol | f43r:11:8 -> otolom | stripped oolom | f3r:18:7 | q1s3 | ED1
  shochol | stripped shochol | f43r:13:8 -> shocthol | stripped shochol | f6v:6:2 | q1s3 | ED0

RESIDUE CORE-RECHECK
  RESOLVED ED1: okaldy | stripped oaldy | f43r:2:8 -> okaly | stripped oaly | f34r:13:5 | q5s2
  RESOLVED ED1: pshdar | stripped shdar | f43r:10:1 -> chdar | stripped chdar | f34v:8:7 | q5s2
  NOT RESOLVED: otolol | stripped oolol | f43r:11:8
  RESOLVED ED1: otary | stripped oary | f43r:11:12 -> okaly | stripped oaly | f34r:13:5 | q5s2
  RESOLVED ED2: shochol | stripped shochol | f43r:13:8 -> shotchy | stripped shochy | f34r:8:1 | q5s2

Most words still reduced down to 2 sheets.
q5s2
q1s3

There is one word that can't be resolved by ED1 or ED2, "otolol".  And, there are 9 words that don't cleanly reduce down to a minimum sheet limit.  They do have parents and they can be found, but it's not as easy to reduce as Scribe 1.  Exactly what this is saying at the moment, I'm not sure.  If Scribe 2 used the same method as Scribe 1, their selection of source sheets was a good bit wider than Scribe 1.

Now, all these tests were run with the Takahashi transcription. If I instead use the ZL transcription for this page, and allow for more initial sheets to be found in the first run, it can be reduced down to 3 sheets:

q5s2
q5s1 -> added 
q1s3

with "qosheckhey" being the only word it couldn't resolve to having an ED1 or ED2 parent.

Regardless of which transcription is used to locate the source, you can still see that the "proposed" same-page copy/mutate process is on par with Scribe 1 pages.


f43r provenance test

Green = ED0 / exact copy after gallows stripping
Blue = ED1 after gallows stripping
Orange = ED2 after gallows stripping
Red = no prior match / match not in selected minimum set

Core tokens requiring prior source

f43r token | stripped | f43r loc | parent | parent stripped | source | method
tarodaiin | arodaiin | 1:1 | MATCH EXISTS BUT NOT IN SELECTED MINIMUM SET | NOT SELECTED
ytedy | yedy | 1:2 | ykedy | yedy | f33v:4:8 | ED0
chody | chody | 1:3 | tchody | chody | f6v:6:1 | ED0
ofchtar | ochar | 1:4 | MATCH EXISTS BUT NOT IN SELECTED MINIMUM SET | NOT SELECTED
chcphedy | chchedy | 1:5 | chcthedy | chchedy | f34r:14:11 | ED0
ypar | yar | 1:6 | ykar | yar | f33v:5:11 | ED0
shol | shol | 1:7 | shol | shol | f3r:11:2 | ED0
folor | olor | 1:8 | olor | olor | f39r:12:2 | ED0
aiin | aiin | 1:9 | aiin | aiin | f3r:20:6 | ED0
cphhy | chhy | 1:10 | MATCH EXISTS BUT NOT IN SELECTED MINIMUM SET | NOT SELECTED
oteol | oeol | 2:2 | oteol | oeol | f3r:16:1 | ED0
okaldy | oaldy | 2:8 | okaldy | oaldy | f33r:2:4 | ED0
daral | daral | 2:9 | MATCH EXISTS BUT NOT IN SELECTED MINIMUM SET | NOT SELECTED
otchdy | ochdy | 2:10 | otchdy | ochdy | f34r:10:4 | ED0
yty | yy | 3:1 | yty | yy | f3v:13:4 | ED0
she | she | 3:4 | shek | she | f34r:15:4 | ED0
qotydy | qoydy | 3:9 | qokody | qoody | f3r:13:4 | ED1
qotaly | qoaly | 4:7 | qopal | qoal | f3r:1:2 | ED1
shekody | sheody | 5:2 | sheody | sheody | f34r:2:2 | ED0
qotar | qoar | 5:4 | qoar | qoar | f6v:1:5 | ED0
choetchy | choechy | 5:7 | chocthy | chochy | f6r:4:5 | ED1
pshesy | shesy | 7:1 | shey | shey | f3r:11:5 | ED1
kshdy | shdy | 7:3 | shdy | shdy | f33v:4:2 | ED0
shocphhy | shochhy | 7:8 | shotchy | shochy | f34r:8:1 | ED1
dytydy | dyydy | 7:9 | dydy | dydy | f39r:6:7 | ED1
qoiiin | qoiiin | 8:5 | qokaiin | qoaiin | f39r:8:3 | ED1
sheeeky | sheeey | 8:6 | sheey | sheey | f33v:8:1 | ED1
old | old | 8:10 | MATCH EXISTS BUT NOT IN SELECTED MINIMUM SET | NOT SELECTED
dshedy | dshedy | 9:1 | MATCH EXISTS BUT NOT IN SELECTED MINIMUM SET | NOT SELECTED
odain | odain | 9:5 | MATCH EXISTS BUT NOT IN SELECTED MINIMUM SET | NOT SELECTED
pshdar | shdar | 10:1 | tshdar | shdar | f33r:1:1 | ED0
qockhhdy | qochhdy | 10:6 | qokchdy | qochdy | f34r:7:6 | ED1
otolol | oolol | 11:8 | otolom | oolom | f3r:18:7 | ED1
otary | oary | 11:12 | okary | oary | f3v:10:6 | ED0
qokol | qool | 12:3 | qokol | qool | f3r:5:1 | ED0
qosheckhhy | qoshechhy | 13:2 | NO PRIOR MATCH | NO MATCH
odeedy | odeedy | 13:3 | okeedy | oeedy | f33v:10:5 | ED1
qeokehy | qeoehy | 13:4 | qokeey | qoeey | f3r:13:1 | ED2
shodody | shodody | 13:7 | MATCH EXISTS BUT NOT IN SELECTED MINIMUM SET | NOT SELECTED
shochol | shochol | 13:8 | shocthol | shochol | f6v:6:2 | ED0
chckhhhy | chchhhy | 14:5 | chcfhhy | chchhy | f39r:8:7 | ED1

Same-page local derivations

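f43r token | stripped | loc | local parent | parent stripped | parent loc | method
dar | dar | 1:11 | ypar | yar | 1:6 | ED1
yteody | yeody | 2:1 | ytedy | yedy | 1:2 | ED1
ytedy | yedy | 2:3 | ytedy | yedy | 1:2 | ED0
chety | chey | 2:5 | cphhy | chhy | 1:10 | ED1
dar | dar | 2:6 | dar | dar | 1:11 | ED0
aiir | aiir | 2:7 | aiin | aiin | 1:9 | ED1
daiin | daiin | 2:11 | aiin | aiin | 1:9 | ED1
dal | dal | 2:12 | dar | dar | 1:11 | ED1
yty | yy | 3:2 | yty | yy | 3:1 | ED0
oty | oy | 3:3 | yty | yy | 3:1 | ED1
ody | ody | 3:5 | oty | oy | 3:3 | ED1
olor | olor | 3:6 | folor | olor | 1:8 | ED0
kaiin | aiin | 3:7 | aiin | aiin | 1:9 | ED0
chky | chy | 3:8 | cphhy | chhy | 1:10 | ED1
dar | dar | 3:10 | dar | dar | 1:11 | ED0
aiin | aiin | 3:11 | aiin | aiin | 1:9 | ED0
ykam | yam | 3:12 | ypar | yar | 1:6 | ED1
ytey | yey | 4:1 | ytedy | yedy | 1:2 | ED1
tedy | edy | 4:2 | ytedy | yedy | 1:2 | ED1
kar | ar | 4:3 | ypar | yar | 1:6 | ED1
cholty | choly | 4:6 | chody | chody | 1:3 | ED1
chedy | chedy | 4:8 | chody | chody | 1:3 | ED1
oty | oy | 4:9 | oty | oy | 3:3 | ED0
otam | oam | 4:10 | ykam | yam | 3:12 | ED1
ykor | yor | 5:1 | ypar | yar | 1:6 | ED1
qotody | qoody | 5:3 | qotydy | qoydy | 3:9 | ED1
okedy | oedy | 5:5 | ytedy | yedy | 1:2 | ED1
dar | dar | 5:6 | dar | dar | 1:11 | ED0
dam | dam | 5:8 | dar | dar | 1:11 | ED1
ytam | yam | 5:9 | ykam | yam | 3:12 | ED0
kchedy | chedy | 6:1 | chedy | chedy | 4:8 | ED0
chedy | chedy | 6:2 | chedy | chedy | 4:8 | ED0
daly | daly | 6:3 | dal | dal | 2:12 | ED1
cheody | cheody | 6:4 | chody | chody | 1:3 | ED1
otey | oey | 7:2 | oty | oy | 3:3 | ED1
opchdy | ochdy | 7:4 | otchdy | ochdy | 2:10 | ED0
kedar | edar | 7:5 | dar | dar | 1:11 | ED1
okedy | oedy | 7:6 | okedy | oedy | 5:5 | ED0
chdy | chdy | 7:7 | chody | chody | 1:3 | ED1
pchdy | chdy | 7:10 | chdy | chdy | 7:7 | ED0
kedy | edy | 7:11 | tedy | edy | 4:2 | ED0
dam | dam | 7:12 | dam | dam | 5:8 | ED0
ytchedy | ychedy | 8:1 | chedy | chedy | 4:8 | ED1
chedy | chedy | 8:2 | chedy | chedy | 4:8 | ED0
cheody | cheody | 8:3 | cheody | cheody | 6:4 | ED0
shy | shy | 8:4 | she | she | 3:4 | ED1
chedy | chedy | 8:7 | chedy | chedy | 4:8 | ED0
shy | shy | 8:8 | shy | shy | 8:4 | ED0
otaiin | oaiin | 8:9 | aiin | aiin | 1:9 | ED1
qotedy | qoedy | 9:2 | qotydy | qoydy | 3:9 | ED1
dor | dor | 9:3 | dar | dar | 1:11 | ED1
cheey | cheey | 9:4 | chety | chey | 2:5 | ED1
shed | shed | 10:2 | she | she | 3:4 | ED1
ody | ody | 10:3 | ody | ody | 3:5 | ED0
qotedy | qoedy | 10:4 | qotedy | qoedy | 9:2 | ED0
yfchdy | ychdy | 10:5 | otchdy | ochdy | 2:10 | ED1
opchdy | ochdy | 10:7 | otchdy | ochdy | 2:10 | ED0
daiin | daiin | 10:8 | daiin | daiin | 2:11 | ED0
qokedy | qoedy | 10:9 | qotedy | qoedy | 9:2 | ED0
qotar | qoar | 10:10 | qotar | qoar | 5:4 | ED0
ytchedy | ychedy | 11:1 | ytchedy | ychedy | 8:1 | ED0
shol | shol | 11:3 | shol | shol | 1:7 | ED0
toldy | oldy | 11:4 | okaldy | oaldy | 2:8 | ED1
shody | shody | 11:5 | chody | chody | 1:3 | ED1
otchdy | ochdy | 11:6 | otchdy | ochdy | 2:10 | ED0
shdy | shdy | 11:7 | kshdy | shdy | 7:3 | ED0
shd | shd | 11:9 | she | she | 3:4 | ED1
olky | oly | 11:10 | oty | oy | 3:3 | ED1
ytol | yol | 11:11 | ykor | yor | 5:1 | ED1
cheky | chey | 11:13 | chety | chey | 2:5 | ED0
dor | dor | 12:1 | dor | dor | 9:3 | ED0
shol | shol | 12:2 | shol | shol | 1:7 | ED0
shedy | shedy | 12:4 | chedy | chedy | 4:8 | ED1
qotedy | qoedy | 12:5 | qotedy | qoedy | 9:2 | ED0
qokehdy | qoehdy | 12:6 | qotedy | qoedy | 9:2 | ED1
qokody | qoody | 12:7 | qotody | qoody | 5:3 | ED0
okehdy | oehdy | 12:8 | otchdy | ochdy | 2:10 | ED1
otedy | oedy | 12:9 | okedy | oedy | 5:5 | ED0
shedy | shedy | 12:10 | shedy | shedy | 12:4 | ED0
oty | oy | 12:11 | oty | oy | 3:3 | ED0
yty | yy | 12:12 | yty | yy | 3:1 | ED0
saiin | saiin | 12:14 | aiin | aiin | 1:9 | ED1
tshed | shed | 13:1 | shed | shed | 10:2 | ED0
qotedy | qoedy | 13:5 | qotedy | qoedy | 9:2 | ED0
daiin | daiin | 13:6 | daiin | daiin | 2:11 | ED0
chckhy | chchy | 13:9 | cphhy | chhy | 1:10 | ED1
ykedy | yedy | 13:10 | ytedy | yedy | 1:2 | ED0
ykeody | yeody | 14:1 | yteody | yeody | 2:1 | ED0
checkhy | chechy | 14:2 | choetchy | choechy | 5:7 | ED1
chotehy | choehy | 14:3 | choetchy | choechy | 5:7 | ED1
odain | odain | 14:4 | odain | odain | 9:5 | ED0
aiin | aiin | 14:6 | aiin | aiin | 1:9 | ED0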
I don't think this shows anything at all. Voynichese is quite regular: the number of common characters is small and their sequences are quite rigid. Most of the vocabulary has many ED1 words. This is a property of the word structure. Let's ask a different question: how many words on f1v have only one ED1 source on f1r, so they can be traced to a single source? Also, I don't understand the rationale for stripping the gallows; this looks like a cheap trick.
(oshfdk is a little more concise, but I wrote this before seeing their reply, and it gets at a few more things here)

(31-05-2026, 12:57 PM)Dunsel Wrote: Torsten Timm has pointed out, it's likely a very human method which is not going to be duplicated in code easily.
One might even say it's not going to replicate in code easily!

No one is arguing that there aren't clusters of words in short edit distance, nor that a process which captures that fact can't [edited from "can"] model the text's statistics. It is interesting, and serves as a broad-ranging rebuttal to a number of statistical analyses that had been purported to suggest meaningfulness, that those edit networks plus those constraints have enough information to reconstitute those features.

This is a different claim:
(31-05-2026, 12:57 PM)Dunsel Wrote: Interpretation: This table shows that with the exception of 1 token, all other tokens are a product of copy/mutate from f1r.
When you claim the tokens in f1v are the product of such and such a process, you are no longer claiming to have modeled them. You are claiming to know as a matter of historical fact what the scribe did. This is an extremely strong claim for the banal reason that it's hard to prove what someone did 600 years ago, but there are several other factors that make inferring from the model hard.

These models are in a sense "closely held" to the source text. At risk of broadsiding Torsten, copy-mutate was selected specifically because it reproduces the short edit distances. To the limits of his and his coauthor's argument, that is fine, but it is hardly an emergent property---and without close reading his paper for it, I don't recall him asserting it is. Likewise, the fact that both your models force words through Voynich-like networks of words at rates derived from summary statistics gleaned from the manuscript means that they are likely mutating through spaces of words that have similar properties. The reason why Torsten's paper has force, and why it has the conclusions it does, is that it shows that information is sufficient to produce a Voynich-like text, and so an auto-citation process that has substantially similar properties would be expected to result in a statistically Voynich-like text. Timm and Schinner do not claim the text in the Voynich was produced that way, merely that it is a serious possibility; in fact, they rule out a very literal interpretation of their model! It is entirely possible that their and your models depend enough on patterns that arose from an underlying process interacting with meaningful text that they are reproducing those patterns without reproducing the method. Ruling this out is very hard.

A genuine path forward for these models would be to show that they depend less on the Voynich's summary statistics. Your analysis of the gallows letters, which I do not believe is shared by Timm and Schinner's model, is a case in point. If it could be shown that line start gallows letters were appearing more often at line start because of some simple rule that did not immediately imply the distribution, e.g. that it was emergent, it would lend credence to the idea that these features were dependent on simple rules undergirding copy-mutate rather than the distributions arising in the Voynich for unproved reasons. Your argument (section 4.2 of your paper) is explicitly that these features were inserted based on observations of the manuscript, so you have defined the mutate process to have these properties, and are actually assuming the consequent when you claim to have explained it.

In fairness to everyone, it may be the case that there is no simple rule underlying the gallows distribution. It could be that the scribes liked them at line start because they looked like capital letters at the start of lines in humanist manuscripts. But this is what I'm driving at. If that's all there is to it, it may be that the best we can do is say that their process had a bias towards gallows in line-start words, and we may not be able to formally separate premise (we observe the bias) from conclusion (it is a product of an arbitrary choice).

By the by, if you are in fact "reproducing" the Voynich, failure to incorporate Currier's curve-line system observations and similar analyses seems like fair game to me. It also strikes me as the kind of "observable orthographic structure" you say the model addresses in 7.6 of your paper. This is largely an aside to the main point here, which is that I don't think you're proving your interpretation, but I'm not clear why statistics about letter bases are not part of the orthographic structures in the Voynich.
(31-05-2026, 02:40 PM)oshfdk Wrote: I don't think this shows anything at all. Voynichese is quite regular: the number of common characters is small and their sequences are quite rigid. Most of the vocabulary has many ED1 words. This is a property of the word structure. Let's ask a different question: how many words on f1v have only one ED1 source on f1r, so they can be traced to a single source? Also, I don't understand the rationale for stripping the gallows; this looks like a cheap trick.

I think you're focusing on a different question than the one I'm testing.

If the claim were simply that many Voynich words have ED1 neighbors, I would agree that this is not very interesting. Given the character inventory and regular word structure, ED1 relationships are expected. The purpose of the provenance test is not to show that an ED1 source exists somewhere in the manuscript. The results show that after same-page copy/mutate derivations are removed, the remaining source words repeatedly collapse into a very small number of sheets as the source of those pages. I can easily imagine a scribe with a completed sheet in front of them, using it as the source for the page they're working on. That is what the evidence is pointing to.

As for gallows stripping, that was done because gallows are overwhelmingly productive in prefix positions. If gallows behave as a frequent prefix operation rather than as ordinary internal characters, then treating chol, kchol, tchol, etc. as completely unrelated forms obscures rather than reveals the family structure. Whether that assumption is justified is certainly open to discussion, but it was not introduced simply to manufacture matches. Thus far, I do not believe many researchers have seriously considered the possibility that gallows may function primarily as a decorative addition rather than as a core lexical element. I am testing whether they might. If removing the gallows destroys the structure, then the hypothesis fails. If removing them reveals stable word families, source relationships, and copy/mutate chains that remain consistent across pages, then that is evidence that the gallows may not be carrying the same kind of information as the rest of the word. So far, stripping gallows has not weakened the provenance results. If anything, it has made the underlying relationships easier to see. So, perhaps the cheap trick is not mine to claim. Perhaps gallows are a cheap trick of the scribes' invention.
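
To make the "family" idea concrete, here is a minimal sketch that groups tokens by their gallows-stripped form. The gallows set `ktpf` and the function names are assumptions for illustration, and the token list is a hand-picked sample, not analyzer output.

```python
from collections import defaultdict

GALLOWS = set("ktpf")  # assumed EVA gallows characters


def strip_gallows(token: str) -> str:
    """Remove every gallows character, so 'kchol' and 'cthol' both become 'chol'."""
    return "".join(ch for ch in token if ch not in GALLOWS)


def families(tokens):
    """Group distinct tokens under their stripped form."""
    groups = defaultdict(set)
    for tok in tokens:
        groups[strip_gallows(tok)].add(tok)
    return {core: sorted(members) for core, members in groups.items()}


sample = ["chol", "kchol", "tchol", "cthol", "dar"]
print(families(sample))
```

Without stripping, these would count as five unrelated word types; with it, they fall into two families.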

Your question about how many words on f1v have only a single ED1 source is a reasonable one, but it is a different test from the provenance analysis shown above.
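
For what it's worth, that single-source question could be sketched along these lines. Everything here is an assumption made for illustration: the function names, counting candidates at edit distance 0 or 1, the optional `strip` switch, and the made-up word lists, which are not the real contents of any folio.

```python
GALLOWS = set("ktpf")  # assumed EVA gallows characters


def strip_gallows(token: str) -> str:
    return "".join(ch for ch in token if ch not in GALLOWS)


def within_one(a: str, b: str) -> bool:
    """True if a and b are identical or differ by one insert, delete, or substitute."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]          # one extra character in b at position i


def source_counts(page_tokens, source_tokens, strip=False):
    """For each page token, count distinct source word types within distance 1."""
    norm = strip_gallows if strip else (lambda t: t)
    sources = {norm(s) for s in source_tokens}
    return {tok: sum(within_one(norm(tok), s) for s in sources) for tok in page_tokens}


# Made-up sample input, only to show the shape of the result.
counts = source_counts(["chol", "dal", "yteeay"], ["chor", "shol", "cheol", "dar"])
single = [tok for tok, n in counts.items() if n == 1]
print(counts)
print(single)
```

Tokens with a count of exactly 1 are the ones that can be traced to a single candidate source; the higher the typical count, the less any one assigned parent means.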