The Voynich Ninja
A One-Page Ledger Method for Generating Voynich-Like Text - Printable Version


RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 17-05-2026

(17-05-2026, 08:44 PM)Dunsel Wrote: I think you're missing the point. This is not about generating perfect Voynichese. This is about demonstrating a possible method for its production.

You are right, I do miss the point of all copy and modify approaches, because to me they look like imitation with extra steps. None of the copy and modify proposals that I know of gives a specific set of rules that I can just follow to generate Voynichese. The rules just provide a few options of the "you can do this, you can do that" kind, and then you still have to use extra steps to ensure the result is good enough, and even then it's obviously not good enough to pass as real Voynichese. I think I can generate some plausible Voynichese off the top of my head just by following CLS and adding some variation.

Let's try: 

Shedy.chol.or.daiin.olkedy.chor.odam
dchedy.ytoair.chey.dar,choldy.saiin.okain
qoty.odaiin.Shor.cheody.opchey.ar.okchy
okeedy.sair.lol.tcheody.or.Shody.otaim

No algorithm, no copy and modify, just attempting to imitate Voynichese and more or less following known visual patterns. I don't think this will pass the statistical test, but given that copy and modify creates a lot of weird words, it won't pass some tests either.

So, my main question about copy and modify methods - why bother? What's the advantage of writing down some specific rules and then adding more and more complexity, when just asking a scribe to imitate existing script and making sure that curves mostly start and sticks mostly follow (and don't mingle) produces the same result?

Edit: BTW, this made me think a bit. I know that some artists are very good at identifying and extending visual patterns. What would happen if an artist were given a few lines of Voynichese as a sample and then asked to continue the pattern in a similar visual style? What would the result look like? Would it be worse or better than other proposals for generating meaningless Voynichese?


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 17-05-2026


You are ABSOLUTELY correct. I could sit down and do a copy and modify just as easily as you did. We've seen enough Voynich that we could do that. And, just like you, I could make some pretty realistic-looking Voynich. But, just as you said, we likely wouldn't pass statistical checks. Our eyes would tell us, "this is a good word," and "this is an ugly word." But, without sets of rules, you and I would drift. We would create gibberish. Dr. Bowern did a test and had students intentionally try to create gibberish, and they did pretty well at it. But 30,000+ words of gibberish? What we would end up with would look like word soup, UNLESS... we put down some rules. Don't repeat vowels because daiiiiiin looks ugly. Don't repeat consonants because chknt does not look like a word. Humans need rules, otherwise they drift into doing... whatever. I don't feel like turning my blinker on before I make this turn. I don't feel like washing the dishes. I don't feel like writing those long words. Just short words from now on.

For the most part, the last page of the Herbal looks pretty much like the first page. You don't get that from just writing down things that "look like a word." Whoever created the Voynich, and for whatever reason, had rules. Grammatical rules, mnemonic rules, lexical rules, cryptographic rules, or... copy and modify rules. If there were no rules, the Voynich would have been classified long ago as glossolalia, gibberish, word soup and wouldn't be in the Yale library with this forum still discussing it. A human with just rules in their head could not sit down and write 30,000 words and maintain a structure throughout. Our brains wander and we'd create all kinds of crap instead of words. But, because there are rules and structure that defy logic, here you and I are.

Now let's assume for a moment... I AM ASSUMING... that this book was created by the method I'm demonstrating. And let's ASSUME it was created to gain the favor of some patron. A book some magus could hand to him and say, I understand this book, I can heal you with herbs, tell your future and even give advice on naked women in pools of green goo. A genuine hoax with a goal: patron support (Enochian. It worked for Dee and Kelly and King Rudolph fell for it.). Now let's assume again that the patron had cryptographers, and there were a fair number of them around in the 15th century. What would happen if the patron got his hands on that book and handed it to a cryptographer? They'd immediately start trying to dissect it. If it were gibberish, it would have been discovered. No rules, no patterns. A substitution cypher? Discovered, they have patterns. A language? They have patterns. So, if you wanted to create a book that would amaze your patron but not be readable... you'd want to create the Voynich. Something with enough rules that it looks real, but defies every attempt to decode it. Something with strange and magical letters that almost look readable. Something with no punctuation! Sentences imply structure with nouns, verbs, adjectives which could be figured out... No. Can't have punctuation. Something with... plants that nobody can point to and identify, because then they could look for the plant name. Zodiac-looking symbols but no constellations, which were well known. If there were constellations, you could point to a star and say, I know its name. But with no constellations, you can put whatever word you want beside that star. And just to sex it up a bit, some naked ladies holding phallic devices. And then, you would have the perfect book to pull off your hoax.

And here's another thought along that line. No mistakes? If you're creating a hoax like the one I described, mistakes are irrelevant. Wrote the wrong letter? So what, now we have a hapax token. Keep writing. Had too much mead and made a weirdo or a splat? Woo hoo... decoration... Keep writing. Just enough rules to make it look like a language but not enough rules that someone could 'decode' it. And, if it means nothing anyway and the patron is amazed, mission accomplished. It can mean whatever you want it to mean.

So the Voynich does have rules, it's not the gibberish you and I would create. And yes, it is imitation with an extra step. Steps on what to copy, steps on how to mutate, and steps on how to make the book look real without the drift of gibberish. And, with what I'm suggesting, those rules would be on one piece of paper and in the scribe's head.

I'm pretty certain there is an explanation for the Voynich.  Otherwise, it wouldn't exist.   And I will bet you money that when that discovery is made, it'll have rules.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 17-05-2026

(17-05-2026, 09:55 PM)oshfdk Wrote: Edit: BTW, this made me think a bit. I know that some artists are very good at identifying and extending visual patterns. What would happen if an artist were given a few lines of Voynichese as a sample and then asked to continue the pattern in a similar visual style? What would the result look like? Would it be worse or better than other proposals for generating meaningless Voynichese?

Well, that brings two definite thoughts to mind.

1. Artists have methods.  A set of mental rules they use to create their own unique works of art.  And sometimes, they had written rules.  Look at Da Vinci and Vitruvian man. A study on the human shape using the golden ratio (debated) as a rule for proportions.

2. Ohh... I'm pretty damn sure the 'artist' who created the Voynich illustrations was bad at visual patterns. Really bad.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Torsten - 17-05-2026

(17-05-2026, 09:55 PM)oshfdk Wrote: What would happen if an artist were given a few lines of Voynichese as a sample and then asked to continue the pattern in a similar visual style? What would the result look like? Would it be worse or better than other proposals for generating meaningless Voynichese?

That's exactly what self-citation is — a scribe looking at existing text and continuing the pattern in a similar visual style. The "artist extending visual patterns" IS the scribe copying and modifying visible words.

You produced plausible-looking Voynichese by imitating known patterns. No cipher, no language, no encoding — just visual imitation. That's self-citation performed by you from memory. The medieval scribe did the same thing from visible source text on the desk in front of him. The difference is that you work from remembered patterns while the scribe worked from visible text — which is why the scribe's output shows stronger spatial clustering and source-word relationships than memory-based imitation would.

The question is what happens if you write more than a few words. [link hidden] (2022) tested exactly this. They asked untrained volunteers to produce language-like text without semantic meaning, and the output converged on statistical properties similar to the VMS. The structure emerged naturally, without rules, without algorithms, without understanding. Just humans doing what humans do: copying and modifying what they see.

Bowern and Lindemann wrote: "We tested this point in an undergraduate class and found that beyond about 100 words, the task of writing language-like non-language is very difficult. It is too easy to make local repetitions [...]" [link hidden, p. 289]. "This is an important point, because it clarifies that any scribe creating language-mimicking gibberish will sooner or later replace the tedious task of inventing more and more words by the much easier reduplication of existing text (and stick with this strategy)." [link hidden] It might be easy to write a few words of invented text, but writing a whole book of over 37,000 words inevitably drives the scribe toward copying and modifying what is already visible on the page, simply because copying is more efficient than inventing.

(17-05-2026, 09:55 PM)oshfdk Wrote: So, my main question about copy and modify methods - why bother? What's the advantage of writing down some specific rules and then adding more and more complexity, when just asking a scribe to imitate existing script and making sure that curves mostly start and sticks mostly follow (and don't mingle) produces the same result?

Because formalizing the process lets you test it. You can't test "a scribe imitating visual patterns" — it's too vague. You can test "a scribe copying visible words and modifying them by substituting similar glyphs" — it makes specific, falsifiable predictions about word networks, spatial clustering, frequency-connectivity correlation, and positional effects.

That is the purpose of the Timm & Schinner 2020 paper: "After all, the original VMS was not created by a computer program; the scribe had complete freedom to implement random personal aesthetic preferences, spontaneous impulses, or even idiosyncrasies. The scope of this work is not the 'elemental deconstruction' of the VMS to an exact (and complete) set of rules. We rather demonstrate the feasibility to algorithmically create a text as rich and complex as the VMS, using the strikingly simple self-citation method. As we will show in the following section, this generated text still is able to reproduce all the intriguing statistical key properties of the original VMS" [link hidden, p. 11].

The rules themselves are simple: copy a word, 1) replace one or more glyphs by similar ones, 2) add or remove a prefix, 3) combine two source words to create a new word [Timm & Schinner 2020, p. 9]. The complexity comes from the human applying them. A human mind brings individual preferences formed during years of experience. Because of the scribe's concept of language he preferred words of a certain length. Because of his concept of aesthetics he added paragraph-initial gallows and preferred certain glyph combinations while avoiding others. Because of the limitations of the writing material he had to shorten words at the end of lines. Because of the limitations of the copying process all words are more or less connected to each other. These preferences don't need to be formalized as rules; they emerge naturally from a human producing text. That is why any algorithm approximating this process will always be imperfect: it captures the mechanism but not the full human variability. The imperfection of the simulation doesn't mean the mechanism is wrong. It means the model is simpler than the scribe who wrote the Voynich text: "The VMS was created by a human being with all freedom to spontaneously make and break (intuitive or explicit) own rules. After all, from the viewpoint of complexity theory, most likely the minimal representation of the VMS is the manuscript itself" (Timm & Schinner 2024, p. 319).
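
For readers who want to see the three rules in motion, here is a minimal Python sketch. It is only an illustration of copy-and-modify in general, not the Timm & Schinner implementation: the glyph pairs in SIMILAR, the PREFIXES list, the "window" of recently visible words and the way words are halved in combine() are all assumptions made up for this example.

```python
import random

# Hypothetical pairs of "similar" glyphs (EVA letters); chosen for illustration only.
SIMILAR = [("o", "a"), ("k", "t"), ("p", "f"), ("r", "l")]
# Hypothetical prefixes; not a list taken from the paper.
PREFIXES = ["qo", "o", "d"]


def substitute(word, rng):
    """Rule 1: replace one glyph by a similar one (first occurrence only)."""
    options = []
    for a, b in SIMILAR:
        if a in word:
            options.append((a, b))
        if b in word:
            options.append((b, a))
    if not options:
        return word  # nothing to swap; the word is copied unchanged
    old, new = rng.choice(options)
    return word.replace(old, new, 1)


def toggle_prefix(word, rng):
    """Rule 2: remove a known prefix if present, otherwise add one."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix) + 1:
            return word[len(prefix):]
    return rng.choice(PREFIXES) + word


def combine(first, second):
    """Rule 3: join the front half of one word to the back half of another."""
    return first[: (len(first) + 1) // 2] + second[len(second) // 2:]


def generate(seed_words, length, window=8, seed=0):
    """Extend a non-empty seed_words list to `length` words by copy-and-modify."""
    rng = random.Random(seed)
    text = list(seed_words)
    while len(text) < length:
        recent = text[-window:]  # words still "visible" to the scribe (assumption)
        source = rng.choice(recent)
        rule = rng.choice(("substitute", "prefix", "combine"))
        if rule == "substitute":
            word = substitute(source, rng)
        elif rule == "prefix":
            word = toggle_prefix(source, rng)
        else:
            word = combine(source, rng.choice(recent))
        text.append(word)
    return text


if __name__ == "__main__":
    # Seed: the first sample line from earlier in the thread, lowercased.
    line = ["shedy", "chol", "or", "daiin", "olkedy", "chor", "odam"]
    print(".".join(generate(line, 20)))
```

A generator this crude will happily produce exact repeats and odd-looking words, which is the gap the "human preferences" described above would have to fill.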


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 18-05-2026

(17-05-2026, 11:47 PM)Torsten Wrote: The rules themselves are simple: copy a word, 1) replace one or more glyphs by similar ones, 2) add or remove a prefix, 3) combine two source words to create a new word [Timm & Schinner 2020, p. 9]. The complexity comes from the human applying them. A human mind brings individual preferences formed during years of experience. Because of the scribe's concept of language he preferred words of a certain length. Because of his concept of aesthetics he added paragraph-initial gallows and preferred certain glyph combinations while avoiding others. Because of the limitations of the writing material he had to shorten words at the end of lines. Because of the limitations of the copying process all words are more or less connected to each other. These preferences don't need to be formalized as rules; they emerge naturally from a human producing text. That is why any algorithm approximating this process will always be imperfect: it captures the mechanism but not the full human variability.

And that is pretty much what I found as well, including the difficulties of modelling the imperfections of human variability. Though my modelling is purely copy, edit distance 1, and rarely 2. I haven't tried combining words yet.

Since you are the authority on this, perhaps you have some thoughts on f1r? I've been using it as the seed, but when I examine it, it already appears to be well into the production system, almost as if there were earlier pages preceding it. My suspicion is that the method itself was already mature before the Voynich was started, and that [link hidden] may simply represent an entry point into an already established process.

Also, I should probably say this directly since you are here: it's genuinely an honor to have you look over my work. I was aware of your theories before starting mine, though not your exact methodology, so I hope none of this comes across as me trying to appropriate your ideas. The overlap honestly came from independently chasing the same structural behavior in the manuscript, and your work deserves enormous credit for identifying that local repetition long before I arrived.

Whenever I come to these forums with my work it always feels as if I too am standing on the shoulders of giants.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Torsten - 18-05-2026

(17-05-2026, 10:56 PM)Dunsel Wrote: And here's another thought along that line. No mistakes? If you're creating a hoax like the one I described, mistakes are irrelevant. Wrote the wrong letter? So what, now we have a hapax token.

In self-citation, copying words and modifying them IS the text generation method. Every word is a "mistake" in the sense that it's a modification of a source word. The hapax legomenon isn't a mistake; it's a moment where the modification produced a form that happened not to be repeated. The word "daiin" with 836 instances isn't more "correct" than the hapax "[link hidden]" on [link hidden]: both were produced by the same process. One was used as a source for further copying, the other wasn't.

The scribe's real challenge is managing repetition. If you copy with too few modifications, you produce text like "qokeedy.qokeedy.qokeedy.qotey.qokeey.qokeey.otedy" on [link hidden], which doesn't look like language. So you modify: substitute a glyph, add or remove a prefix, combine elements. The "mistakes" create the vocabulary. The exact repeats are what the scribe needs to avoid.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 18-05-2026


That's very close to what I found in the generator experiments. One of the major problems was preventing obvious visible repetition chains, so the generator ended up developing a kind of “don’t look stupid” local suppression behavior where recent words and near-clones are discouraged even though the system is fundamentally copy/mutate driven.

It never intentionally generates “mistakes.” But because the ledger itself is built from the Voynich vocabulary, including rare and malformed forms, the system naturally reproduces the same kinds of anomalies and hapaxes simply by reusing the same constrained ecology.

Right now, I'm preloading the validation ledger with the Scribe 1 Herbal structure. However, I have also experimented with loading it only with the [link hidden] structure and then having the 'scribe' expand the ledger whenever they add a new midfix, suffix, etc. I haven't gotten that working perfectly, but I think it is a very viable method. In that case they essentially created the validity ledger as they created the pages. Or, as I suggested in a previous reply, if f1r was not the first page in this production, then the ledger was possibly built from preexisting work.
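
To make the "don't look stupid" suppression concrete, here is a small Python sketch of one way it could work. It is not the code of the generator discussed in this thread: the function names, the choice to compare only against the immediately preceding word, and the near_penalty value are all hypothetical. The idea is simply that exact repeats of recent words get weight zero and near-clones of the last word get a reduced weight before the next word is drawn.

```python
import random


def within_one_edit(a, b):
    """True if a and b are equal or differ by one insertion, deletion or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the shared beginning
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # at most one substituted glyph
    return a[i:] == b[i + 1:]  # b has exactly one extra glyph


def pick_next(candidates, recent, rng, near_penalty=0.2):
    """Draw the next word, suppressing repeats and discouraging near-clones."""
    weights = []
    for word in candidates:
        if word in recent:
            weights.append(0.0)  # exact repeat of a recent word: suppressed
        elif recent and within_one_edit(word, recent[-1]):
            weights.append(near_penalty)  # near-clone of the previous word: discouraged
        else:
            weights.append(1.0)
    if not any(weights):
        return rng.choice(candidates)  # everything is suppressed; fall back to any candidate
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With recent = ["qokeedy"], the candidate "qokeedy" can only be chosen through the fallback, and "qokeey" is five times less likely than an unrelated word under the default penalty.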


RE: A One-Page Ledger Method for Generating Voynich-Like Text - dashstofsk - 18-05-2026

I agree with much of what you and Torsten Timm have written.

However, the flow of the writing suggests that the writer did not pause after each word to think of what ought to come next. So I am still inclined to believe that the method is simpler, and doesn't involve looking back at previous words.

I like to suppose that at each sitting the writer approached the task with a fresh mindset, and that this mindset persisted for the remainder of the sitting. So, unintentionally, he might that day have written more "iin" than "in", had a liking for a particular suffix, written more gallows words than usual, used a lot more "s" words or "a" words or "r" words, more "qo" words, or used more of a particular word. Unintentionally, the writing was different at each sitting.

At least in the big paragraph text of quires 13 and 20 I have found that the distribution of these and many such parameters is not uniform. They are often many standard deviations away from what would be expected if the words, prefixes, suffixes, characters etc. were distributed randomly. I have tried to say something of this in a number of posts. See

[link hidden]
[link hidden]
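
As a rough illustration of what "many standard deviations away" means here, the following Python sketch scores each page against a simple binomial null model in which every word has the same chance of showing a feature (for example, ending in "iin"). This is an assumed, simplified test, not necessarily the one used in the linked posts; the function names and the toy data in the usage note are invented.

```python
import math


def binomial_z(hits, n, p):
    """z-score of `hits` out of `n` words when each word has probability p."""
    if n == 0 or p <= 0.0 or p >= 1.0:
        raise ValueError("need n > 0 and 0 < p < 1")
    expected = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    return (hits - expected) / sd


def feature_z_by_page(pages, has_feature):
    """Map each page to its z-score, using the pooled rate over all pages as p.

    pages: dict of page name -> list of words
    has_feature: function word -> bool (e.g. lambda w: w.endswith("iin"))
    """
    all_words = [w for words in pages.values() for w in words]
    p = sum(1 for w in all_words if has_feature(w)) / len(all_words)
    scores = {}
    for page, words in pages.items():
        hits = sum(1 for w in words if has_feature(w))
        scores[page] = binomial_z(hits, len(words), p)
    return scores
```

For example, a 100-word page with 30 hits against a pooled rate of 0.2 gives binomial_z(30, 100, 0.2) = 2.5. Pages whose scores sit far from zero in either direction are the ones where the feature is not spread evenly.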


Also I do believe that the writing was affected by psychological factors. Just as in the gibberish writing trials, where participants fell to using "word repetition or the reuse of prefixes and suffixes", so it is in the VMS. They also had "a tendency to write a series of long words, realize they had not written any short words recently, and then self-correct by switching to short words". This is also a feature of the VMS. See

[link hidden]


And for much the same reason I believe this can explain the fact that the number of occurrences where a gallows word comes immediately after another gallows word is higher than expected (at least in quires 13 and 20).

The writer of the VMS was being psychologically led. It is much easier on the mind not to have to think of new words but to reuse with modification something that you have written lately. It was all done 'on the fly', just as in the trials.

I believe it is also possible sometimes, by looking at a range of parameters, to identify where pages on opposite sides of a sheet were written at the same sitting. See

[link hidden]


As for the construction of gallows words, I have tried to show that the majority of them could be constructed from a list of prefixes of varying frequency together with a list of suffixes, and that the choice of suffix is independent of the prefix. At each sitting the writer might have had a preference for some or other prefix or suffix, giving a semblance of local modification.

[link hidden]


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Torsten - 18-05-2026

(18-05-2026, 11:03 AM)dashstofsk Wrote: However, the flow of the writing suggests that the writer did not pause after each word to think of what ought to come next. So I am still inclined to believe that the method is simpler, and doesn't involve looking back at previous words.

If you already have an idea for the next word, nothing stops you from just writing it down. For instance, you might have two or three modifications of the current word in mind and write them in sequence without pausing. But when you need a fresh idea, the most efficient method is to look at a word already written. This is the reason D'Imperio (1978) wrote that when faced with producing meaningless "dummy text" to conceal encrypted messages, scribes "would naturally tend to repeat parts of neighboring strings with various small changes and additions" (D'Imperio 1978, p. 118).

The point is that the human mind has the tendency to repeat the same ideas: "It is a dramatic observation that when human subjects are asked to generate random sequences, they normally cannot produce sequences that satisfy accepted criteria for randomness" ([link hidden]: Generation of Random Sequences by Human Subjects: Cognitive Operations or Psychophysical Process?). This might be hard to imagine if you only need ideas for a few words. But after writing 100 or more words, it becomes exhausting to keep inventing new words. This is exactly what the Gaskell and Bowern experiment demonstrated: their participants naturally fell into "word repetition or the reuse of prefixes and suffixes" after about 100 words. Therefore you need a method to overcome the tendency to repeat yourself. The simplest method is to look at something already in your field of view and modify it. Not because anyone designed this method, but because it's what human cognition defaults to when it runs out of ideas.


RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 18-05-2026

(18-05-2026, 11:03 AM)dashstofsk Wrote: As for the construction of gallows words, I have tried to show that the majority of them could be constructed from a list of prefixes of varying frequency together with a list of suffixes, and that the choice of suffix is independent of the prefix. At each sitting the writer might have had a preference for some or other prefix or suffix, giving a semblance of local modification.

[link hidden]

Then this may interest you.  If you look at every word in the Herbal section that contains a gallows anywhere, and you strip all of those gallows out, around 98% of the words that are left are either a copy or edit distance 1 to a previously existing word.
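
For anyone who wants to try this kind of check on a transliteration, here is a minimal Python sketch. It is not the script behind the 98% figure. It assumes EVA input in which the four gallows are the letters k, t, p and f, and it assumes that "previously existing word" means any earlier word in reading order, compared in its original, unstripped form. The function names are made up for the example.

```python
GALLOWS = set("ktpf")  # the four EVA gallows letters


def strip_gallows(word):
    """Remove every gallows glyph from a word."""
    return "".join(ch for ch in word if ch not in GALLOWS)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def fraction_explained(words, max_dist=1):
    """Share of gallows words whose stripped form is within max_dist of an earlier word.

    Returns 0.0 if the text contains no gallows words.
    """
    seen = set()  # distinct earlier words, as written
    explained = total = 0
    for word in words:
        if not GALLOWS.isdisjoint(word):
            stripped = strip_gallows(word)
            if stripped:  # ignore words that consist of gallows only
                total += 1
                if any(levenshtein(stripped, old) <= max_dist for old in seen):
                    explained += 1
        seen.add(word)
    return explained / total if total else 0.0
```

On the toy sequence chol, kchol, daiin, otaiin, qokeedy this gives 2/3: "kchol" strips to "chol" (a copy), "otaiin" strips to "oaiin" (one edit from "daiin"), and "qokeedy" strips to "qoeedy", which has no close earlier word. The result depends on which earlier words count as the pool, so the assumption stated above matters.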

This is me going out on a limb but, could they be nothing more than a cosmetic flourish?  Since the Voynich has no capital letters and only 4 gallows, could they have been used to 'simulate' a capital letter just to make it look more legitimate?

I'll say this.  If this book is a hoax or copy/mutate, someone or some group went out of their way to make it look legitimate.