A One-Page Ledger Method for Generating Voynich-Like Text - Printable Version
+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Analysis of the text (https://www.voynich.ninja/forum-41.html)
+--- Thread: A One-Page Ledger Method for Generating Voynich-Like Text (/thread-5752.html)
RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 20-05-2026

(19-05-2026, 11:33 PM)Torsten Wrote: If you compute the correlation between S2 Herbal B and S2 Quire 13, I expect it will be lower than your S2 vs S3 correlation of 0.948, despite being the same "scribe." That would mean the within-scribe variation exceeds the between-scribe variation, which is hard to reconcile with a multiple-scribe model but follows naturally from a continuous evolutionary gradient.

I ran the comparison you suggested, and yes, your prediction is technically correct, but the separation is not large. Using Zandbergen/Landini:
Using Takahashi, where my other computations are from:
So, yes: S2 Herbal B is closer to S3 Quire 20 than to S2 Quire 13 in both transcriptions. But the size of the effect matters. In Zandbergen/Landini, the Pearson gap is only 0.024 and S2 Quire 13 vs S3 Quire 20 is 0.926, only 0.001 higher than the same comparison in Takahashi at 0.925. So this is not a large collapse of scribal structure. It is a small but real sign that section/regime can cut across scribal hand. That is actually compatible with what I am arguing: scribal hand, section, and production weighting are separable variables.

All of my modeling to this point has focused on Scribe 1, in order to clarify that method before looking into the <ed>-heavy later scribes. The chart below shows the same point visually. <ed> does not divide perfectly by Davis scribe, but it does separate Scribe 1 very cleanly from the later <ed>-heavy sections. And, aside from the Herbal section, which has Scribe 2 sheets mixed into quires with Scribe 1 sheets, you can almost pick out each section by scribe and <ed> density.

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Jorge_Stolfi - 20-05-2026

(19-05-2026, 10:25 PM)oshfdk Wrote: (19-05-2026, 10:03 PM)Jorge_Stolfi Wrote:

Word templates like mine do not capture all the constraints we see in the actual Voynichese lexicon. No matter how we define prefix, core, and suffix, these three parts are not independent. My model specifies a couple of constraints (the counts of benches and dealers) but there are more. Your own color tables show that, for one particular core, there are prefix-suffix pairs that are twice as common as expected, or half as common. I gave a couple of more dramatic examples of "skew squares" where Fr(AKX):Fr(BKX) is very different from Fr(AKY):Fr(BKY).
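For readers who want to check a "skew square" numerically, here is a minimal sketch in Python. It is not Stolfi's procedure, only one conventional way to compare the two ratios: an odds ratio on the 2x2 block of counts. The function name `skew_square` and the 0.5 added to each cell are assumptions made for this illustration. The counts used are the otedy/oteedy/ytedy/yteedy figures quoted elsewhere in this thread.

```python
from math import log

def skew_square(a_x, a_y, b_x, b_y):
    """Odds ratio for a 2x2 block of word counts.

    a_x is the count of prefix A + core + suffix X, and so on.
    An odds ratio near 1 means Fr(AKX):Fr(BKX) is close to Fr(AKY):Fr(BKY),
    i.e. prefix and suffix look independent for this core.
    """
    # Assumption: add 0.5 to every cell so that a zero count
    # (for example "shes") does not cause a division by zero.
    ax, ay, bx, by = (c + 0.5 for c in (a_x, a_y, b_x, b_y))
    ratio_x = ax / bx            # Fr(AKX):Fr(BKX)
    ratio_y = ay / by            # Fr(AKY):Fr(BKY)
    odds_ratio = ratio_x / ratio_y
    return ratio_x, ratio_y, odds_ratio, log(odds_ratio)

# Counts from the thread: otedy=56, oteedy=56, ytedy=2, yteedy=12
rx, ry, odds, log_odds = skew_square(56, 56, 2, 12)
print(f"o:y before -tedy  = {rx:.2f}")
print(f"o:y before -teedy = {ry:.2f}")
print(f"odds ratio        = {odds:.2f}")
```

An odds ratio far from 1 on a block like this is what "not independent" means in practice. Whether a given deviation is statistically significant is a separate question that this sketch does not address.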
There is still no explanation for how the Author could have created words with those hidden statistical anomalies, both for the seed text and during modification. The suggestion above does not seem to be very different from
All the best, --stolfi

RE: A One-Page Ledger Method for Generating Voynich-Like Text - oshfdk - 20-05-2026

(20-05-2026, 03:44 PM)Jorge_Stolfi Wrote: Word templates like mine do not capture all the constraints we see in the actual Voynichese lexicon. No matter how we define prefix, core, and suffix, these three parts are not independent. My model specifies a couple of constraints (the counts of benches and dealers) but there are more. Your own color tables show that, for one particular core, there are prefix-suffix pairs that are twice as common as expected, or half as common. I gave a couple of more dramatic examples of "skew squares" where Fr(AKX):Fr(BKX) is very different from Fr(AKY):Fr(BKY).

While existing templates don't capture all existing words and only existing words, hypothetically, if someone finds a relatively simple set of rules that would cover >95% of existing word types and would produce <5% unattested word types, is there a possibility to resolve all of this?

Since I think many people have been focusing on glyph-based manipulations only, what if the right set of rules is more like "match the number of left and right semicircular curves in each word", "make sure the number of vertical lines matches the number of crossed strokes", "make sure the number of upwards flourishes per word is the same as the number of minims"?

I'm just inventing those, but given that individual characters appear to be designed by combining a few simple strokes in (almost) all possible ways, I think it wouldn't be unreasonable to assume that the rules, if any, can be formulated in terms of strokes and not in terms of glyphs.

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Jorge_Stolfi - 20-05-2026

(19-05-2026, 12:49 AM)Torsten Wrote: Your constructed language scenario ...
makes predictions the VMS doesn't satisfy:

[link hidden] that, by pure coincidence, is statistically similar in some aspects (like word structure and size distribution) to the VMS text. The text was not created by me; it was taken from a published and heavily used manual. It is not encrypted, merely written in a simple but intentionally obfuscated spelling system, using a Latin letter or digraph for each sound. The word spaces are as in the original. <%> and <$> delimit paragraphs. The line breaks, other than paragraph breaks, were created by "fmt -s -w 50". Can you detect any "semantic clustering" of a kind that is not detectable in the VMS?

Quote: A constructed language doesn't show a continuous evolutionary gradient.

I have yet to see evidence that there is a "continuous evolutionary gradient" along the VMS. Sure, if one sorts the pages of any document in a way that minimizes the distances between consecutive pages, the result will tend to show a continuous gradient. On the contrary, all tests that I have seen show that, if one merely separates the pages by section (treating Herbal-A and Herbal-B as two separate sections), what one sees instead is abrupt changes in word frequencies between sections. Like those one sees in any natural language text between sections on very different topics. Or even between sections on the same topic but written by different authors in a different style. And (as you noted) if the word frequencies change, so will the frequencies of glyphs and digraphs.

Quote: A constructed language doesn't show line-boundary production effects. If words have fixed meanings, their form shouldn't depend on line position. But some words appear almost exclusively at line starts or line ends.

These alleged "line-boundary production effects" have been discussed extensively in the LAAFU thread.
One fact that has been recently pointed out is that one should in fact expect different word frequencies at line-start and line-end, because the probability of breaking a line before any given word is greater for longer words, and because in a manuscript document the scribe will be more likely to use abbreviations (like, apparently, m -> iin) near line-end. We do not know yet whether these two content-insensitive processes can account for all anomalies seen around line breaks. And we do not know either whether these are the only content-insensitive processes that can produce those anomalies. Until these questions are resolved, the line-break anomalies cannot be used as evidence that the VMS text is not natural language.

Quote: A constructed language with semantic prefixes doesn't show prefix-suffix independence.

And neither does Voynichese:

     56 otedy     56 oteedy
      2 ytedy     12 yteedy

     12 ched      11 ches
     10 shed       0 shes

    119 chey      11 ches
     77 shey       0 shes

     50 dain      30 dair
     12 rain       2 rair

All the best, --stolfi

RE: A One-Page Ledger Method for Generating Voynich-Like Text - ReneZ - 20-05-2026

(20-05-2026, 03:44 PM)Jorge_Stolfi Wrote: There is still no explanation for how the Author could have created words with those hidden statistical anomalies, both for the seed text and during modification.

Let alone why a great majority of possible modifications are never made.

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Jorge_Stolfi - 21-05-2026

(18-05-2026, 11:03 AM)dashstofsk Wrote: However the flow of the writing suggests that the writer did not pause after each word to think of what ought to come next.

You are assuming that the VMS was written directly "from brain to vellum". But that is extremely unlikely. The Author surely composed the text as a draft on paper, which then was clean-copied to vellum.
So the flow of writing does not tell how quickly the text was composed. (And many clues, as well as common sense, suggest that the copying to vellum was done by a Scribe who did not understand the text.)

All the best, --stolfi

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Jorge_Stolfi - 21-05-2026

(20-05-2026, 04:06 PM)oshfdk Wrote: While existing templates don't capture all words and the existing words only, hypothetically, if someone finds a relatively simple set of rules that would cover >95% of existing word types and would only produce < 5% of unregistered word types, is there a possibility to resolve all of this?

There may be, but until one is found, that remains a problem for all "gibberish" theories.

Is there a simple enough model that generates the Voynichese words with the observed frequencies? Even if it is not the kind of model that a Medieval author could have devised and used?

For any lexicon and any probability distribution of the words, one can build a tree-like automaton with probabilities on the transitions that generates precisely those words with precisely those probabilities. But that model is as complex as a table with the frequency of each word.

To reduce the complexity one could merge sub-trees, creating an automaton which is not a tree but a directed acyclic graph (a DAG). In that model, any state S that can be reached from the root by M different paths and leads to N final states represents a subset of the lexicon whose words consist of M prefixes combined with N suffixes, chosen with various probabilities but independently of each other.

How many such subsets can we find in the VMS lexicon? Even allowing for, say, spelling and transcription errors on 15% of the words? I don't expect that such a model can approximate the whole lexicon with the observed frequencies.

Maybe a model that is not a finite state automaton can do better.
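The tree-versus-DAG point can be made concrete with a small sketch. What follows is an illustration under stated assumptions, not Stolfi's model: the lexicons and their counts are invented toy data, the helper names (`build_trie`, `signature`, `count_states`) are hypothetical, and sub-trees are merged only when their transition probabilities match exactly, whereas real data would need some tolerance.

```python
from fractions import Fraction

END = "$"  # end-of-word marker

def build_trie(counts):
    """Tree-like automaton as nested dicts: node[symbol] = [count, child]."""
    root = {}
    for word, n in counts.items():
        node = root
        for ch in word + END:
            entry = node.setdefault(ch, [0, {}])
            entry[0] += n
            node = entry[1]
    return root

def count_states(node):
    """Number of states in the tree automaton."""
    return 1 + sum(count_states(child) for _, child in node.values())

def signature(node, seen):
    """Canonical form of a sub-tree: outgoing symbols, their exact
    transition probabilities, and the signatures of the children.
    Two states with the same signature can be merged into one DAG state."""
    total = sum(cnt for cnt, _ in node.values())
    sig = tuple(sorted(
        (ch, Fraction(cnt, total), signature(child, seen))
        for ch, (cnt, child) in node.items()
    ))
    seen.add(sig)
    return sig

def dag_states(counts):
    seen = set()
    signature(build_trie(counts), seen)
    return len(seen)

# Toy lexicon A: suffix choice is independent of prefix (3:1 after both ch- and sh-)
independent = {"chey": 30, "chedy": 10, "shey": 15, "shedy": 5}
# Toy lexicon B: same words, but the suffix ratio flips after sh-
skewed = {"chey": 30, "chedy": 10, "shey": 5, "shedy": 15}

tree_size = count_states(build_trie(independent))
dag_independent = dag_states(independent)
dag_skewed = dag_states(skewed)
print(tree_size, dag_independent, dag_skewed)
```

In the independent toy lexicon the sub-trees under ch- and sh- collapse into one shared state, so the DAG is much smaller than the tree. In the skewed lexicon the same words no longer merge at that point, which is the sense in which skew squares resist a compact automaton.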
In my word model, the constraints on the total counts of dealers and benches can be implemented by keeping counters, reducing the size of the corresponding automaton by a large factor. But my model does not predict the word probabilities, only whether words are "valid" or not; and it still allows many "valid" words that don't occur at all.

All the best, --stolfi

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Torsten - 21-05-2026

(20-05-2026, 10:39 PM)Jorge_Stolfi Wrote: I have yet to see evidence that there is a "continuous evolutionary gradient" along the VMS.

It is documented here: [link hidden] and [link hidden]

The gradient follows a specific path through specific intermediate forms (chol → cheol → cheo → chey → chedy), each attested with specific frequencies in a specific section order. The first step from <chol> to <cheol> even happens within Currier A.

(20-05-2026, 10:39 PM)Jorge_Stolfi Wrote: 56 otedy 56 oteedy

The frequency-connectivity correlation arises through a feedback loop inherent in the copying process. Frequent words are more likely to be selected as copying templates, generating more variants; the existence of more variants increases the probability that members of that word family are selected in subsequent copying events. This self-reinforcing cycle ensures that the most frequently used words accumulate the most similar neighbors, which is precisely the pattern you document.

RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 21-05-2026

(20-05-2026, 03:44 PM)Jorge_Stolfi Wrote: There is still no explanation for how the Author could have created words with those hidden statistical anomalies, both for the seed text and during modification.
I am hesitant to define any Voynich statistics as "hidden anomalies." Some may simply be emergent structure. Some of the Voynich statistics are likely instances of the Texas sharpshooter fallacy: fire bullets at a barn, then paint a target around the tightest cluster afterward.

In my own tests, the Zipf-like curve and word-length distribution were not programmed in directly; they arose from the copy/mutate process itself. Word length, hapax count, and vocabulary size all arose without any effort to code for them specifically. I would expect the same with other statistical patterns. If you dig deeply enough into natural language, or into generated text from a constrained local process, you will find uneven pairings, family clusters, missing combinations, and odd-looking skews.

So the question is not whether anomalies exist. Of course they do. The question is whether they require a hidden semantic lexicon, or whether they can emerge from constrained local copying, reuse, and mutation.

(20-05-2026, 11:37 PM)ReneZ Wrote: Let alone why a great majority of possible modifications are never made.

No, the manuscript never explored every possible word combination. A realistic copy-mutate system would stay conservative, reusing active word families and nearby variants instead of wandering randomly through all legal forms. Natural languages work the same way. English could produce far more legal letter combinations than it actually does. And if someone really was trying to create a convincing hoax, making the text feel language-like would be the goal, not splattering every theoretically possible word across the page. That's gibberish.

Even the ledger I demonstrate shows that thousands of never-used combinations could exist. Constraint is why they don't exist. These words look English but don't exist. Humans could have created them, but never did. In our case, the necessity wasn't there.
In the Voynich, that same constraint applies whether you think it has meaning or not.
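To make the "conservative copy-mutate" idea easier to experiment with, here is a toy sketch in Python. It is not the ledger method or the analyzer described in this thread: the glyph set `ALPHABET`, the seed words, the mutation rate, and the function names are all assumptions chosen for illustration. Each new token is a copy of a randomly chosen earlier token, occasionally altered by one glyph, so frequent words are copied more often and new forms stay close to existing ones.

```python
import random
from collections import Counter

ALPHABET = "acdehiklnorsty"  # assumed toy glyph set, not the real inventory

def mutate(word, rng):
    """Apply one random single-glyph edit: substitute, insert, or delete."""
    op = rng.choice(("sub", "ins", "del"))
    i = rng.randrange(len(word))
    if op == "sub":
        return word[:i] + rng.choice(ALPHABET) + word[i + 1:]
    if op == "ins":
        return word[:i] + rng.choice(ALPHABET) + word[i:]
    # delete, but never shrink a word to nothing
    return word[:i] + word[i + 1:] if len(word) > 1 else word

def copy_mutate(seed_words, n_tokens, p_mutate, rng):
    """Grow a text by copying earlier tokens, mutating a fraction of them."""
    text = list(seed_words)
    while len(text) < n_tokens:
        word = rng.choice(text)          # picking a token favors frequent words
        if rng.random() < p_mutate:
            word = mutate(word, rng)
        text.append(word)
    return text

rng = random.Random(0)                   # fixed seed so runs are repeatable
text = copy_mutate(["chol", "chey", "dain"], 2000, 0.1, rng)

counts = Counter(text)
n_types = len(counts)
n_hapax = sum(1 for c in counts.values() if c == 1)
top_word, top_count = counts.most_common(1)[0]
print(n_types, n_hapax, top_word, top_count)
```

The number of distinct word types this produces is bounded by the number of mutation events, which is tiny compared with the number of strings the glyph set allows. That illustrates the "most possible forms are never made" behavior, although it says nothing about whether the real manuscript was produced this way.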
RE: A One-Page Ledger Method for Generating Voynich-Like Text - Dunsel - 21-05-2026

I reworked one of my comparison tests to operate at the sheet level.

What this appears to show is that the manuscript is not behaving like a collection of unrelated pages. Certain sheets are much more similar to specific other sheets than to the manuscript as a whole, especially in Quire 13 and Quire 20. That is consistent with what I would expect from a copy-and-modify process, where words and word families are repeatedly reused and gradually altered over time, creating clusters of closely related sheets instead of uniformly random text.

This alone does not prove a copy-and-modify system. A natural language text could also produce local clustering through shared topics, scribal habits, or repeated terminology. At the same time, not all sheets correlate most strongly with their immediate neighbors. Some show stronger affinity to more distant sheets, which is also consistent with my broader analyzer results suggesting that sheets may draw from small recurring source pools rather than strictly adjacent pages alone.
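A sheet-level comparison of this general kind can be sketched as follows. The post does not say which features or similarity measure were used, so this is only an assumed setup: each sheet is reduced to a relative word-frequency profile, profiles are compared with Pearson correlation, and each sheet is paired with its best-correlated other sheet. The sheet labels and word lists are invented toy data and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def profile(words, vocab):
    """Relative frequency of every vocabulary word on one sheet."""
    c = Counter(words)
    return [c[w] / len(words) for w in vocab]

def nearest_sheets(sheets):
    """Map each sheet to the other sheet whose profile correlates best."""
    vocab = sorted({w for ws in sheets.values() for w in ws})
    prof = {name: profile(ws, vocab) for name, ws in sheets.items()}
    return {
        a: max((b for b in prof if b != a),
               key=lambda b: pearson(prof[a], prof[b]))
        for a in prof
    }

# Invented toy sheets; real input would be transcribed words per sheet.
sheets = {
    "s1": "chol chol dain cheol".split(),
    "s2": "chol dain chol chey".split(),
    "s3": "chedy shey otedy chedy".split(),
    "s4": "shey chedy oteedy chedy".split(),
}
nearest = nearest_sheets(sheets)
vocab = sorted({w for ws in sheets.values() for w in ws})
r_12 = pearson(profile(sheets["s1"], vocab), profile(sheets["s2"], vocab))
print(nearest, round(r_12, 3))
```

As the post itself cautions, a nearest-sheet pattern like this shows clustering only. It does not distinguish copy-and-modify from shared topic or scribal habit.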