(31-05-2026, 03:20 PM)Dunsel Wrote: The results are showing that after same-page copy/mutate are removed, the remaining source words repeatedly collapse into a very small number of sheets as the source of those pages.
Again, this is circular. You are assuming that copy-mutate generated the patterns in the first place rather than proving it. The finding here is suggestive and compatible with copy-mutate, but if it owes to something else, you have not "removed" "same-page copy/mutate", you have removed some other feature.
(31-05-2026, 03:20 PM)Dunsel Wrote: As for gallows stripping, that was done because gallows are overwhelmingly productive in prefix positions. If gallows behave as a frequent prefix operation rather than as ordinary internal characters, then treating chol, kchol, tchol, etc. as completely unrelated forms obscures rather than reveals the family structure.
I agree that this is a good idea. This was also my feeling, that gallows are not part of the structure and are sometimes added to create variants or as an afterthought as a connector/filler to make longer words out of two words. They need several slots if you want to allow them in a slot sequence. The easiest solution is to get rid of them.
(12-05-2024, 01:27 PM)nablator Wrote: The complication of gallows dancing differently around q and ch can be resolved by removing gallows from the slot sequence.
About slot sequences, what is your interpretation of the excellent correlation of word frequency to matching a slot sequence, either Massimiliano Zattera's original 12-slot one or a simplified one with a lower number of slots? What constraining mechanism could explain it? It's not a natural thing that anyone would do without a reason, nor a chaotic thing that just happens randomly. Exceptions are numerous but not among high frequency word types.
The 5-50 range is the minimum number of times that the word type occurs in the VMS, i.e. for 5 I kept all word types that occur at least 5 times.
[attachment=15858]
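For anyone who wants to redo this kind of count on their own transliteration, here is a minimal sketch of the measurement: keep the word types that occur at least N times, then report what fraction of them fully match a slot sequence. The slot table `TOY_SLOTS` is a made-up placeholder for illustration only; it is not Zattera's 12-slot sequence and not the simplified one behind the plot, and the function names are hypothetical.

```python
import re
from collections import Counter

# HYPOTHETICAL toy slot table (illustration only, NOT Zattera's 12 slots).
# Each slot is optional and may be filled by one of the listed glyph groups.
TOY_SLOTS = [
    ["q"],
    ["o"],
    ["ch", "sh"],
    ["e", "ee"],
    ["o", "a"],
    ["l", "r", "d"],
    ["y"],
]


def slot_regex(slots):
    """Build a regex in which every slot is optional and slots keep their order."""
    parts = ["(?:%s)?" % "|".join(re.escape(alt) for alt in slot) for slot in slots]
    return re.compile("".join(parts))


def match_rate_by_min_count(tokens, thresholds, slots=TOY_SLOTS):
    """For each threshold, fraction of word types with count >= threshold
    that fully match the slot sequence (None if no type qualifies)."""
    pattern = slot_regex(slots)
    counts = Counter(tokens)
    rates = {}
    for t in thresholds:
        types = [w for w, c in counts.items() if c >= t]
        if not types:
            rates[t] = None
            continue
        matched = sum(1 for w in types if pattern.fullmatch(w))
        rates[t] = matched / len(types)
    return rates


# Tiny invented token list, just to show the call.
demo_tokens = ["chol"] * 5 + ["qol"] * 2 + ["xyz"]
demo_rates = match_rate_by_min_count(demo_tokens, [1, 2, 5])
```

With a real transliteration as `tokens` and thresholds from 5 to 50, the same function gives one match rate per threshold, which is the shape of the curve discussed above. The result depends entirely on the slot table you plug in.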
(31-05-2026, 03:13 PM)rikforto Wrote: When you claim the tokens in fv1 are the product of such and such a process, you are no longer claiming to have modeled them. You are claiming to know as a matter of historical fact what the scribe did. This is an extremely strong claim for the banal reason that it's hard to prove what someone did 600 years ago, but there are several other factors that make inferring from the model hard.
I have stated repeatedly in these posts that I am not claiming a solution. I'm simply trying to present facts and data as I have discovered them. Do I think copy/mutate is viable? Very much so. If I have made any claims of a solution then I apologize. It's really hard to word these posts so that they aren't instantly misinterpreted. You can see what success I'm having at that.
Again:
I MAKE NO CLAIM THAT I HAVE A SOLUTION! I have evidence for a possible method. Does this exclude any other possible explanation? Other than perhaps that it was a secret document conjured up by the CIA, no, it does not. If in the past or in the future I seem to be making some claim to a solution or historical fact then it's a mis-wording.
(31-05-2026, 03:13 PM)rikforto Wrote: A genuine path forward for these models would be to show that they depend less on the Voynich's summary statistics. Your analysis of the gallows letters, which I do not believe is shared by Timm and Schinner's model, is a case in point.
...
If that's all there is to it, it may be the best we can do is say that their process had a bias towards gallows in line start words and we may not be able to formally separate premise (we observe the bias) from conclusion (it is a product of an arbitrary choice).
Right now, I'm simply stripping gallows to see if there's an underlying structure. Am I treating them as being decorative? Yes. Is that what I believe? Perhaps. Is this work proof of such? Far from it. But the tests I've run are showing that if gallows are stripped, there are still underlying word families associated with them.
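As a rough illustration of what "stripping gallows" does to the word list, here is a minimal sketch that removes the EVA gallows glyphs (k, t, p, f) and groups word types by what is left. It assumes an EVA transliteration and strips gallows in every position, not only as prefixes; both choices, and the function names, are assumptions made for the example rather than a description of the actual test pipeline.

```python
import re
from collections import defaultdict

# EVA plain gallows. Assumption: they are stripped wherever they occur,
# so a benched gallows such as "ckh" collapses to "ch" as a side effect.
GALLOWS = "ktpf"
_GALLOWS_RE = re.compile("[%s]" % GALLOWS)


def strip_gallows(word):
    """Return the word with every gallows glyph removed."""
    return _GALLOWS_RE.sub("", word)


def gallows_families(word_types):
    """Group word types that become identical once gallows are removed."""
    families = defaultdict(set)
    for word in word_types:
        families[strip_gallows(word)].add(word)
    return dict(families)


demo_families = gallows_families(["chol", "kchol", "tchol", "daiin"])
```

In this toy input, chol, kchol and tchol land in one family keyed by "chol", while daiin stays alone. Whether that grouping reflects anything the scribe did is exactly the open question; the code only shows the bookkeeping.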
(31-05-2026, 03:13 PM)rikforto Wrote: By the by, if you are in fact "reproducing" the Voynich, failure to incorporate Currier's curve-line system observations and similar analyses seems like fair game to me. It also strikes me as the kind of "observable orthographic structure" you say the model addresses in 7.6 of your paper. This is largely an aside to the main point here, which is that I don't think you're proving your interpretation, but I'm not clear why statistics about letter bases are not part of the orthographic structures in the Voynich.
Yes, that's orthographic structure. No, I haven't modeled it yet. I had this objection about CLS earlier. No, I'm not modelling it either. Failure to model those does not invalidate the copy/mutate + ledger results or the source sheet evidence I've presented. At most, it suggests that if copy/mutate is a viable explanation, there may be additional constraints operating on the mutation process that still need to be identified.
(31-05-2026, 03:34 PM)rikforto Wrote: (31-05-2026, 03:20 PM)Dunsel Wrote: The results are showing that after same-page copy/mutate are removed, the remaining source words repeatedly collapse into a very small number of sheets as the source of those pages.
Again, this is circular. You are assuming that copy-mutate generated the patterns in the first place rather than proving it. The finding here is suggestive and compatible with copy-mutate, but if it owes to something else, you have not "removed" "same-page copy/mutate", you have removed some other feature.
Again, wording.
“After removing same-page ED0/ED1-derived forms, the remaining core tokens repeatedly collapse into a very small number of source sheets.” That is not meant to prove the historical cause by itself. It is meant to show that the page structure is consistent with a local copy/mutate process and that the remaining source burden is surprisingly small. If some other process creates the same local ED0/ED1 dependency structure and the same source-sheet collapse, then yes, that alternative would also need to be considered. My point is not that the label proves the method.
My point is that the observed structure is consistent with what a constrained copy/mutate process would be expected to produce.
Is copy/mutate the only possible explanation? No. Go look in the Solutions forum and you'll find plenty. And that is why this is in analysis and not solutions.
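To make the reworded statement concrete, here is a minimal sketch of the filtering step, assuming "ED0/ED1" means edit distance 0 or 1 to a token that appears earlier on the same page. That reading, and the function names, are assumptions for illustration. The label "derived" here is purely operational: it marks a string relationship and says nothing about what caused it.

```python
def within_one_edit(a, b):
    """True if the Levenshtein distance between a and b is 0 or 1."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]  # one extra glyph in b at position i


def core_tokens(page_tokens):
    """Tokens on a page that are NOT within one edit of any earlier token
    on the same page (assumed reading of 'same-page ED0/ED1 removed')."""
    seen = []
    core = []
    for token in page_tokens:
        if not any(within_one_edit(token, earlier) for earlier in seen):
            core.append(token)
        seen.append(token)
    return core


demo_page = ["chol", "chor", "daiin", "chol", "kchol", "qokeedy"]
demo_core = core_tokens(demo_page)
```

On the toy page, chor, the second chol and kchol are each within one edit of an earlier token, so only chol, daiin and qokeedy remain as core tokens. The source-sheet question is then about where those remaining tokens can be found elsewhere, which this sketch does not attempt.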
(31-05-2026, 04:10 PM)Dunsel Wrote: At most, it suggests that if copy/mutate is a viable explanation, there may be additional constraints operating on the mutation process that still need to be identified.
There should be an incentive to adhere to rules, but not strictly (proposed rules include CLS (and variants), word grammars (many flavors), and slot sequences (MZ or a variant)).
(31-05-2026, 04:00 PM)nablator Wrote: I agree that this is a good idea. This was also my feeling, that gallows are not part of the structure and are sometimes added to create variants or as an afterthought as a connector/filler to make longer words out of two words. They need several slots if you want to allow them in a slot sequence. The easiest solution is to get rid of them.
And that's the working theory. In this post: [link not available], that's essentially what I'm working on there. Syllabic chunks in perhaps some sort of sequence. Right now, there's no formal solution to the Voynich, which I think we can all agree on. Which means that much of the work to this point on the Voynich is incomplete or partial. If I can independently reproduce some slot system or method that already exists, great. It's confirmation. Until I do or until I run out of tests and ideas, I'm trying to remain unbiased and with a fresh set of eyes. So, I'm trying real hard not to delve into the work of others and see if this fresh approach doesn't create a few new ideas.
I'm pretty sure the sheet source I've come up with is a rather unique possibility.
(31-05-2026, 04:21 PM)nablator Wrote: (31-05-2026, 04:10 PM)Dunsel Wrote: At most, it suggests that if copy/mutate is a viable explanation, there may be additional constraints operating on the mutation process that still need to be identified.
There should be an incentive to adhere to rules, but not strictly (proposed rules include CLS (and variants), word grammars (many flavors), and slot sequences (MZ or a variant)).
If you're saying that any viable explanation should eventually account for observed constraints such as CLS, slot patterns, grammars, or similar phenomena, I have no disagreement with that. If you're saying that a model or theory must explain those observations before evidence can even be considered, then I disagree.
(31-05-2026, 04:45 PM)Dunsel Wrote: If you're saying that any viable explanation should eventually account for observed constraints such as CLS, slot patterns, grammars, or similar phenomena, I have no disagreement with that. If you're saying that a model or theory must explain those observations before evidence can even be considered, then I disagree.
The knee-jerk reaction (it was mine too but I changed my mind) is to reject copy-and-mutate methods of generation because they are incomplete: some incentive is needed to account for adherence to relatively rigid patterns. This can't be phonetic preferences only, the positional rigidity goes far beyond that. How the two apparently contradictory aspects can coexist is the question: good adherence to strict rules in general especially for high-frequency words but many exceptions show that some amount of chaotic behavior is allowed (or we don't understand the rules). Can mutation rules alone produce this behavior or is there a need for a separate set of almost-good-enough rules? Occam's razor favors emergent behavior from the copy-and-mutate theory alone, this would be ideal, much better than having an additional layer of constraints.
(31-05-2026, 05:09 PM)nablator Wrote: (31-05-2026, 04:45 PM)Dunsel Wrote: If you're saying that any viable explanation should eventually account for observed constraints such as CLS, slot patterns, grammars, or similar phenomena, I have no disagreement with that. If you're saying that a model or theory must explain those observations before evidence can even be considered, then I disagree.
The knee-jerk reaction (it was mine too but I changed my mind) is to reject copy-and-mutate methods of generation because they are incomplete: some incentive is needed to account for adherence to relatively rigid patterns. This can't be phonetic preferences only, the positional rigidity goes far beyond that. How the two apparently contradictory aspects can coexist is the question: good adherence to strict rules in general especially for high-frequency words but many exceptions show that some amount of chaotic behavior is allowed (or we don't understand the rules). Can mutation rules alone produce this behavior or is there a need for a separate set of almost-good-enough rules? Occam's razor favors emergent behavior from the copy-and-mutate theory alone, this would be ideal, much better than having an added layer of constraints.
I think we're actually much closer in viewpoint than it may appear.
The question I'm becoming interested in is exactly whether those constraints are emergent or imposed. If copy/mutate requires a separate slot table, grammar, CLS engine, and a dozen additional rule systems, then it becomes much less attractive as an explanation.
If, on the other hand, a relatively small set of positional constraints, inherited word families, and local mutations naturally produce the observed structure, then that is a much stronger result. That's one reason I've been spending time looking at chunk behavior and provenance chains rather than simply adding someone else's rules to the generator. I'd much rather discover that a pattern emerges naturally than hard-code it into the model.
Unfortunately, the best explanation I have so far that fits every observation requires varying amounts of alcohol.
Which, now that I think about it, may be the first theory I've encountered that also explains the stains at the top of the Herbal section.
I’ve been thinking about the Types/Hapax problem. Ultimately, I’ve come to the conclusion that approximately 11 words per page are likely to be filler words or erroneous words. These cannot be replicated within the model but can only be simulated using a fixed list (which, in principle, could be any list). I’m not really satisfied with this “insight,” since it ultimately amounts to an unverifiable assumption.
[attachment=15875]
Short words: 1438 (15.1%)
Common words: 3933 (41.4%)
Common combinations: 2017 (21.2%)
Hapax (from Pool): 1103 (11.6%)
Ledger-words: 1009 (10.6%)
Total: 9500
Hapax total: 1844
Types: 2813
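As a quick sanity check on the breakdown above, the following short sketch recomputes the total and the percentage share of each category from the raw counts. The dictionary keys simply reuse the labels from the list; nothing here goes beyond the posted numbers.

```python
# Token counts per category, copied from the list above.
category_counts = {
    "Short words": 1438,
    "Common words": 3933,
    "Common combinations": 2017,
    "Hapax (from Pool)": 1103,
    "Ledger-words": 1009,
}

total_tokens = sum(category_counts.values())

# Share of each category as a percentage of all tokens, one decimal place.
category_shares = {
    name: round(100 * count / total_tokens, 1)
    for name, count in category_counts.items()
}

for name, share in category_shares.items():
    print(f"{name}: {category_counts[name]} ({share}%)")
print(f"Total: {total_tokens}")
```

The counts do add up to the stated total of 9500, and the rounded shares agree with the listed percentages; they sum to 99.9% rather than 100% only because of rounding.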