02-06-2026, 04:00 AM
(02-06-2026, 12:33 AM)bi3mw Wrote: I’ve been thinking about the Types/Hapax problem. Ultimately, I’ve come to the conclusion that approximately 11 words per page are likely to be filler words or erroneous words. These cannot be replicated within the model but can only be simulated using a fixed list (which, in principle, could be any list). I’m not really satisfied with this “insight,” since it ultimately amounts to an unverifiable assumption.
Oh, but my generator does replicate both types and hapax, and it does a pretty decent job of getting the numbers close. In copy/mutate, that should be an emergent property, not something you force the generator to do.
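To show what "emergent" means here, below is a minimal sketch of a copy/mutate loop. It is not my actual generator. The glyph set, the seed words, the mutation rate, and every function name are assumptions picked for illustration. Each new word is copied from the text produced so far and is sometimes changed by a single edit. Types and hapax are only counted afterwards; nothing in the loop targets them.

```python
import random
from collections import Counter

# Hypothetical glyph set, chosen only for illustration.
ALPHABET = "odaiychekstlrqn"

def mutate(word, rng):
    """Apply one random edit (substitute, insert, or delete) to a word."""
    pos = rng.randrange(len(word))
    op = rng.choice(("sub", "ins", "del"))
    if op == "sub":
        # May pick the same glyph again, leaving the word unchanged.
        return word[:pos] + rng.choice(ALPHABET) + word[pos + 1:]
    if op == "ins":
        return word[:pos] + rng.choice(ALPHABET) + word[pos:]
    # Delete one glyph, but never produce an empty word.
    return word[:pos] + word[pos + 1:] if len(word) > 1 else word

def generate(n_words, seed_words, p_mutate=0.3, seed=0):
    """Grow a text by copying earlier words and occasionally mutating them."""
    rng = random.Random(seed)
    text = list(seed_words)
    while len(text) < n_words:
        word = rng.choice(text)          # copy a word already in the text
        if rng.random() < p_mutate:      # assumed mutation rate
            word = mutate(word, rng)
        text.append(word)
    return text[:n_words]

def type_hapax_counts(words):
    """Return (number of distinct words, number of words occurring once)."""
    counts = Counter(words)
    hapax = sum(1 for c in counts.values() if c == 1)
    return len(counts), hapax

if __name__ == "__main__":
    words = generate(500, ["daiin", "chedy", "qokeedy"])
    print(type_hapax_counts(words))
```

Changing `p_mutate` moves both counts, which is the whole point: types and hapax come out of the copy and mutate rates, not from a fixed list bolted on afterwards.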
Calling them filler might be legitimate, even if the Voynich is some mnemonic or shorthand. But if it is gibberish and a hoax, then every word is filler and every word is... erroneous.
Wait till you start trying to get gallows right.