Dunsel > 7 hours ago
(10 hours ago) bi3mw Wrote: I’ve been thinking about the Types/Hapax problem. Ultimately, I’ve come to the conclusion that approximately 11 words per page are likely to be filler words or erroneous words. These cannot be replicated within the model but can only be simulated using a fixed list (which, in principle, could be any list). I’m not really satisfied with this “insight,” since it ultimately amounts to an unverifiable assumption.
bi3mw > 40 minutes ago
(7 hours ago) Dunsel Wrote: But, if it is gibberish and a hoax then every word is filler and every word is... erroneous.
Wait till you start trying to get gallows right.