quimqu > 8 hours ago
| Hypothesis | Expected behavior | Observed failure |
| --- | --- | --- |
| Random / weak structure | No stable local similarity or positional effects | Strong clustering and positional patterns persist |
| Sequential (Markov-like) | Next token predictable from previous ones | Bigram/HMM models add little or collapse |
| Copy–modify (parent-based) | Clear local derivations, strong nearest neighbor | Generative models produce too much similarity |
| Single dominant parent | One best local candidate per token | Multiple candidates with similar scores, no clear winner |
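
One way to make the "Bigram/HMM models add little" row concrete is to compare held-out cross-entropy for a unigram and a bigram model. The sketch below is only an illustration, not the pipeline used in this thread: the function names, the add-alpha smoothing, and the flat token-list input are all assumptions.

```python
import math
from collections import Counter

def unigram_bits(train, test, alpha=1.0):
    """Average bits per test token under an add-alpha smoothed unigram model."""
    vocab = set(train) | set(test)
    counts = Counter(train)
    total = len(train) + alpha * len(vocab)
    return -sum(math.log2((counts[t] + alpha) / total) for t in test) / len(test)

def bigram_bits(train, test, alpha=1.0):
    """Average bits per test token under an add-alpha smoothed bigram model.

    The first test token has no predecessor and is skipped.
    """
    vocab_size = len(set(train) | set(test))
    pair_counts = Counter(zip(train, train[1:]))
    prev_counts = Counter(train[:-1])
    bits = 0.0
    for prev, cur in zip(test, test[1:]):
        p = (pair_counts[(prev, cur)] + alpha) / (prev_counts[prev] + alpha * vocab_size)
        bits -= math.log2(p)
    return bits / (len(test) - 1)

# Toy data (placeholder symbols, not transliterated Voynichese):
# a strictly alternating sequence, where the previous token fully
# determines the next one, so the bigram model should help a lot.
train = ["a", "b"] * 50
test = ["a", "b"] * 10
gain = unigram_bits(train, test) - bigram_bits(train, test)
```

On real data, a gain close to zero would correspond to the "add little" outcome in the table, while a large gain would support the sequential hypothesis.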
quimqu > 8 hours ago
| Assumption | Operational test |
| --- | --- |
| Candidate set constrained by line position | Predict top-k tokens from line features and measure recall |
| Local context restricts form, not identity | Build candidates by similarity and check if real token is included |
| Selection within the set is weak | Train ranking model and measure score flatness |
| Substructures drive compatibility | Use prefixes/suffixes only and test predictive power |
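
Two of the tests above (top-k recall from line position, and score flatness) could be sketched roughly as follows. This is a minimal illustration under stated assumptions: the three-way position bucket, the frequency-based candidate set, and the normalized-entropy definition of flatness are hypothetical choices for this example and are not taken from the thread.

```python
import math
from collections import Counter, defaultdict

def position_bucket(index, line_length):
    """Coarse line-position feature (an assumption for this sketch)."""
    if index == 0:
        return "first"
    if index == line_length - 1:
        return "last"
    return "middle"

def fit_candidates(train_lines, k):
    """Map each position bucket to its k most frequent training tokens."""
    counts = defaultdict(Counter)
    for line in train_lines:
        for i, tok in enumerate(line):
            counts[position_bucket(i, len(line))][tok] += 1
    return {b: {t for t, _ in c.most_common(k)} for b, c in counts.items()}

def recall_at_k(candidates, test_lines):
    """Fraction of held-out tokens found in their bucket's candidate set."""
    hits = total = 0
    for line in test_lines:
        for i, tok in enumerate(line):
            total += 1
            if tok in candidates.get(position_bucket(i, len(line)), set()):
                hits += 1
    return hits / total if total else 0.0

def score_flatness(scores):
    """Normalized entropy of non-negative scores: 1.0 flat, 0.0 one clear winner."""
    total = sum(scores)
    if len(scores) < 2 or total <= 0:
        raise ValueError("need at least two scores with a positive sum")
    probs = [s / total for s in scores if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(scores))

# Toy lines with placeholder tokens (not real transliteration data).
train_lines = [["p", "x", "m"], ["p", "y", "m"], ["q", "x", "m"]]
test_lines = [["p", "x", "m"], ["q", "y", "m"]]
candidates = fit_candidates(train_lines, k=1)
recall = recall_at_k(candidates, test_lines)
```

High recall at small k would suggest that line position does constrain the candidate set, and a flatness value near 1.0 for a ranking model's scores would match the "no clear winner" observation.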
nablator > 8 hours ago
(8 hours ago) quimqu Wrote: Simple sequential models and naive copy-and-modify processes do not fit.
quimqu > 7 hours ago
(8 hours ago) nablator Wrote: I tried to propose a less naïve self-citation generation method, including sparse initialization causing bottlenecks that would explain local inhomogeneities, lazy source-word selection patterns, non-sequential writing, and generation rules optimized to fit the transliteration data. It seems that no one knew how to test the hypothesis. What do you think about it? [link]
Such a method, if it can be shown to replicate all known properties of Voynichese, does not exclude the possibility of a cipher such as [link], but unnecessary complications can usually be discarded by Occam's razor when there is no evidence specifically pointing at them.
nablator > 5 hours ago
(7 hours ago) quimqu Wrote: Most direct self-citation models tend to produce too much chaining and identifiable source–target links.
Quote: If you agree, I will try to model them and put them in my pipeline. Will let you know!
quimqu > 4 hours ago
(5 hours ago) nablator Wrote: I don't know what you mean by "identifiable". I believed that the frequency of some very unlikely sequential transfer patterns between two lines would almost certainly identify source-target links, but then I was disappointed to find a similar frequency in Torsten Timm's generated_text file (see my last post in the other thread).