Labyrinthinesecurity > Yesterday, 08:38 AM
Rafal > Yesterday, 11:30 AM
This is some complicated stuff. Would you like to summarize your results in simple words?
nablator > Yesterday, 01:12 PM
Labyrinthinesecurity > Yesterday, 01:40 PM
(Yesterday, 01:12 PM) nablator Wrote: Quote: We make our code and data available for independent verification.
That would be helpful.
I couldn't find any repository on Kaggle other than the one for the directionality calculation: [link not shown]
Labyrinthinesecurity > Yesterday, 02:30 PM
(Yesterday, 11:30 AM) Rafal Wrote: This is some complicated stuff. Would you like to summarize your results in simple words?
oshfdk > Yesterday, 02:37 PM
(Yesterday, 02:30 PM) Labyrinthinesecurity Wrote: Concretely, we propose 4 criteria that any Voynich script "generator" (like Naibbe) should meet to really look like Voynich. These are necessary criteria, not sufficient ones.
Quote:
Boundary concentration. The 80.6% end→start transition rate exceeds the four comparison languages by a factor of 2.3–4.1×. This means VMS word boundaries enforce a rigid alternation between positional grapheme classes that is quantitatively different from these four languages. The gap could narrow for morphologically richer languages not yet tested.
Bilateral positional extremity. Extreme positional ratios (>100:1) span both start-preferring and end-preferring grapheme classes, unlike the four comparison languages where such ratios cluster in one direction with specific orthographic explanations. This bilateral pattern is consistent with a system where word-initial and word-final positions draw from largely non-overlapping grapheme pools, but we cannot exclude the possibility that some untested natural languages exhibit similar properties.
Zipfian boundary distributions. VMS word-boundary grapheme distributions follow a Zipfian curve rather than the plateau shape observed in the four comparison languages (Parisel, 2025). This is a fundamental property of VMS word structure.
High cross-boundary MI with high structural residual. The VMS has the highest total cross-boundary MI (0.230 bits) and the highest MI retention after word-order shuffling (21%) among the five corpora tested. The high retention indicates that a substantial fraction of cross-boundary predictability is structural (order-independent).
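For readers who want to try the quoted criteria on a generator's output, here is a minimal Python sketch of two of the quantities: start/end counts per grapheme (for positional ratios) and cross-boundary mutual information with a word-order shuffle. It is not the authors' code. It assumes that every character of a word is one grapheme, that "cross-boundary MI" means the MI between the last grapheme of a word and the first grapheme of the next word, and that "retention" is the shuffled MI divided by the original MI. The function names are hypothetical, and line or paragraph breaks are ignored.

```python
import math
import random
from collections import Counter


def positional_counts(words):
    """Count how often each grapheme opens a word and how often it closes one."""
    starts = Counter(w[0] for w in words if w)
    ends = Counter(w[-1] for w in words if w)
    return starts, ends


def cross_boundary_mi(words):
    """Plug-in MI (in bits) between a word's last grapheme and the next word's first."""
    pairs = [(a[-1], b[0]) for a, b in zip(words, words[1:]) if a and b]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    ends = Counter(e for e, _ in pairs)
    starts = Counter(s for _, s in pairs)
    mi = 0.0
    for (e, s), c in joint.items():
        # p(e, s) * log2(p(e, s) / (p(e) * p(s))), written with raw counts
        mi += (c / n) * math.log2(c * n / (ends[e] * starts[s]))
    return mi


def shuffled_mi(words, trials=100, seed=0):
    """Average cross-boundary MI over random permutations of the word order."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        shuffled = list(words)
        rng.shuffle(shuffled)
        total += cross_boundary_mi(shuffled)
    return total / trials
```

Under these assumptions, retention for a word list `words` would be `shuffled_mi(words) / cross_boundary_mi(words)` whenever the denominator is nonzero, and a start:end ratio for a grapheme `g` is `starts[g] / ends[g]` when `ends[g]` is nonzero. The paper may define these quantities differently, so the numbers need not match the quoted 0.230 bits or 21%.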
Labyrinthinesecurity > Yesterday, 02:55 PM
(Yesterday, 02:37 PM) oshfdk Wrote: (Yesterday, 02:30 PM) Labyrinthinesecurity Wrote: Concretely, we propose 4 criteria that any Voynich script "generator" (like Naibbe) should meet to really look like Voynich. These are necessary criteria, not sufficient ones.
I'm not sure I understand which 4 criteria you mean, these below?
oshfdk > Yesterday, 03:03 PM
(Yesterday, 02:55 PM) Labyrinthinesecurity Wrote: Yes, these are the ones
Jorge_Stolfi > Yesterday, 03:29 PM
(Yesterday, 02:30 PM) Labyrinthinesecurity Wrote: Concretely, we propose 4 criteria that any Voynich script "generator" (like Naibbe) should meet to really look like Voynich. These are necessary criteria, not sufficient ones.
eggyk > Yesterday, 03:57 PM
(Yesterday, 03:29 PM) Jorge_Stolfi Wrote: And, finally, statistics are a property of a text, not of a language. There is no such thing as "the frequency of 'e' in English" or "the most common English word". Someone wrote a whole novel in English without using 'e' even once -- and readers don't notice unless they are told. In a materia medica the most common word may well be "take" or "cures", and the word "the" may hardly be used...
All the best, --stolfi