tavie > 29-06-2026, 10:55 PM
(29-06-2026, 06:14 PM)Jorge_Stolfi Wrote: Anyway, let me insist again that statistics (glyph and word frequencies, Zipfness, entropies, correlations etc) are not properties of languages, but of texts. In any language one can have texts that have much lower or much higher word entropy than a typical novel or philosophical treatise.
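As a rough illustration of the point quoted above, here is a minimal sketch (not from the thread) that computes word entropy per text. The name `word_entropy` and the whitespace tokenization are assumptions made for illustration, and the two sample strings are toy inputs, not real corpora.

```python
import math
from collections import Counter

def word_entropy(text: str) -> float:
    """Shannon entropy of the word distribution of one text, in bits per word."""
    # Assumption: words are whitespace-separated and case is ignored.
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Two toy texts in the same language with very different word entropy.
repetitive = "add salt add water add salt add water"
varied = "the old monk copied a strange herbal by candlelight"
print(word_entropy(repetitive), word_entropy(varied))
```

The repetitive, recipe-like string scores far lower than the varied sentence even though both are English, which is the kind of text-level variation the quote refers to.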
Jorge_Stolfi > 29-06-2026, 11:29 PM
(29-06-2026, 08:34 PM)Mauro Wrote: what you say is strictly true, but in practice the statistics of a text are (barring rare outliers, as the famous English novel written without a single 'e') mostly governed by the underlying language (which includes, of course, its specific orthography) and not by the text. Once a language has been chosen it's usually possible to say if a given text is written in that language or not just by checking the most basic statistics.
Jorge_Stolfi > Yesterday, 03:40 AM
(29-06-2026, 10:55 PM)tavie Wrote: The problem with Voynich solutions is that solvers* do not explain how their target language has such a low entropy when written in the Voynich script and would presumably in its usual script have a higher entropy.
Quote:In most cases, I would imagine they cannot explain, because their solution is actually not just a simple substitution system but a product of multiple, diverse, and fatally inconsistent decisions to add or remove letters in order to match Voynichese words with their target language, and none of this can be formalised and reproduced.
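The quote does not say which entropy measure is meant. As a sketch under that caveat, the following computes two common character-level measures that a solver could compare between a Voynichese transliteration and the same text in its proposed plaintext script. The function names are hypothetical, and choosing first-order and conditional (second-order) character entropy is an assumption, not something stated in the thread.

```python
import math
from collections import Counter

def _entropy(counts: Counter) -> float:
    """Shannon entropy in bits of the distribution given by raw counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def char_entropy(text: str) -> float:
    """First-order entropy: bits per character, ignoring context."""
    return _entropy(Counter(text))

def conditional_char_entropy(text: str) -> float:
    """Entropy of a character given the previous one: H(pair) - H(first of pair)."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return _entropy(pairs) - _entropy(firsts)
```

A proposed substitution could then be checked by running both functions on the source text in its usual orthography and on its claimed Voynichese rendering; a simple one-to-one substitution leaves both values unchanged.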
MarcoP > Yesterday, 05:40 AM
tavie Wrote:[Edit: Getting back on topic here, what I find interesting is how different /sh/ and /ch/ can behave while being so similar. Despite making similar words like shedy, chedy, sheey, cheey, etc, they definitely have their own "personality" in the manuscript. /sh/ in particular appears much more in the Top Row of paragraphs than you would expect from its performance in lower lines. Meanwhile /ch/ tends to go from being a common initial and a rarer word-middle to being a rarer initial and a common word-middle, forming words such as opchey.]
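The top-row observation above could be checked with a count along these lines. This is a minimal sketch: the function names are hypothetical, the input layout (a list of paragraphs, each a list of lines with space-separated words) is an assumption, and the sample data is a made-up toy built from the words mentioned in the post, not a real transcription.

```python
def prefix_rate(lines: list[str], prefix: str) -> float:
    """Fraction of words in the given lines that start with `prefix`."""
    words = [w for line in lines for w in line.split()]
    if not words:
        return 0.0
    return sum(w.startswith(prefix) for w in words) / len(words)

def top_row_vs_rest(paragraphs: list[list[str]], prefix: str) -> tuple[float, float]:
    """Compare the prefix rate on each paragraph's first line with all other lines."""
    top = [p[0] for p in paragraphs if p]
    rest = [line for p in paragraphs for line in p[1:]]
    return prefix_rate(top, prefix), prefix_rate(rest, prefix)

# Toy paragraphs (illustrative only).
toy = [["shedy chedy", "chedy opchey"], ["sheey cheey", "opchey chedy"]]
print(top_row_vs_rest(toy, "sh"))
```

Run on a real transliteration split into paragraphs, the two rates give a direct comparison of how often sh-initial words occur in top rows versus lower lines.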
Mauro > Yesterday, 01:37 PM
(29-06-2026, 11:29 PM)Jorge_Stolfi Wrote: I insist that this claim holds only for texts of the same type, namely texts that consist mostly of discursive or narrative prose -- like novels, chronicles, philosophical treatises, newspaper and magazine articles, even textbooks. Like all the books in your example.
Jorge_Stolfi > Yesterday, 04:54 PM
(Yesterday, 01:37 PM)Mauro Wrote: my example uses a very diverse panel of texts.
JoeyB > Yesterday, 05:18 PM
(Yesterday, 04:54 PM)Jorge_Stolfi Wrote: I still owe you an example of a text in Italian that would not be free prose like those in your list. Working on that...
All the best, --stolfi
Mauro > 9 hours ago
(Yesterday, 05:18 PM)JoeyB Wrote: Maybe Libro de arte coquinario would work since it's recipes and such... Library of Congress pdf images here and I imagine there are text or github repos of it in the wild...
JoeyB > 5 hours ago
(9 hours ago)Mauro Wrote: (Yesterday, 05:18 PM)JoeyB Wrote: Maybe Libro de arte coquinario would work since it's recipes and such... Library of Congress pdf images here and I imagine there are text or github repos of it in the wild...
Maybe yes, but I cannot process images, I need a textual transcription.
Jorge_Stolfi > 2 hours ago
(Yesterday, 04:54 PM)Jorge_Stolfi Wrote: I still owe you an example of a text in Italian that would not be free prose like those in your list.