The Voynich Ninja
[Article] Analysis of the relation between words within the Voynich Manuscript



Analysis of the relation between words within the Voynich Manuscript - Torsten - 08-11-2022

A new paper about the VMS has been published.

The author is Andrew Caruana from the University of Malta. (Note: Andrew Caruana will present a paper under the same name at the Conference @ Malta)

The paper compares word pairs and their frequencies in natural languages and the VMS. The paper comes to the conclusion: 
Code:
the results for the Voynich were a little more even with the randomised version only scoring roughly half as much as the normal variant. This may suggest that the manuscript is not randomly generated text, however it could point to the Voynich being some sort of code or cipher.



RE: Analysis of the relation between words within the Voynich Manuscript - RenegadeHealer - 08-11-2022

Good find.

I wish the experimenter had used an actual randomly generated pseudotext specimen (or two or more, each generated by a different algorithm) for comparison. That might have done a lot to put the clearly anomalous VMS results of this experiment’s main statistical test in context. I think it’s likely that many random pseudotext generation algorithms produce output where the ratio of “token A token B” to “token B token A” is consistently >1, or even >>1. This could be an unintentional by-product of an algorithm’s design and execution rather than a reliable indicator of meaningfulness.

As a matter of fact, I suspect the results of this experiment by Caruana et al. actually argue against the VMS’s text being meaningful. A roughly two-to-one AB:BA ratio for tokens clearly deviates in a statistically significant fashion from the same ratio in any text specimen examined whose meaningfulness is not in dispute. On the other hand, I fear that a two-to-one AB:BA token ratio may not be a statistically significant deviation from that of stochastically generated pseudotext specimens of comparable length.
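To make the comparison I have in mind concrete, here is a minimal Python sketch of one way an AB:BA measure and a baseline could be computed. It is not the method from the paper: the function names (`order_asymmetry`, `shuffled_baseline`) and the scoring rule (share of pair occurrences that fall on the more frequent ordering) are my own assumptions, and the baseline here is only a word shuffle, which is a stand-in for the properly generated pseudotext I would prefer to see.

```python
import random
from collections import Counter


def pair_counts(tokens):
    """Count ordered adjacent pairs (A, B), skipping pairs where A == B."""
    return Counter((a, b) for a, b in zip(tokens, tokens[1:]) if a != b)


def order_asymmetry(tokens):
    """Share of adjacent-pair occurrences that fall on the more frequent ordering.

    0.5 means AB and BA are equally common for every pair;
    1.0 means every pair occurs in one order only.
    (Hypothetical measure, not the statistic used in the paper.)
    """
    counts = pair_counts(tokens)
    seen = set()
    dominant = 0
    total = 0
    for (a, b), n_ab in counts.items():
        key = frozenset((a, b))
        if key in seen:
            continue  # the unordered pair {A, B} was already handled
        seen.add(key)
        n_ba = counts.get((b, a), 0)
        dominant += max(n_ab, n_ba)
        total += n_ab + n_ba
    return dominant / total if total else 0.0


def shuffled_baseline(tokens, trials=20, seed=0):
    """Mean asymmetry over several shuffles of the same tokens.

    Shuffling keeps word frequencies but destroys word order; a real
    pseudotext generator could be substituted here.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        shuffled = list(tokens)
        rng.shuffle(shuffled)
        scores.append(order_asymmetry(shuffled))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the cat".split()
    print("text:    ", order_asymmetry(sample))
    print("shuffled:", shuffled_baseline(sample))
```

Note that in a short text most pairs occur only once, so even a shuffled text scores close to 1.0 on this measure. That is exactly why the score for the text only means something next to the score of a comparable random specimen.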

I’m quick to blame wishful thinking for Caruana’s methodological errors and less-than-tenable conclusions. I share his desire to see hard evidence that the VMS is almost certainly meaningful. But his study doesn’t necessarily help that cause.

P.S. It would interest me very much to see Caruana’s statistical measure applied to a specimen of meaningful text written in Toki Pona, or another oligosynthetic conlang with phonology, morphology, and syntax / word order all kept as simple as practically possible, because I’m much taken, of late, with Hermes777’s theory that this is exactly what the VMS’s text is.


RE: Analysis of the relation between words within the Voynich Manuscript - nablator - 09-11-2022

(08-11-2022, 08:17 PM)RenegadeHealer Wrote: A roughly two-to-one AB:BA ratio for tokens clearly deviates in a statistically significant fashion from the same ratio in any text specimen examined whose meaningfulness is not in dispute. On the other hand, I fear that a two-to-one AB:BA token ratio may not be a statistically significant deviation from that of stochastically generated pseudotext specimens of comparable length.

The results probably depend on the choice of thresholds for selecting pairs "above a certain frequency threshold" and deciding when the inverse pair "occurs much more frequently".
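As a rough illustration of that dependence, here is a small Python sketch that counts "lopsided" pairs under two adjustable thresholds. The function `lopsided_pairs` and its parameters `min_count` and `min_ratio` are hypothetical names of mine; the paper's actual thresholds and selection rule may well differ.

```python
from collections import Counter


def lopsided_pairs(tokens, min_count, min_ratio):
    """Return adjacent pairs (A, B) that are frequent and dominate their inverse.

    A pair qualifies when count(AB) >= min_count and
    count(AB) >= min_ratio * count(BA). A missing inverse is treated
    as a count of 1 so the ratio test stays defined.
    Both thresholds are assumptions made for illustration only.
    """
    counts = Counter((a, b) for a, b in zip(tokens, tokens[1:]) if a != b)
    hits = []
    for (a, b), n_ab in counts.items():
        n_ba = counts.get((b, a), 0)
        if n_ab >= min_count and n_ab >= min_ratio * max(n_ba, 1):
            hits.append((a, b, n_ab, n_ba))
    return sorted(hits)


if __name__ == "__main__":
    sample = "a b a b a b c a b".split()
    # Sweep the two thresholds to see how the number of selected pairs moves.
    for min_count in (1, 2, 3):
        for min_ratio in (1, 2, 3):
            n = len(lopsided_pairs(sample, min_count, min_ratio))
            print(f"min_count={min_count} min_ratio={min_ratio} -> {n} pairs")
```

On a toy input the number of selected pairs already changes with each threshold, so any headline figure should be reported together with the thresholds that produced it.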

Marke Fincher's Word Pair Permutation Analysis, on the other hand, doesn't require any threshold.