The Voynich Ninja

Full Version: Word Probability Findings in the Voynich Manuscript
There is a new paper from 2020 about the VMS: [link]
The paper is available from the [link] website: [link]

The authors demonstrate that less probable words in the VMS "not only contain more sounds, they also contain sounds that convey more disambiguating information overall". In other words, they demonstrate that low-frequency word types also tend to have few similar word types.
(13-05-2020, 04:32 PM) Torsten wrote: [link] [link]

Thank you for bringing this conference to my attention, Torsten. Just reading a description of this event and the group that organized it gives an amateur language and linguistics geek some serious kid-in-a-candy-store vibes! Do you happen to know any websites where I might find announcements about similar-themed conferences in the future (anywhere in the world), once this pandemic eases up?
If I understand your description of the article I can't say that those findings surprise me.
The question that interests me is to what extent these words lie on a continuum.
(13-05-2020, 06:53 PM) Mark Knowles wrote: If I understand your description of the article I can't say that those findings surprise me.


Indeed, it's low-hanging fruit. Moreover, Montemurro & Zanette already provided a more meaningful result back in 2013: "the words that are more strongly connected have an evident morphological similarity" ([link]).
(13-05-2020, 07:43 PM) Mark Knowles wrote: The question that interests me is to what extent these words lie on a continuum.

I have researched this question. There is a fundamental connection between token frequency and the number of similar word types. A word type without similar words is always unique, and for high-frequency types there are always numerous similar word types. I tried to illustrate this relation in my [link] by mapping the frequency of a word to its font size.
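For anyone who wants to check this relation on their own transliteration file, here is a minimal sketch. It assumes that "similar" means an edit distance of exactly one (a single glyph substituted, inserted, or deleted), which is only one possible definition and not necessarily the one used in the work linked above. The function names and the toy token list are made up for illustration; the counts in the toy list are not real VMS frequencies.

```python
from collections import Counter


def within_one_edit(a, b):
    """Return True if a and b differ by exactly one substitution, insertion, or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make sure a is the shorter (or equal-length) string
    i = j = edits = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1
            j += 1
            continue
        edits += 1
        if edits > 1:
            return False
        if len(a) == len(b):
            i += 1  # substitution: advance in both strings
        j += 1      # insertion into b: advance only in the longer string
    edits += len(b) - j  # an unmatched trailing character counts as one edit
    return edits == 1


def similarity_profile(tokens):
    """Map each word type to (token frequency, number of word types one edit away)."""
    freq = Counter(tokens)
    types = list(freq)
    neighbours = {t: 0 for t in types}
    for idx, a in enumerate(types):
        for b in types[idx + 1:]:
            if within_one_edit(a, b):
                neighbours[a] += 1
                neighbours[b] += 1
    return {t: (freq[t], neighbours[t]) for t in types}


# Toy input only: EVA-style strings with invented counts.
toy_tokens = "daiin daiin daiin dain aiin qokeedy".split()
profile = similarity_profile(toy_tokens)
for word, (count, similar) in sorted(profile.items(), key=lambda kv: -kv[1][0]):
    print(word, count, similar)
```

The pairwise loop is quadratic in the number of word types, which is fine for a quick check but slow on a full transliteration. Plotting frequency against neighbour count for every type would show whether the relation holds as a continuum or only at the extremes.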
"In this paper, we showed more support for the claim that the VM is written in a natural language and therefore is not a hoax. Although several scholars have found statistical evidence pointing in the same direction, more evidence is needed, particularly to establish whether there is a known language family to which VM can plausibly be assigned."  ---Colin Layfield, Lonneke van der Plas, Michael Rosner, John Abela

Most of the language-related tests of Voynichese that depend on EVA transliterations include a number of fundamental assumptions that can skew the results.

For example:

EVA converts the c-shapes to e-shapes for mnemonic purposes. Mnemonic text systems almost always depend on converting the VMS glyphs to something that is closer to natural language so it can be more easily remembered. To put it simply: EVA CHANGES Voynichese shapes to be closer to natural language. Many computational attacks assume the vowel-like shapes are vowels.

Unfortunately, one researcher after another uses this mnemonic system (which increases the vowel shapes in the VMS) to test vowel-consonant balance, or to test other aspects of natural language that include vowel-consonant assumptions based on EVA.
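To make the dependence concrete, here is a small sketch of how a vowel-share statistic moves when one transliteration choice changes. Everything in it is an assumption for illustration: the set of "vowel-like" symbols, the neutral placeholder symbol "C" for the c-shape, and the two sample tokens. It does not reproduce the method of the paper under discussion.

```python
def vowel_like_share(text, vowel_like):
    """Fraction of non-space symbols in text that belong to the vowel_like set."""
    glyphs = [ch for ch in text if not ch.isspace()]
    if not glyphs:
        return 0.0
    return sum(ch in vowel_like for ch in glyphs) / len(glyphs)


# Hypothetical choice of which EVA letters a test might treat as vowels.
ASSUMED_VOWEL_LIKE = {"a", "e", "i", "o"}

# Toy EVA-style sample, not a real corpus.
eva_sample = "chedy okeey"

# Same sample with the c-shape written as a neutral symbol instead of "e".
neutral_sample = eva_sample.replace("e", "C")

print(vowel_like_share(eva_sample, ASSUMED_VOWEL_LIKE))
print(vowel_like_share(neutral_sample, ASSUMED_VOWEL_LIKE))
```

The underlying glyph sequence is identical in both samples; only the label attached to one shape differs, yet the measured share changes. Any vowel-consonant test run on a transliteration inherits this kind of labelling decision.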


Another example:

There is no hard evidence yet that VMS tokens (the glyph groups separated by spaces) are words. Many computational attacks make this assumption, as apparently does this paper.



I saw nothing in this study's methodology that explained why they chose the EVA transliteration, and there was no discussion regarding its limitations for computational attacks or why they accepted its built-in assumptions.

They did acknowledge that they used modern languages for comparison.