(31-01-2018, 11:19 AM)Koen Gh. Wrote: I still don't understand why some people are mildly positive about this paper. I'm not a coding expert, but I do know that abjad anagramming is a one-way cipher. And that plonking your results into google translate is an absolutely embarrassing thing to write about in a scientific paper. Am I missing something?
I have been developing software since the early 1980s, have worked on various things including Good Old Fashioned AI (i.e. small data, programming in Lisp and Prolog), speech processing, search engines, and data mining including collaborative filtering. I have also carried out a [link], reception of which has been mixed: almost entirely negative here (which has discouraged me from pursuing my ideas further), but more positively from a few others including academics. We all have our own, personal, Overton windows, and limited time and knowledge. I'm aware of this and I act accordingly. Sometimes a theory is irredeemable, but often all that's needed are some minor alterations.
There's a good paper hidden inside the published paper if you read it carefully. They have shown that meaningful text in a known (i.e. in the corpus of candidate languages) but unidentified language, which has been encrypted using any combination of three steps: vowel removal, anagramming, and simple substitution encipherment, can be recovered using their method. This is an interesting and perhaps surprising result.
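To make the three steps concrete, here is a rough Python sketch of the encipherment side only. It is not the paper's code: the function names are made up for illustration, I'm assuming a plain a-z alphabet with "y" counted as a consonant, and I'm modelling anagramming as sorting the letters of each word, which is just one convenient way to scramble them.

```python
import random
import string

ALPHABET = string.ascii_lowercase
VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant


def remove_vowels(word):
    """Step 1: drop the vowels, as in an abjad."""
    return "".join(ch for ch in word if ch not in VOWELS)


def anagram(word):
    """Step 2: scramble letter order (here: sort the letters alphabetically)."""
    return "".join(sorted(word))


def make_substitution_key(seed):
    """Step 3 key: a random one-to-one mapping of the alphabet onto itself."""
    rng = random.Random(seed)
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))


def encipher(text, key):
    """Apply all three steps word by word; non-letters are discarded."""
    out = []
    for word in text.lower().split():
        letters = "".join(ch for ch in word if ch in ALPHABET)
        scrambled = anagram(remove_vowels(letters))
        if scrambled:  # skip words that were all vowels, e.g. "a"
            out.append("".join(key[ch] for ch in scrambled))
    return " ".join(out)


identity_key = {ch: ch for ch in ALPHABET}
print(encipher("the cat act", identity_key))
```

With the identity key this prints "ht ct ct": "cat" and "act" (and "cut", "coat", ...) all collapse to the same token, which is why Koen calls it a one-way cipher. The surprising part of the paper is that the language can still be identified from what is left.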
The problems start when they apply it to the Voynich Manuscript. If the Voynich Manuscript is meaningless, their method will still find a spurious closest match. Even if it's meaningful and encrypted using only those three steps, using a modern language dataset (the [link]) will also result in a spurious match. What they should have done is use a dataset of substantial texts in languages known in the Old World in the early 15th century, and in addition to the Voynich Manuscript (but not all of it -- see later), apply their method to texts known to be meaningless (e.g. [link]) to determine a baseline for acceptance. A match between the Voynich Manuscript and a candidate language can only be taken seriously if it scores higher than any of the matches between the bogus texts and their candidate languages.
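The acceptance rule I have in mind could be sketched like this. Everything here is hypothetical: the function names are mine, the scores are invented numbers purely to show the logic, and I'm assuming whatever scoring the method uses has been arranged so that a higher score means a closer match.

```python
def best_match(scores):
    """Return (language, score) for the highest-scoring candidate language."""
    language = max(scores, key=scores.get)
    return language, scores[language]


def acceptance_baseline(bogus_results):
    """Best score that any known-meaningless control text achieved."""
    return max(best_match(scores)[1] for scores in bogus_results.values())


def is_credible(target_scores, bogus_results):
    """A match counts only if it beats every match found for bogus texts."""
    language, score = best_match(target_scores)
    return language, score > acceptance_baseline(bogus_results)


# Invented numbers, for illustration only.
bogus_results = {
    "control_text_1": {"lang_a": 0.41, "lang_b": 0.38},
    "control_text_2": {"lang_a": 0.35, "lang_b": 0.44},
}
manuscript_scores = {"lang_a": 0.43, "lang_b": 0.40}

print(is_credible(manuscript_scores, bogus_results))
```

In this made-up case the manuscript's best match (lang_a at 0.43) does not beat the 0.44 that a meaningless control text managed, so the match would be rejected as indistinguishable from noise.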
If they did all that, and identified a plausible language, they should then get someone (a real person who knows the language) to attempt to translate several pages randomly selected from the parts of the manuscript they didn't use to identify the language. If that person can't produce a meaningful translation, they should conclude that the Voynich Manuscript is either meaningless text, or is not in one of the candidate languages, or is encrypted using a different method. That would itself be a very useful result.
I have nothing positive to say about the reporting of their paper in the popular press.