(13-04-2020, 08:12 PM)-JKP- Wrote: Elie, you seem to have taken that very personally (and ignored the comment I made about the content of your paper), but I did not post it to disparage you. You assumed that. It's better to look at the big picture and not interpret everything on the forum as a personal jab.
I posted the link because Nick Pelling recently blogged about preprint servers and I don't think the one you used was on the list, so I provided a link so that people on the forum would be aware that there are more than what he mentioned and that it was the one you chose.
I do not judge the content of a paper by where it is posted. I judge it by what is written.
Yeah, I was a little on the defensive, sorry. I wrote my answer before reading your second comment on my paper.
To get back to the core subject, my paper sadly doesn't prove anything.
One way to learn more about the origins of the VMS would be to understand how someone could come up with these characters. Characters usually don't come from nothing, except in made-up languages of course. In real languages, they usually derive from another language. From Ancient Greek to Latin, we got α -> a, β -> B, etc. The idea of my paper was that, if a language was close enough to the VMS to have such an association, then I could detect it: if we had words like "βα" in the original language, I had an algorithm that could associate B back to β and a back to α. We had AncientGreek => Latin, so we should have X => Voynichese. This algorithm would have given X. The idea is to find X. Even if we only found X1 with X => X1, or X2 with X => X1 => X2, knowing that X2 is related to Voynichese in one way or another would be an enormous success. My algorithm did find a lot of X; for example, we have X1 => XFrench and X1 => XSpanish, so it found that Spanish and French were related based on character association. Sadly, even though Greek and Latin are related, my algorithm didn't find this association. But it doesn't really matter: for each language, I only need to find at least one close language.
Theoretically, all languages are linked to other languages (we don't often create a language from nothing and then destroy it like it never existed, without it evolving). It doesn't necessarily mean we have one tree, but usually we don't have one language unlinked to all the others. This is what I studied in my paper. And indeed the algorithm worked for a lot of languages (Latin-based, Germanic, and even Asian languages). But the algorithm mostly failed for the outliers of my statistical analysis (except Old French and Scottish Gaelic). Why so?
Outliers are human languages, but they have specificities. There are old languages (Old French), languages from all over the world (Brazil, the Philippines, England, etc.), mostly from tribes, but most importantly, all close languages, all outliers, seem to be, at some point, the retranscription of someone talking. Which would explain a lot for the VMS.
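To make the character-association idea (α -> a, β -> B) concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it only assumes that characters can be paired by frequency rank, and then measures how many unique source words become known words of the target language. The function names `rank_mapping` and `coverage` and the toy word lists are hypothetical, invented for illustration.

```python
from collections import Counter


def rank_mapping(source_words, target_words):
    """Pair source and target characters that share the same frequency rank."""
    src = [c for c, _ in Counter("".join(source_words)).most_common()]
    tgt = [c for c, _ in Counter("".join(target_words)).most_common()]
    return dict(zip(src, tgt))


def coverage(source_words, target_words, mapping):
    """Fraction of unique source words that become known target words."""
    vocab = set(target_words)
    unique = set(source_words)
    if not unique:
        return 0.0
    hits = sum(
        1 for w in unique
        # Characters without a mapping become "?" so the word cannot match.
        if "".join(mapping.get(c, "?") for c in w) in vocab
    )
    return hits / len(unique)


# Toy data, invented for illustration only (not from any real corpus).
source = ["αββα", "αβ", "α"]
target = ["abba", "ab", "a"]
mapping = rank_mapping(source, target)
print(mapping, coverage(source, target, mapping))
```

A high coverage between two word lists would only suggest that a character association exists. As said above, the mapped words can still be garbage, so someone who knows the target language would have to check them.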
Now, there are two possibilities, I think. Either Voynichese was forged by someone to transcribe an unknown language he/she encountered, or Voynichese was a real language that derived from others. In both cases I still think the VMS is transcribed from someone talking. But I tried, and I didn't find a way to confirm the second hypothesis. So here are the leads, in my opinion, for the two possibilities:
(1) VMS was forged to transcribe an unknown language. The main idea is "Hey, this guy speaks a language we don't know, we must invent new characters for his new language". In this situation, we can't trust the characters. They refer to nothing we know; they were invented for the occasion. In this case, it will require a tremendous amount of work to translate the VMS. If I were in this situation, I would invent letters and say "these N letters, put together, are for this syllable". We need to reconstruct this association. So the idea would be to identify clusters of letters, and to associate groups of letters from one language with groups of letters from another language, the same way I did with characters, but this time for hypothetical syllables. This would probably be more robust for an analysis than a computerized transcription like V101 or EVA: differences like "m" in V101 being 'iin' in EVA would be absorbed into clusters in this model, so individual errors on segmented letters wouldn't matter.
(2) Voynichese was a real language that derived from others. In this case, I think our best chance is to look at the characters. I already tried a computer algorithm with more than 80 languages to find a character association, so the quantitative analysis is probably a dead end. We need a qualitative analysis. We need people looking at every set of characters that has existed in the world to find which could be the closest to Voynichese. People even looked at Komi Zyrian. This kind of lead seems promising to me. The other solution would be to have one expert in each language do what I did in my paper between VMS and OldFrench. I managed to translate more than 15% of unique VMS words, and it only gave garbage, so the only way to know whether the algorithm finds an association between the VMS and another language is probably to have an expert (a native speaker of the language) look at it. And all of that was done with V101; it would need to be done with EVA too. It's highly time-consuming, and I would advise against doing it without further qualitative analysis on the origins of the VMS.
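The cluster idea from (1) can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not something from the paper: it counts every group of 2 to `max_len` consecutive letters inside words, so that frequent groups can be treated as candidate syllables. The name `cluster_counts` and the three EVA-style sample words are hypothetical choices for the example.

```python
from collections import Counter


def cluster_counts(words, max_len=3):
    """Count every run of 2..max_len consecutive letters inside each word."""
    counts = Counter()
    for word in words:
        for n in range(2, max_len + 1):
            for i in range(len(word) - n + 1):
                counts[word[i:i + n]] += 1
    return counts


# Illustrative EVA-style words; a real test would use a full transcription.
sample = ["daiin", "aiin", "okaiin"]
counts = cluster_counts(sample)
for cluster, n in counts.most_common(5):
    print(cluster, n)
```

The same counts could be computed for a candidate language, and the association step would then work on frequent clusters instead of single characters. How to match clusters across two languages is the hard part and is left open here.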
===> What's important to get is that, of all the outliers I got, only a few could be understood with the second method. Take Mbya Guarani for example: it is based on oral transmission. You have X => MbyaGuarani, but you have almost no information on X, so you can't base an analysis on it. Now suppose we had X (oral) => Voynichese (written) and X (oral) => X1 (oral) => X2 (oral) => (...) => X15 (written). It could mean we would not be able to associate X15 and Voynichese at all, whatever method we use. In this case the VMS would probably remain a mystery forever; this is truly the worst scenario. Voynichese was written down at some point to give us the VMS, but if the sounds someone made, and someone else wrote down, don't represent anything at all for us today, we can't translate the VMS (even if it meant something at some point). The VMS would be a dead end, an unsolvable mystery. Almost, at least: our remaining chances would be artificial general intelligence and a method to go back from X15 to X. All languages have things in common, but for Voynichese, as for some outliers, we can't hope to base our algorithms on that to understand it. The first method could in fact work for all hypotheses; it's just that it's probably extremely hard to implement and validate, and to have enough computational power to test efficiently.
To summarize everything, I would just say: how would you attack Mbya Guarani or Tagalog if you only had a romanized text of these languages at your disposal? Could you translate it? I think this is the real question to ask ourselves. Take such a text and translate it without any other information. One of my ideas would be to try finding key words like names or cities, no matter how they are represented. In a retranscription of Mbya Guarani, you have for example "Ko'apy roju rire ma , ou kuri cheñora , Patricia Madre Tierra oje'ea .", which is an enormous amount of information. Now imagine this with other characters, not knowing the real association between visual symbols and characters (V101 & EVA). This would also be a great way to find a solution. But even with this, you would only know how to map unknown characters to known characters; it doesn't mean you would be able to understand Mbya Guarani. From this, though, you could at least understand the sounds made when speaking Voynichese, and go from there to the original language.