Which plaintext languages to try? - Printable Version

+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Analysis of the text (https://www.voynich.ninja/forum-41.html)
+--- Thread: Which plaintext languages to try? (/thread-4468.html)

RE: Which plaintext languages to try? - ReneZ - 25-01-2025

(25-01-2025, 09:27 AM)oshfdk Wrote: I don't know whether many manuscripts were created in Czech in the early XV century.

Yes there were. However, this level of language distinction (Czech vs. German vs. Latin) is not relevant. They all behave in similar ways (20+ alphabetic symbols, vowels and consonants), which does not match Voynichese at all. Whatever transform can convert Voynichese to any of these languages will work equally well for all of them.

I have used the following comparison before, but probably not many people have seen it. Converting Voynichese to a known language is like travelling from Australia (Voynichese) to a city in the UK. Your trip does not depend a lot on whether you want to go to Wigan (Czech), to Watford (German) or to Hartlepool (Latin). You first need to figure out how to travel from Australia to London. The rest is details.

More importantly even, I would strongly advise including all Arabic languages and Hebrew in one's consideration. These have barely been taken into account until now, so why ignore such relatively unexplored areas?

RE: Which plaintext languages to try? - oshfdk - 25-01-2025

Thank you, Marco and Rene, for the feedback. I still feel it's about time I started thinking about plaintext languages. Let me try to explain.

I know that the probability of myself finding the solution to the Voynich cipher (if it is a cipher) is extremely small, since the road is well trodden and I'm not a codebreaker or a historian. However, the probability is not exactly zero, as long as I'm trying.

I've been doing some hobby research of the Voynich Manuscript for a little over 3 years now, in my free time, just for fun. Last year I started the Voynich marathon to speed up my investigation, mainly because I thought it would be a shame if the Voynich Manuscript is first decoded by an AI.
There is no deep logic in this, this is just the way I feel about it, and this is why I don't use AI (any black box solution, the workings of which I don't fully understand) in my attempts at cracking the manuscript.

I strongly believe that AIs will, with very high probability, surpass the average human on most mental tasks in the very near future, and it is likely they will surpass the best of humankind on almost any task not very much longer afterwards. However, there is a very obvious problem with assessing the actual capabilities of modern AIs: you never know what they have seen during training. Modern AIs can easily process texts in hundreds of languages, and their training database consists of billions of pages of text collected from all over the Internet. The AIs are also not bad at bluntly transferring ideas between different domains (say, applying chemistry to cooking; AIs are not consistent, but sometimes feel as if they were creative).

Suppose an AI comes up with an interesting novel result in some branch of maths. You can never be sure whether its discovery is original or a retelling of some post on a related topic from an obscure math forum of a local university in some province of China. It's very hard to find a task for which we can be sure that the AI has never seen it being solved by a human before. AIs cannot reliably produce references for the data they used, even when they pretend they do. It is simply technically impossible for a modern mainstream AI to identify the sources of its reasoning, given the way the training is performed.

And then it occurred to me that the Voynich Manuscript itself is a very good task to test AIs on, provided there is a known solution and parts of the solution are not in the training data. The task is certainly not simple, and the manuscript itself is freely available as the full set of images, so it's a self-contained hard task.
Also, the Voynich manuscript looks like a very well isolated task: basically you either solve the Voynich manuscript or do something else entirely, so it's unlikely there are ideas that can be copy-pasted from some other field.

So I thought about maybe turning my irrational dislike of the idea of AI solving the manuscript into something potentially useful. It was about the same time (December 2023) that, while preparing my fourth article in the Voynich marathon, I stumbled upon an idea that I hadn't seen before, that seemed to explain properties of the manuscript quite nicely, and that looked like something that could lead to a solution. I wanted to share it, but then I thought maybe it would indeed be useful if I tracked this idea to the very end without leaving any breadcrumbs on the internet. In the unlikely case that this turns out to be the solution, not only will we have it, but we would also have a very clear cutoff date for when the specific information about the solution appeared in the training data. This looks like a perfect benchmark for AIs. I'm not sure if someone would actually spend resources on this, but in case the manuscript is actually solved, there would be some free publicity and the idea could get some traction. And in case it's not solved, the way in which it wasn't solved is not that important anyway.

So, I stopped publishing Voynich marathon articles and in my free time continued to pursue the solution. Considering your comparison with the Australia to UK journey, I believe that I currently could be somewhere near the coast of Portugal, and it's already appropriate to think about whether to turn into the English Channel or go up and sail around Ireland. That is, if I'm right. If I'm wrong, then I very well could be somewhere near San Diego, and I have no hard feelings; it has been a pleasant journey anyway.

Just to make it clear, I'm still not sure whether I'm on the right path. It all looks very plausible, but I have no definite answer to any questions yet, including any specific words, images or marginalia of the manuscript.

RE: Which plaintext languages to try? - Addsamuels - 26-01-2025

(25-01-2025, 01:13 PM)oshfdk Wrote: Thank you, Marco and Rene, for the feedback. I still feel it's about time I started thinking about plaintext languages.

It is also certain that if the language was a plaintext language, then it would be an archaic or obsolete form of this language, or potentially a dialect. The languages I chose were based on a chart from Dr L F Davis and the provenance of a likely European language. They also form multiple language families (although they are all Indo-European, and the manuscript likely is too, if and only if it has a plain text solution).

RE: Which plaintext languages to try? - Linda - 26-01-2025

Might also want to consider this as well: Sabir, a fifteenth century proto-pidgin. It would be the language of trade.

RE: Which plaintext languages to try? - oshfdk - 26-01-2025

(26-01-2025, 02:49 PM)Addsamuels Wrote: It is also certain that if the language was a plaintext language, then it would be an archaic or obsolete form of this language, or potentially a dialect. The languages I chose were based on a chart from Dr L F Davis and the provenance of a likely European language. They also form multiple language families (although they are all Indo-European, and the manuscript likely is too, if and only if it has a plain text solution).

I would like to limit the number of languages I initially consider to the absolute minimum.
If there is no good result for Latin/German(ish)/French(ey) languages, maybe I'd consider Czech and other languages, but I see no particular reason to include them for now. One of my assumptions is that the weirdness/uniqueness of the manuscript (the script, drawings) is just a feature of this particular work, and has nothing to do with how exotic the underlying language or culture is.

So, maybe I should have asked my question about the plaintext language a bit differently. Suppose you are teleported to an unknown location in XV century Europe; this is all they told you before beaming you out. You turn up on a forest road deep in the woods, it's nighttime, there are no landmarks, and you can't guess which region you are in; it could be northern summer or southern winter. You see a small dark shape on the side of the road, which turns out to be a bag. Obviously some traveller lost it, and inside you find what feels like a vellum codex. You take it out, but it's too dark to see the actual writing. Now, you have a minute to guess the language of this codex. If you guess it right, they'll teleport you back; if you guess it wrong, you'd have to write a book no one would be able to decode for centuries (just kidding). Assume this is a fair test of your cognitive skills and not a trick. Of course, the language could be Hungarian or Middle Dutch or even Arabic, but to improve one's chances of getting back to our time, I'd guess one would go with Latin?

So, getting back to the Voynich MS: there is a strong case for Latin as the default manuscript language of that time in Europe (as far as I understand); there is a strong case for Germanic languages, because there is something that looks like traces of German in some of the marginalia; and there is a case for Romance languages, because of the month names. Even if the marginalia are not by the author, it's a bit more likely that they were added in the same general region where the manuscript was created.
As far as I know, there is no specific reason to include Czech, Arabic, Hungarian, etc., other than to cover all the bases or to explain why the manuscript hasn't been decoded yet. I'm not yet in a situation where I need to cover all bases, and I assume that it hasn't been decoded yet because of the way it was encoded and not because of an exotic plaintext. So, I'd rather carefully examine languages one by one, starting with the most likely and progressing towards the less likely. Unless there is anything specific in the manuscript that could link it to Czech, Hungarian, Hebrew, Basque, etc., I'd keep these languages later in the queue.

RE: Which plaintext languages to try? - oshfdk - 26-01-2025

(26-01-2025, 04:19 PM)Linda Wrote: Might also want to consider this as well: Sabir, a fifteenth century proto-pidgin. It would be the language of trade.

Thank you for the suggestion! I'm not sure pidgins are good candidates for large manuscripts though. As far as I understand, they would normally be used for limited communication to simplify trade. Quoting the Wikipedia article: "Because it is a pidgin Mediterranean Lingua Franca had a very small vocabulary, this and the fact the language is not well attested means we only have knowledge of a few hundred words in the language."

From my point of view, Voynichese does not look like plaintext for any natural language: there are too many short repetitions and too few long repetitions or patterns. To me this looks like a one-to-many cipher (not necessarily verbose). But if it is not a natural language plaintext but a cipher, then why assume an exotic or rare plaintext language, unless there are some specific reasons for this?
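
The repetition argument in the last post can be illustrated with a toy experiment. What follows is only a minimal sketch under stated assumptions, not oshfdk's method and not an analysis of the manuscript: the function names `repeated_ngrams` and `one_to_many_encode`, the sample sentence and the substitute tokens are all made up for illustration. It counts which word sequences of a given length occur more than once, before and after a one-to-many substitution in which each plaintext word may be written in several ways.

```python
import random
from collections import Counter


def repeated_ngrams(tokens, n):
    """Return a dict of the n-grams (as tuples) that occur more than once."""
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {gram: count for gram, count in counts.items() if count > 1}


def one_to_many_encode(tokens, substitutes, rng):
    """Replace each token with one of its substitutes, chosen at random."""
    return [rng.choice(substitutes[token]) for token in tokens]


if __name__ == "__main__":
    # Hypothetical plaintext and substitute table, invented for this sketch.
    plain = "the cat saw the cat and the cat saw the dog".split()
    substitutes = {
        "the": ["qa", "qe", "qo"],
        "cat": ["da", "de"],
        "saw": ["ka", "ke"],
        "and": ["sa"],
        "dog": ["ta", "te"],
    }
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    cipher = one_to_many_encode(plain, substitutes, rng)

    # Compare how many distinct sequences repeat at each length.
    for n in (1, 2, 3):
        print(n, len(repeated_ngrams(plain, n)), len(repeated_ngrams(cipher, n)))
```

A repeated sequence survives such an encoding only if every word in it happens to receive the same substitute both times, so longer repeats are the first to disappear. Whether Voynichese actually shows this profile would have to be checked against a real transliteration and comparable plaintexts, which this sketch does not attempt.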