(12-04-2021, 04:36 PM)Koen G Wrote: Marco: I even forgot to mention the grammaticality of the output. But I wonder, as a challenge of sorts, whether it would be possible to produce a long word salad without using an interpretative step.
I fear that even this might be very difficult.
Since the VMS only contains about 8,000 different word types, while languages typically have at least 100,000, defining a function that maps from the VMS to the dictionary of any language should be possible. Bowern and Lindemann (The Linguistics of the Voynich Manuscript) came to this conclusion:
Bowern and Lindemann Wrote: the script is not structure-preserving in that the graphemes are not one to one, but they do encode words in a regular orthography
In their opinion, the mapping cannot be based on the structure shown by Voynichese glyphs. A drastic example of a system that does not preserve such word-structure is Rene's "mod2" nomenclator. A less drastic case is the "anagrammed Hebrew abjad" proposed by Hauer and Kondrak.
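To make the point about mapping word types concrete, here is a minimal sketch of a code-book style mapping in Python. It is not Rene's "mod2" system nor Hauer and Kondrak's method; the function name `build_codebook`, the sample words and the tiny stand-in dictionary are all hypothetical and only illustrate that a one-to-one assignment exists whenever the target dictionary has at least as many words as the source has word types.

```python
import random

def build_codebook(source_types, target_dictionary, seed=0):
    """Assign each source word type a distinct target word, ignoring spelling."""
    source_types = sorted(set(source_types))
    target_words = sorted(set(target_dictionary))
    if len(source_types) > len(target_words):
        raise ValueError("target dictionary is too small for a one-to-one mapping")
    rng = random.Random(seed)
    # Sampling without replacement guarantees that no target word is used twice.
    chosen = rng.sample(target_words, len(source_types))
    return dict(zip(source_types, chosen))

# Toy data (illustrative only): a few transcribed word types and a tiny dictionary.
voynich_types = ["daiin", "chedy", "qokeedy", "ol", "shedy"]
dictionary = ["and", "the", "of", "water", "root", "leaf", "moon", "star"]

codebook = build_codebook(voynich_types, dictionary)
decoded = [codebook[w] for w in ["daiin", "ol", "daiin", "chedy"]]
```

Because the assignment is arbitrary, similar-looking source words such as "chedy" and "shedy" need not receive similar-looking target words. That is the sense in which such a mapping does not preserve word structure. Nothing in it guarantees that the decoded sequence is grammatical.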
I am not sure that a word-structure-preserving mapping is impossible for something like Chinese or Vietnamese: I don't think that Bowern and Lindemann deeply explored these options.
Anyway, it is important to acknowledge that ancient manuscripts did not encode word salads, but languages with well-defined properties.
For instance, one can consider an English manuscript (a ca. 1410 copy of The Canterbury Tales). An excellent transcription is available.
One can compare the 20 most frequent words in the manuscript with the 20 most frequent words in modern English. 15 of the 20 most frequent manuscript words appear in the top 20 modern words, either identically (green) or with minimal variations (upper-case initial, or þ for 'th').
[attachment=5448]
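The comparison of top-20 lists can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the attachment: the helper names are hypothetical, tokens are simply split on whitespace, and the two "minimal variations" are removed before counting by lower-casing and replacing þ with 'th'.

```python
from collections import Counter

def normalise(word):
    """Remove the two minimal variations: upper-case letters and thorn for 'th'."""
    return word.lower().replace("þ", "th")

def top_words(tokens, k=20):
    """Return the k most frequent tokens, most frequent first."""
    return [word for word, _ in Counter(tokens).most_common(k)]

def shared_top_words(manuscript_tokens, modern_tokens, k=20):
    """List the manuscript's top-k words that also occur in the modern top-k."""
    ms_top = top_words([normalise(t) for t in manuscript_tokens], k)
    modern_top = set(top_words([normalise(t) for t in modern_tokens], k))
    return [word for word in ms_top if word in modern_top]

# Toy example with k=2 (made-up token lists, far too short for real use).
manuscript = "þe roote and þe flour and The Ram".split()
modern = "the root and the flower and the ram".split()
shared = shared_top_words(manuscript, modern, k=2)
```

With real data one would pass the full token list of the transcription and a large modern corpus, keep k=20, and count the length of the returned list.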
Though my knowledge of English is limited, I find this manuscript very accessible. Of course, getting used to the script requires a little initial effort, but then things work quite well. E.g.
[attachment=5446]
Whan that aprille witħ his schowres swoote
The drougħt of Marche haþ perced to þe roote
And bathud euery veyne in swich licour
Of which vertue engendred is þe flour
Whan zephirus eek with his swete breeth
Enspirud hatħ in euery holte and heetħ
The tendre croppes and þe ȝonge sonne
Hath in þe Ram his halfe cours I ronne
The fact that 'þe' is equivalent to 'the' is quite obvious, since the word is so frequent. Once you understand this, it's easy to see that (like in modern English) the article appears at the start of noun phrases and is typically followed by either a noun or an adjective (þe roote, þe flour, þe ȝonge sonne, þe Ram). Like in modern English, 'his' behaves similarly to 'the' (his schowres swoote, his swete breeth, his halfe cours). Like in modern English, 'and' connects two grammatical structures of the same kind (e.g. sentences or noun phrases). You can basically start from function words, which are almost identical to modern English, and work from there.
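The observation that 'þe' and 'his' occupy the same kind of slot can be checked mechanically on the eight lines quoted above. The following Python sketch simply collects the word that follows each of the two function words; the function name `following_words` is hypothetical, and treating 'The' and 'þe' as the same word is an assumption made for the illustration.

```python
from collections import defaultdict

LINES = [
    "Whan that aprille witħ his schowres swoote",
    "The drougħt of Marche haþ perced to þe roote",
    "And bathud euery veyne in swich licour",
    "Of which vertue engendred is þe flour",
    "Whan zephirus eek with his swete breeth",
    "Enspirud hatħ in euery holte and heetħ",
    "The tendre croppes and þe ȝonge sonne",
    "Hath in þe Ram his halfe cours I ronne",
]

def normalise(word):
    """Lower-case and write thorn as 'th', so that 'The' and 'þe' coincide."""
    return word.lower().replace("þ", "th")

def following_words(lines, targets):
    """Map each target word to the words that directly follow it within a line."""
    followers = defaultdict(list)
    for line in lines:
        words = [normalise(w) for w in line.split()]
        for current, nxt in zip(words, words[1:]):
            if current in targets:
                followers[current].append(nxt)
    return dict(followers)

result = following_words(LINES, {"the", "his"})
for word, nexts in result.items():
    print(word, "->", ", ".join(nexts))
```

Both lists contain only nouns and adjectives (drougħt, roote, flour, tendre, ȝonge, ram after 'the'; schowres, swete, halfe after 'his'), which is the distributional similarity described above. Eight lines are of course far too few for statistics; the same loop can be run over a whole transcription.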
I can have trouble with the meaning of words like 'schowres', 'holte' or 'heetħ', but identifying part-of-speech categories is almost always straightforward, so it's easy to keep track of grammar, even when some of the meaning is lost.
The true Voynich translation will allow us to do just that: start from function words, identify basic grammatical structures and finally get to word meanings and translation.
The fantasy that language structure is a modern invention cannot possibly lead to anything interesting.