(14-07-2021, 03:14 AM)Barbrey Wrote: Hi Torsten and Marco, thank you for illuminating some of the paper to people like me who are statistically challenged. And of course to the researchers: this paper took time and effort and is appreciated!
Can I ask what might seem an obviously answered question, because I frankly don't understand 90% of the study, and it's a question I had even prior to it. But I am not a linguist at all.
Is it possible that there actually are two dialects being encoded here? Latin, for instance, seems to have been modified by every language group in Europe. Doesn't the difference between A and B, for instance, seem to argue for encoding two different original works (or written by different dialect-speaking scribes) in closely related 'dialects'? And could linguists perhaps derive some clues from the very frequent 89, or eva dy, in B as opposed to A, that seems to be a common ending in one but not the other?
As an aside, but continuing this subject, I do think I remember reading as well that the 40 construction in the VMS is much more frequent in some sections than others. Some observers have thought this might translate to "qu". If so, might the difference between the two "dialects" be the difference between classical Latin and a slightly more vulgar Latin that used quod, quia, etc. for clauses very frequently? Both Latins were used at the same time.
I don't want to isolate Latin, btw, just using it as an example.
Hi Barbrey,
I am not a linguist either, but here is my opinion for what it's worth.
Looking at 40 (EVA:qo) and 89 (EVA:dy) as corresponding to bigrams in the source language implies that one is looking for a simple substitution cipher (where 4=q, 0=u, maybe -9=-us etc). But character entropy shows that Voynichese cannot be a simple substitution cipher of an ordinary European language: the way in which characters follow each other in Voynichese is too rigid to match any of those languages.
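To make the entropy argument concrete, here is a minimal Python sketch of one common measure: the conditional entropy of a character given the one before it. The function name and the toy strings are my own illustration and are not taken from the paper; on real data one would pass in a full transliteration and a comparison text of similar length. A lower value means the next character is more predictable, i.e. the text is more "rigid".

```python
import math
from collections import Counter

def conditional_entropy(text):
    """Estimate H(next char | current char) in bits from bigram counts."""
    pairs = Counter(zip(text, text[1:]))   # counts of consecutive character pairs
    firsts = Counter(text[:-1])            # counts of each character as first of a pair
    total = sum(pairs.values())
    h = 0.0
    for (a, _b), n in pairs.items():
        p_pair = n / total                 # P(a, b)
        p_cond = n / firsts[a]             # P(b | a)
        h -= p_pair * math.log2(p_cond)
    return h

if __name__ == "__main__":
    # Toy inputs only (hypothetical, not real corpus samples)
    print(conditional_entropy("abababab"))  # fully predictable sequence
    print(conditional_entropy("aabba"))     # each character has two equally likely successors
```

The estimate depends on text length and on the transliteration alphabet, so only texts of comparable size and comparable character inventories should be compared.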
40 also has the feature that removing 4- from a 4-word typically results in a legal 0-word, whereas removing the initial q- from a Latin q-word almost never yields a legal word. Finally, the qo- and o- variants often appear consecutively as qoX.oX (Zandbergen-Landini EVA transliteration):
<f40r.2,+P0>
qokar.okar.okedy.dar.<->ykchey.kaiin.ok[a:o]s,chedy.okar.a,ralos
<f78r.15,+P0> dchckhedy.qokchdy.
qokedy.okedy.dal,or.okeed.olkain
<f103v.4,+P0> y,cheey.
qokeey.okeey.lkees,ol.qoteedy.ykeedy<$>
<f112r.13,+P0> sor,aiin.chdy.ches.
qokeey.okeey.otaiin.chcthy.oteey,dy
or as oX.qoX
<f31r.10,+P0> <%>tol,shso.okedy.
okedy.qokedy.qokeedy.dar.shedshey
<f79v.13,+P0> dain.ar.olshey.dytain.qokain.checthy.
okeedy.qokeedy.ror
<f102r2.11,+P0> kockhas.okor.ykeey.
okeey.qokeey.dol.ol.sheody.okey.da,l,{cthhh}y
<f107v.38,+P0> dain.
okchey.qokchey.qokaiin.olkeey.qokol.oteey.oteey.lkain
This is a special case of a phenomenon that is typical of Voynichese: similar words tend to appear consecutively. This has been addressed by Timm and Schinner and (from a different angle) by Rene with his [link].
The relationship between qo- and o- has also been discussed by Emma May Smith ([link] and following posts).
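The qoX.oX and oX.qoX patterns quoted above can be searched for mechanically. Below is a rough Python sketch; the function names are hypothetical, and the parsing is a simplifying assumption on my part: '.' and ',' are both treated as word separators, and anything in angle brackets (locus tags, <->, <$>) is dropped.

```python
import re

def split_words(line):
    # Assumption: '.', ',' and whitespace separate words; <...> markup is discarded
    line = re.sub(r"<[^>]*>", ".", line)
    return [w for w in re.split(r"[.,\s]+", line) if w]

def qo_o_pairs(line):
    """Return adjacent word pairs of the form (qoX, oX) or (oX, qoX)."""
    words = split_words(line)
    hits = []
    for a, b in zip(words, words[1:]):
        if b.startswith("o") and a == "q" + b:      # qoX followed by oX
            hits.append((a, b))
        elif a.startswith("o") and b == "q" + a:    # oX followed by qoX
            hits.append((a, b))
    return hits

if __name__ == "__main__":
    print(qo_o_pairs("qokedy.okedy.dal,or.okeed.olkain"))
    print(qo_o_pairs("okedy.qokedy.qokeedy.dar.shedshey"))
```

Because markup is replaced by a separator, two words on either side of a drawing intrusion (<->) are treated as adjacent; whether that is desirable is a judgment call.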
Instead of thinking of a simple substitution, it is better to focus on the weaker assumption that each Voynichese word corresponds to exactly one word in the underlying language: this can be accomplished with a nomenclator, but the point really is to consider words as the atomic element to be analyzed. This is what has been done in the paper discussed in this thread.
Here one is faced with the problem of function words: the most frequent words tend to be more or less the same for all texts in a given language. Even different but related languages can have a considerable overlap in function words. But in the Voynich manuscript each section has a distinctive set of top-10 words.
See for instance this table of word types sorted by decreasing frequency (originally posted [link]).
This shows that in Voynich sections the top 30 words vary a lot, with 'daiin' and 'chol' decreasing from left to right (A to B) while 'chedy', 'shedy', 'qokedy' increase.
The two extremes HerbalA and Bio share 9 out of 30 words (smaller blue circles): this looks significant, but the top 4 words in Bio are excluded from the intersection with HerbalA.
Also notice the frequent 'eol' words which are typical of Pharma (marked with orange circles). Though Pharma and HerbalA are both classified as Currier A, they are noticeably different.
On the other hand, in Latin texts about different subjects and from different times, the top five words are fairly consistent.
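The kind of comparison described above (how many of the top-N words two sections share) is easy to reproduce. This is only a sketch: the function names are mine, and the word lists in the example are made-up toy data, not actual counts from the manuscript.

```python
from collections import Counter

def top_words(words, n):
    """The n most frequent word types, most frequent first."""
    return [w for w, _ in Counter(words).most_common(n)]

def top_overlap(words_a, words_b, n=30):
    """Word types that appear in the top n of both sections."""
    return set(top_words(words_a, n)) & set(top_words(words_b, n))

if __name__ == "__main__":
    # Toy data (hypothetical frequencies, for illustration only)
    section_a = ["daiin"] * 5 + ["chol"] * 4 + ["chor"] * 3 + ["ol"]
    section_b = ["chedy"] * 5 + ["shedy"] * 4 + ["daiin"] * 3 + ["ol"]
    print(top_words(section_a, 3))
    print(top_overlap(section_a, section_b, n=3))
```

With real data, the two word lists would come from splitting the transliteration of each section into words; the size of the overlap can then be compared with the same figure for two Latin texts on different subjects.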
In my opinion, we are left with three options:
1. This assumption is wrong and Voynichese words do not correspond to words in an underlying language (Currier and Torsten favour this possibility).
2. Words are written in different ways in the different sections (e.g. Bio 'qokain' corresponds to HerbalA 'sho' - using two random words as an example). Nick Pelling proposed the task of mapping Currier A to Currier B or vice-versa: I think this is a great, though terribly hard, research area.
3. The underlying language has no function words (I am far from sure that such a language exists, but this is my preferred option, though I am also interested in option 2).
Quote: I guess I find it somewhat dismaying that a different code or cipher might have been used throughout the manuscript; I'd rather believe in a slight shift of dialect in the same language! But is what I've said here wishful thinking or a possibility?
A slight shift of dialect would not by itself result in totally different top-ranking words. At the level of bigrams (pairs of consecutive characters) the [link] is as large as that between Latin and Italian (two distinct languages).
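For anyone who wants to try a bigram-level comparison themselves, here is a minimal sketch. The measure used in the linked comparison is not stated here, so this uses total variation distance between bigram frequency distributions purely as a stand-in; the function names and toy strings are my own assumptions.

```python
from collections import Counter

def bigram_dist(text):
    """Relative frequencies of consecutive character pairs."""
    counts = Counter(zip(text, text[1:]))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

def total_variation(p, q):
    """Total variation distance: 0 = identical distributions, 1 = no bigram in common."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

if __name__ == "__main__":
    # Toy strings only (not real samples of Currier A/B, Latin or Italian)
    print(total_variation(bigram_dist("daiin.chol.chor"), bigram_dist("chedy.shedy.qokedy")))
```

The interesting number is not the distance itself but how it compares with the same distance computed between two known languages, using samples of similar length.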
In my opinion, if one wants to consider only "obvious" European languages with a one-to-one word correspondence, options 1 and 3 above are excluded and one is left with option 2: radically different encoding/spelling for the different sections.
Another observation against a one-to-one correspondence with words in normal European languages is that, in Voynichese, many of the most frequent words can be reduplicated, e.g.
<f5v.3,+P0> qotcho.ytor.
daiin.daiin.otchor.daiin.q'o.darchor.do
<f32v.8,+P0> otchol.
daiin.daiin.ctho,daiin.qotaiin.<->otchy.d.<->shan
<f78r.35,+P0> y.sain.checkhy.qokain.cheeky.
daiin.daiin.y,tees.ol,y
<f115r.17,+P0> qol.cheey.qotchy.
daiin.daiin.cheocthy.dolkeedy.qotaiin.chol.oteeedchey.okeedain
<f76v.23,+P0> dchedy.qokeedy.qotchy.qokol.
shedy.shedy.chedy.olched[?:r].shetey.saiin
<f82r.18,+P0> polched.otain.
shedy.shedy.dal.chedar.qokeey.ykeey.l,s,araiin,ory
<f82v.6,+P0> qokedy.lshedy.qotol.dol,
shedy.shedy,dy.darotedy.chetedy.lokam
<f103v.7,+P0> daiin.shey.chol.chey.oteey.lkeeor.okaiin.
shedy.shedy.qokaiin.ol.chedydy
This is not the case for frequent words in European languages (e.g. 'and and' 'the the' 'of of'...). One can sometimes build sentences with those patterns ('not what I am thinking of, of course') but such examples are extremely rare in actual texts.
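Counting such reduplications in a transliteration takes only a few lines. As before, this is a sketch under my own parsing assumptions ('.' and ',' both separate words, angle-bracket markup is dropped), and the function name is hypothetical.

```python
import re
from collections import Counter

def reduplications(line):
    """Count immediately repeated words (X.X) in one transliterated line."""
    # Assumption: '.', ',' and whitespace separate words; <...> markup is discarded
    cleaned = re.sub(r"<[^>]*>", ".", line)
    words = [w for w in re.split(r"[.,\s]+", cleaned) if w]
    # A run X.X.X counts as two repetitions of X
    return Counter(a for a, b in zip(words, words[1:]) if a == b)

if __name__ == "__main__":
    print(reduplications("<f5v.3,+P0> qotcho.ytor. daiin.daiin.otchor.daiin.q'o.darchor.do"))
    print(reduplications("shedy.shedy.dal.chedar.qokeey.ykeey.l,s,araiin,ory"))
```

Running the same function over an ordinary Latin or English text (split on whitespace) gives a baseline for how rare repeated function words are in practice.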