labels as words - Printable Version

+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Analysis of the text (https://www.voynich.ninja/forum-41.html)
+--- Thread: labels as words (/thread-738.html)
RE: labels as words - -JKP- - 24-08-2017

(23-08-2017, 09:40 PM)Koen Gh. Wrote: Or does it just mean that initial o favors nouns? Or even 'names'?

For a long time I've been wondering if some of the "o" shapes are modifiers or markers. If they were, and IF the rest of the token is a word (in the linguistic sense) then, taking spaces as literal, the VMS vords become even shorter.

If you look at a vord like okor, which is both a label and a vord, with initial "o", one sees some interesting characteristics.
So this particular vord stands out because
But then how does one explain kor, which can function on its own and appears approximately the same number of times as okor or qokor?

If there are biglyphs, it might mean that ok and k represent two different kinds of units (one a biglyph, the other a monoglyph?). This is not uncommon in medieval substitution codes (one-to-many and many-to-one relationships are found together in many of the ciphers documented by Tranchedino) and MIGHT explain a positionally rigid cipher. If you use the same glyphs as both mono- and biglyphs, then you need a space or null or marker OR positional alert in order to distinguish one from the other.

*By the way, I haven't read Stolfi, Friedman or Tiltman's take on this, so I don't know if there's any overlap between their ideas and mine or if their writings have any relevance to this thread.

RE: labels as words - MarcoP - 24-08-2017

Thank you all for your comments! I am glad you find the graphs useful!

(22-08-2017, 09:08 PM)Koen Gh. Wrote: Now these are interesting and clear statistics, Marco.

The number crunching is done by an automatic script, but some manual work is needed to prepare the data and draw the graphs. If a good transcription is available, it's not much work.

A language that has no specific suffixes should generate a flat histogram for the main text. If the labels are the usual Greek-Latin names, the label suffix histogram will have the spikes we have already seen for Latin and Italian: the two histograms should be quite different. If the labels are not borrowed, the two histograms could be identical, but of course the best thing would be to see what happens with an actual case. If you can suggest a transcribed text that you think relevant, I will do my best to produce the graphs.

(22-08-2017, 09:15 PM)Emma May Smith Wrote: There is almost certainly going to be a larger element of borrowing among plant names than the text as a whole. (It should be possible to isolate structurally different words as the most likely loanwords.)

I agree, Emma! There is hope that plant names are partly borrowed (and hence potentially easier to recognize). Yet the single-character graphs do not provide information supporting this idea. It is quite possible that if and when names were borrowed, they were “Voynichized”, as in Greek->Latin->Italian:

kyparissos   cupressus   cipresso
narkissos    narcissus   narciso
huakinthos   jacintus    giacinto

But it's also possible that the level of detail of these last histograms is not sufficient.

(23-08-2017, 09:40 PM)Koen Gh. Wrote: Or does it just mean that initial o favors nouns? Or even 'names'?

As [link not shown] wrote, commenting on this graph: "[ot, ok]: the increase in these should be a result of the lower [qo]."

There's plenty of evidence that suggests a relationship between qo- and o-. Possibly the most obvious is that, if you remove the starting q from a word that occurs at least twice, 90% of the time you get an o- word that actually appears in the ms. So, if o- favors nouns, q- likely does too.

Emma's observation suggests that the increase in o- might be a consequence of the disappearance of q-. For all I know, this might still be related to nouns, but we currently don't have much evidence to support this idea. Emma is currently researching [link not shown]: we can hope to understand more of this subject in the near future.

As I wrote above, the single-letter diagrams don't provide much detail with such a small alphabet. But the two-letter diagrams highlighted some differences which might prove interesting (e.g. the higher -ry and -ly frequencies in labels).
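The q-removal check mentioned in the post above could be reproduced roughly as follows. This is only a sketch, not the script actually used for the thread: the function name `q_stripping_ratio` is hypothetical, the input is assumed to be a flat list of transcribed words, and the ratio is assumed to be counted per distinct word (type) rather than per occurrence, which the post does not specify.

```python
from collections import Counter


def q_stripping_ratio(tokens, min_count=2):
    """Share of q- word types (seen at least min_count times) whose
    q-less form is an o- word that also occurs in the token list.

    Assumption: counted per distinct word, not per occurrence.
    """
    counts = Counter(tokens)
    vocabulary = set(counts)
    # Candidate words: start with q and occur at least min_count times.
    q_words = [w for w, c in counts.items()
               if w.startswith("q") and c >= min_count]
    if not q_words:
        return 0.0
    # A hit: dropping the initial q leaves an attested word starting with o.
    hits = sum(1 for w in q_words
               if w[1:].startswith("o") and w[1:] in vocabulary)
    return hits / len(q_words)


# Toy input, not real transcription counts.
sample = ["qokedy", "qokedy", "okedy", "qokain", "qokain",
          "qotar", "qotar", "okain", "daiin"]
ratio = q_stripping_ratio(sample)
```

On a real transcription the word list would first need to be extracted from the transcription file; that step is left out here because it depends on the file format.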
RE: labels as words - Koen G - 24-08-2017

Right! I missed Emma's comment but that's certainly the best explanation. So that means that one of the most pertinent questions remains: what is q? Your stats strongly suggest that, compared to labels, it is "added" to words in fluent text. In her blog post Emma argues for a sound value for q, in which case the only possibility I see is that it's a sandhi effect. This makes me wonder: can sandhi evoke sounds which lie outside of the language's phoneme inventory? Maybe something like a glottal stop?

RE: labels as words - MarcoP - 27-08-2017

I have run a new set of histogram stats, trying to find a subset of Voynichese that behaves similarly to Labelese. I only found partial matches, but some of the data are interesting. I don't have a simple solution to discuss, only several minor points that are hard to explain clearly, but I'll try.

The main features of Labelese when compared with average Voynichese are:
Some time ago we discussed [link not shown]. Examining the graphs for o- and q-, it is clear that no word ending provides a context that doubles o- (even if -r, -s and -t provide a reduction of q- occurrences close to that observed in labels).

I then turned to subsets identified not by the final character of the previous word, but by the position of the word inside the text. The results I consider interesting are illustrated in the following complex histogram. I will discuss each subset individually, comparing prefixes with Labelese (the Orange bars).

I have used Zandbergen's IVTT ZL transcription instead of Takahashi's, so there are a few very minor differences from the graphs I previously posted.
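The comparison of positional subsets with Labelese could be sketched along these lines. Everything here is an assumption for illustration: the function names are hypothetical, the text is assumed to be already parsed into paragraphs, lines and words, only three positional subsets are shown, and Pearson correlation is assumed as the similarity measure, since the posts do not say which one was used.

```python
from collections import Counter
from math import sqrt


def prefix_histogram(words, n=1):
    """Relative frequency of each n-character word prefix."""
    counts = Counter(w[:n] for w in words if len(w) >= n)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}


def pearson(hist_a, hist_b):
    """Pearson correlation of two histograms over the union of their keys."""
    keys = sorted(set(hist_a) | set(hist_b))
    if not keys:
        return 0.0
    xs = [hist_a.get(k, 0.0) for k in keys]
    ys = [hist_b.get(k, 0.0) for k in keys]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a flat histogram
    return cov / sqrt(var_x * var_y)


def positional_subsets(paragraphs):
    """Split words by position.

    paragraphs: list of paragraphs, each a list of lines,
    each line a list of words.
    """
    subsets = {"line_first": [], "line_last": [], "par_first_line_last": []}
    for paragraph in paragraphs:
        for i, line in enumerate(paragraph):
            if not line:
                continue
            subsets["line_first"].append(line[0])
            subsets["line_last"].append(line[-1])
            if i == 0:
                # Last word of the first line of the paragraph.
                subsets["par_first_line_last"].append(line[-1])
    return subsets


# Toy data only; real input would come from a transcription file.
paragraphs = [[["okor", "daiin", "otam"], ["qokedy", "chedy", "dam"]]]
labels = ["okor", "otaly", "okary", "dary"]
label_hist = prefix_histogram(labels)
scores = {name: pearson(prefix_histogram(words), label_hist)
          for name, words in positional_subsets(paragraphs).items()}
```

Suffix histograms can be built the same way by taking `w[-n:]` instead of `w[:n]`.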
The suffix histogram is much more uniform. The clearest spikes correspond to the well-known line-ending -m. The last word of the first line of a paragraph [Pink] has a large -am spike, but it compares well with Labelese because -ly and -ry are common suffixes both at line end and in Labelese. -in is also a good match between last-paragraph-words and Labelese. These good matches compensate for the difference due to the -am spike when computing the correlation.

Conclusions
RE: labels as words - MarcoP - 02-09-2017

Here is some additional information about qo- (not really new in itself, but possibly new in this form).

This is a histogram of prefixes (occurrences of words are counted multiple times):
the green bar is the percentage of words starting with that prefix
the red bar is the percentage of exactly repeating words

About 30% of the exactly repeating words start with qo- (e.g. qokedy.qokedy, qokain.qokain)
About 15% of all words start with qo-

The stats on unique words and repeating pairs are similar:
20% of unique repeating pairs start with qo-
9% of unique words start with qo-

To summarize, these are some peculiarities of qo- (and the corresponding o- behavior):
1 and 2 could be explained by q- needing a "left context" (i.e. an immediately preceding word). 3 evidently must have some other explanation.
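The repeated-word statistic from the post above (share of all words versus share of exactly repeating words that start with a given prefix) could be computed roughly like this. It is a sketch under assumptions: the function name `prefix_shares` is hypothetical, repeats are only looked for within a line, and each adjacent identical pair is counted once, so a triple repetition counts as two pairs.

```python
def prefix_shares(lines, prefix="qo"):
    """Return (share of all words, share of exactly repeating words)
    that start with the given prefix.

    lines: list of lines, each a list of words.
    Assumption: repeats are adjacent identical words within one line.
    """
    all_words = [w for line in lines for w in line]
    # One entry per adjacent identical pair, e.g. qokedy.qokedy
    repeated = [a for line in lines
                for a, b in zip(line, line[1:]) if a == b]

    def share(words):
        if not words:
            return 0.0
        return sum(w.startswith(prefix) for w in words) / len(words)

    return share(all_words), share(repeated)


# Toy lines, not real manuscript counts.
toy_lines = [["qokedy", "qokedy", "daiin", "ol"],
             ["chedy", "chedy", "qokain", "okar"]]
share_all, share_repeated = prefix_shares(toy_lines)
```

For the "unique" variant of the statistic, the same shares would be taken over `set(all_words)` and `set(repeated)` instead of the full lists.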