The Voynich Ninja

Full Version: Identifying function words
[quote pid='17045' dateline='1506000726']
7v: odaiin and okchor - lines 7 and 8
[/quote]
I think I have seen these vords more than a dozen times on the first 20 or so pages alone.
odaiin is also in line 5 of this folio contained in a longer word. Also on line 4 of 7r...

I cannot explain why they are on the stars, but to me it seems as if they are not names of stars but something else. Perhaps some ordinal numbers: first star, second star???
Hi undekagon and welcome to the Forum!

odaiin is the second most frequent Voynich star, with 61 exact matches. okchor is also quite frequent (19 exact matches). Whether the vords used as star labels are star names or not remains an open question. If Voynich stars are indeed stars (which is suggested by the "brightness" convention), then I strongly suspect that the labels are not star names. However, that would not make the objects heterogenous.
(21-09-2017, 02:49 PM)Anton Wrote: This is something that can be easily done with Job's tool, at least for exact matches, just one needs a bit of patience to add 28 (non-unique) stars into the rendering.

I just did the same for six [link] labels (actually, only four of them are not unique), and there are no adjacent occurrences (for exact matches).

I can see otol there again as the second label? Is this also a very common vord? It is there where the fire is coming out. (Maybe the five labels are the five elements on [link] and the star with the same label is a very hot star (???))

And I still struggle with your definition of homogeneity. What exactly do you mean by homogenous in context? Equal occurrence over the whole manuscript versus clustering? I am not sure I understand what you are computing/showing here. Can you please explain it again?
otol is the most frequent Voynich star. You may wish to read my old blog post about Voynich stars that I already mentioned above: [link]

Quote:And I still struggle with your definition of homogeneity. What exactly do you mean by homogenous in context? Equal occurrence over the whole manuscript versus clustering? I am not sure I understand what you are computing/showing here. Can you please explain it again?

When I speak of context here, I mean visual context: the context of imagery, not the context of text. That is, when an image provides a common visual theme for several labeled graphical objects that seem to be of equal rank, we can consider the notions represented by those labels likely homogenous in a certain respect. For example, f68r1 and r2: there is a single visual context (allegedly star maps, but what exactly it is matters less) for the set of individual "equal rank" star-looking objects.
Just mapped all the Voynich stars (exact matches); it was actually not that time-consuming! Here's the query:

Code:
http://www.voynichese.com/#/exa:otcheody:ivory/exa:chocphy:wheat/exa:octhey:khaki/exa:ytchody:gray/exa:otshey:charcoal/exa:otydy:navy-blue/exa:okeor:royal-blue/exa:olor:medium-blue/exa:ockhy:azure/exa:otchdo:cyan/exa:otol:teal/exa:otor:forest-green/exa:ykchdy:olive/exa:ocphy:chartreuse/exa:okoldy:golden/exa:okshor:goldenrod/exa:otochedy:coral/exa:dolchedy:hot-pink/exa:cheorol:fuchsia/exa:odaiin:puce/exa:shdar:mauve/exa:dchol:plum/exa:okchor:indigo/exa:okcheody:crimson/exa:cholar:maroon/exa:chodar:mauve/exa:shchy:beige/exa:/1317

Nothing especially promising. No rows of three or more. A couple of adjacent pairs. Only two more cases where two stars are separated by a single vord:

otol chedy olor (75r)

olor okchd dchol (79r)

okchd is a rare vord, so it's an outsider. chedy is frequent (count of 501), which corresponds to a frequency of about 1.3%. Choosing between daiin and chedy, I'd vote for the latter, because it virtually never occurs at the end of a folio (Job lists only one occurrence, f46v). But sequential repetitions of chedy do exist, so it's probably not a good option either.

But of course much more study is required, these are only some screening results to illustrate the approach.
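For anyone who wants to repeat this screening offline rather than in the browser, here is a minimal sketch. It assumes you already have the text as lists of vords, one list per line; the function name `star_neighbours` and the `max_gap` parameter are made up for illustration and have nothing to do with how voynichese.com works internally. The label set is simply the list of exact-match forms from the query above.

```python
from typing import List, Sequence, Set, Tuple

# Star labels copied from the query above (exact-match forms only).
STAR_LABELS: Set[str] = {
    "otcheody", "chocphy", "octhey", "ytchody", "otshey", "otydy", "okeor",
    "olor", "ockhy", "otchdo", "otol", "otor", "ykchdy", "ocphy", "okoldy",
    "okshor", "otochedy", "dolchedy", "cheorol", "odaiin", "shdar", "dchol",
    "okchor", "okcheody", "cholar", "chodar", "shchy",
}

# (first label, vords in between, second label)
Hit = Tuple[str, Tuple[str, ...], str]


def star_neighbours(
    tokens: Sequence[str],
    labels: Set[str] = STAR_LABELS,
    max_gap: int = 1,
) -> List[Hit]:
    """Find pairs of star labels separated by at most `max_gap` other vords.

    `tokens` is one line of text split into vords. Pairs are not searched
    across line boundaries; that is a simplifying assumption of this sketch.
    """
    hits: List[Hit] = []
    for i, tok in enumerate(tokens):
        if tok not in labels:
            continue
        # Look ahead at most max_gap + 1 positions for the next label.
        last = min(i + max_gap + 1, len(tokens) - 1)
        for j in range(i + 1, last + 1):
            if tokens[j] in labels:
                hits.append((tok, tuple(tokens[i + 1:j]), tokens[j]))
                break
    return hits


if __name__ == "__main__":
    # The two three-vord sequences reported above (75r and 79r).
    for line in ("otol chedy olor", "olor okchd dchol"):
        print(star_neighbours(line.split()))
```

With `max_gap=0` only directly adjacent pairs are reported; with `max_gap=1` the cases with a single vord in between are included as well, and the middle vord is returned so that candidates for "and" can be tallied.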
Quote:Well, that suggestion has been around for a long while. But the words which [q] attaches to suggest not. About 80% of words starting [q] see the character followed by either [ot] or [ok]. Almost none are followed by [ch] or [sh], or even [r]. The word 'and' should attach to a class of words, specifically nouns, so we would have to propose that [t] and [k] were noun markers. This would put us on a much different path from a natural language, thus destroying the goal of finding function words, surely?

Hi Emma,

The word "and" may well join adjectives, verbs, or even numerals, but I agree that the major portion should be nouns. I'm not sure why you say that t and k would be noun markers. Would not o be a noun marker in this case? 

I'd say that the task of finding function words extends beyond the natural language approach. It would suit artificial language, and maybe some not very sophisticated ciphers.
(21-09-2017, 02:32 PM)Anton Wrote: Another exception is 55r, where the two stars are separated by one vord, and that vord is daiin. I think that daiin is unlikely to stand for "and", simply because it occurs as the last vord of a folio, and also there are cases where daiin is repeated sequentially.

I am not sure reduplication is a problem.

I have seen examples of the Romani conjunction thaj being reduplicated as thaj thaj. I believe the reduplicated form could mean "and also" but I have been unable to find confirmation. This was discussed [link].

The Latin conjunction atque is made of the sequence of two conjunctions at+que. Of course, this is not reduplication, but it hints at the possible meaningfulness of conjunction reduplication.

daiin is the most frequent word in Voynichese. It is not far-fetched to consider that it could represent two or more homographs. For instance, the Italian short word "di" has at least three completely independent meanings (a preposition meaning 'of', the imperative of "dire"='speak', an archaic term for 'day').
More about nouns. Is the frequency of nouns more or less consistent across different languages? If yes, we could just count supposed nouns in the VMS (say, all vords beginning with o and qo) and compare.
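The counting half of that idea is easy to sketch, assuming the text is available as a flat list of vords. The name `prefix_share` is hypothetical, and the default prefixes are just the "o" and "qo" suggested above; whether those vords are really nouns is of course the open question, and the comparison figures for natural languages would have to come from tagged corpora.

```python
from typing import Iterable, Tuple


def prefix_share(
    tokens: Iterable[str],
    prefixes: Tuple[str, ...] = ("o", "qo"),
) -> float:
    """Share of vords that start with any of the given prefixes.

    Returns 0.0 for an empty input instead of dividing by zero.
    """
    tokens = list(tokens)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t.startswith(prefixes))
    return hits / len(tokens)


if __name__ == "__main__":
    # Tiny demonstration input built from vords quoted in this thread;
    # it is not a sample of the manuscript and proves nothing by itself.
    print(prefix_share("otol chedy olor daiin".split()))
```

Run over a full transcription, the result would be the "supposed noun" share to set against the noun share of candidate languages.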
(21-09-2017, 05:22 PM)Anton Wrote:
Quote:Well, that suggestion has been around for a long while. But the words which [q] attaches to suggest not. About 80% of words starting [q] see the character followed by either [ot] or [ok]. Almost none are followed by [ch] or [sh], or even [r]. The word 'and' should attach to a class of words, specifically nouns, so we would have to propose that [t] and [k] were noun markers. This would put us on a much different path from a natural language, thus destroying the goal of finding function words, surely?

Hi Emma,

The word "and" may well join adjectives, verbs, or even numerals, but I agree that the major portion should be nouns. I'm not sure why you say that t and k would be noun markers. Would not o be a noun marker in this case? 

I'd say that the task of finding function words extends beyond the natural language approach. It would suit artificial language, and maybe some not very sophisticated ciphers.

Were [o] a noun marker then there would be a high number of words starting [o] which don't take [q] frequently. For example, many words starting [ol] occur but have low rates of [q]. Furthermore, we would still be left with a problem as [ch, sh] don't even take [o] often.
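The comparison in the paragraph above can be sketched as a simple tally: group [o]-initial words by their first glyphs and see how often each group appears with a leading [q]. This is only a rough illustration. The name `q_rates` and the `prefix_len` parameter are made up, the grouping works on EVA letters rather than glyphs (so digraphs such as [ch] would need special handling), and the demonstration input is a toy list, not manuscript data.

```python
from collections import Counter
from typing import Dict, Iterable


def q_rates(tokens: Iterable[str], prefix_len: int = 2) -> Dict[str, float]:
    """For each [o]-initial prefix (e.g. 'ol', 'ok'), return the share of
    tokens in that group that carry a leading [q]."""
    bare: Counter = Counter()    # e.g. olor    -> group 'ol'
    with_q: Counter = Counter()  # e.g. qokchor -> group 'ok'
    for tok in tokens:
        has_q = tok.startswith("q")
        stem = tok[1:] if has_q else tok
        if not stem.startswith("o"):
            continue  # only [o]-initial stems are of interest here
        key = stem[:prefix_len]
        if has_q:
            with_q[key] += 1
        else:
            bare[key] += 1
    groups = set(bare) | set(with_q)
    return {k: with_q[k] / (with_q[k] + bare[k]) for k in groups}


if __name__ == "__main__":
    # Hypothetical toy input, chosen only to show the shape of the output.
    print(q_rates(["olor", "olor", "qokchor", "okchor"]))
```

A low rate for a frequent group such as [ol] next to a high rate for [ok] or [ot] is the pattern described above.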

The idea that [q] is a grammatical marker stems from the fact that it appears in the body text and not the labels. Yet I think this is actually a misunderstanding. It seems not to appear in diagrams, including running text which is part of those diagrams. For example, f70r2 has four lines of text running round the outside of the diagram, and there is not one [q] in it (there are instances of [q] in the block alongside).
Quote:Were [o] a noun marker then there would be a high number of words starting [o] which don't take [q] frequently. For example, many words starting [ol] occur but have low rates of [q]. Furthermore, we would still be left with a problem as [ch, sh] don't even take [o] often.

Not sure if I understand your point correctly. Do you mean, e.g., that we have olor, but do not have qolor?

While qo- is the overwhelming prefix containing q (count of 5289), next to that are qe and qc (66 and 23, respectively). So maybe that's a set of function elements, with qo- being the most predominant of them.

Anyway, I definitely like that idea about q. The point is not only that it is infrequent in labels, but also that it is much more frequent in Currier B. But that might be explained by the nature of the texts of the biological and recipe sections, which might just require "and" quite often. I wonder whether there is any difference in q frequency between botanical A and botanical B.