The Voynich Ninja

Full Version: Identifying function words
(20-09-2017, 08:11 PM)-JKP- Wrote:
(20-09-2017, 06:58 PM)farmerjohn Wrote: daiin is illī, d stands for the l sound and aiin for long o (n marks long vowels)
So the common ending -dy stands for -ellus, and -ody for -ulus, diminutive forms often used in medieval Latin. Interestingly, -ody is very rarely used in the bathing section, so this gives us at least three(!) authors


I think it's very unlikely that such a large proportion of words in a manuscript with several differently themed sections would all end in -ellus.

In fact, since you referenced medieval Latin abbreviations: if it were such, then -dy has many possible interpretations, including -bz (-bus) or -rum, or even a category designation as was done in Hildegard von Bingen's code. But what is important to remember is that even if the VMS includes Latin-based glyphs or even Latin scribal conventions, that doesn't necessarily mean it's Latin. The conventions could be used to mean something else, and the same symbols were also used to mean other things in other languages.
I think it's all about the peculiar form of Latin used in the VMS, and extensive use of diminutive suffixes is its significant trait. And different authors of the manuscript used different suffixes. The one who wrote the balneo section never used -ulus, only -ellus. That's why the words chedy and shedy appear only and mostly in this section, respectively. Other authors used these words with the suffix -ulus and got sheody and cheody. voynechese.com gives a nice picture of it.
Quote:Definitely not universal as a "word" as Marco points out.

In Latin, for example, "and" is very frequently indicated by the number 7 and attached to the word that follows, and also stands for the letters "et" as in etiam (written 7iam). Thus, the same symbol is used for two different functions and doesn't necessarily ever stand alone because the same convention is used in other languages that may use a word other than "et" for "and", but which still uses the Latin scribal 7 symbol to express it.


In some languages "and" is a single or double character added to the beginning of a word. In English we are used to putting "and" between words, but it's not always done that way. Sometimes a character added to the beginning of the first of the two words that go together represents "and". Once again, it is not a separate word: it is attached to other words in much the same way as a prefix, rather than standing alone.

Quote:Some languages use a clitic to express 'and'. Arabic, Hebrew, even Latin do this.

OK, but whether a word, clitic, shorthand or otherwise, the portion of the script that represents "and" is presumably to be found in between labels of homogenous objects - unless shuffling is in place, but shuffling is already not a natural flow of language. That's the point with working with the "and" notion: it presents a direct opportunity to find out contexts where it is likely to appear. Offhand, I can't imagine any other opportunities, except, possibly, for "or". Basically, "or" would appear in the same context. That "and" (or "or") may be not a separate word, but instead somehow appended or prepended, just expands the scope - one should look not only for exact label matches, but for partial matches.
(21-09-2017, 01:44 AM)Anton Wrote: OK, but whether a word, clitic, shorthand or otherwise, the portion of the script that represents "and" is presumably to be found in between labels of homogenous objects - unless shuffling is in place, but shuffling is already not a natural flow of language....


Depending on the language, "and" usually appears in these ways...

Jack & Jill
Jack &Jill
& Jack Jill
&Jack Jill

They all mean the same thing, only the rules for where to put the "and" (and whether or not to append it to a word) differ.

If there are other schemes (and it's quite possible that there are), they are in languages unfamiliar to me.


Of course, if it's ciphertext, you can invent any system you want.
You forgot -que :)

But these options are very interesting. It means for example that [q] could be &
Well, that suggestion has been around for a long while. But the words which [q] attaches to suggest not. In about 80% of words starting with [q], the character is followed by either [ot] or [ok]. Almost none are followed by [ch] or [sh], or even [r]. The word 'and' should attach to a class of words, specifically nouns, so we would have to propose that [t] and [k] were noun markers. This would put us on a much different path from a natural language, thus destroying the goal of finding function words, surely?
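A proportion like the "about 80%" above can be checked against any transcription with a few lines of Python. This is only a minimal sketch: the function name `following_shares` is hypothetical, the sample list contains only vords quoted in this thread (not a real corpus), and it treats each EVA letter as one unit, which is itself an assumption.

```python
from collections import Counter

def following_shares(words, initial="q", width=2):
    """Share of each `width`-letter sequence that directly follows `initial`
    among the words that start with `initial`."""
    counts = Counter(
        w[len(initial):len(initial) + width]
        for w in words
        if w.startswith(initial) and len(w) > len(initial)
    )
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()} if total else {}

# Toy sample: vords quoted in this thread, NOT a frequency-accurate corpus.
sample = ["qokchedy", "qokedy", "qokeody", "qoky", "chedy", "okedy"]
print(following_shares(sample))
```

Run over a full transcription, the same call would show how much of the [q] population is covered by [ok] and [ot] and how little by [ch], [sh] or [r].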
Maybe [t] and [k] are sounds which merge easily with [q], with [o] in between. It's common for proclitics to be phonetically conditioned and/or have phonetic consequences for the start of the main word.

This only leaves the question what this [qo] looks like when it's not hugging a [t,k] word.
(21-09-2017, 01:44 AM)Anton Wrote: whether a word, clitic, shorthand or otherwise, the portion of the script that represents "and" is presumably to be found in between labels of homogenous objects

It would be interesting to make some experiments along those lines, but how do we define "homogeneity"?

Let's say we want to investigate if common prefixes correspond to "and".
For each prefix p, we could consider words W1, W2 where W1 occurs immediately before the prefix and W2 is attached to it: W1.pW2
Maybe we should only consider W2 if it is a valid word (that occurs without the p- prefix).

We could then measure, say, the average Levenshtein distance between all W1 / W2
Doing this for several prefixes p1,p2,p3,etc would allow us to see if some generate a significantly lower average distance.
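The experiment described above could be sketched roughly as follows. This is not a tested methodology, just one possible reading of it: the function names are hypothetical, vords are assumed to be separated by "." as in the examples in this thread, and "valid word" is taken to mean "occurs somewhere in the same input without the prefix".

```python
from statistics import mean

def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]

def prefix_pair_distance(lines, prefix):
    """Average distance between W1 and W2 over all adjacent pairs W1.pW2.

    W2 is kept only if it also occurs in the input without the prefix.
    Returns None when no pair qualifies.
    """
    tokenised = [[w for w in line.split(".") if w] for line in lines]
    vocabulary = {w for toks in tokenised for w in toks}
    distances = []
    for toks in tokenised:
        for w1, w2_full in zip(toks, toks[1:]):
            if w2_full.startswith(prefix) and len(w2_full) > len(prefix):
                w2 = w2_full[len(prefix):]
                if w2 in vocabulary:
                    distances.append(levenshtein(w1, w2))
    return mean(distances) if distances else None

lines = [
    "okchey.okedy.qokchedy.chedy.",
    "pchedar.opchedy.qokedy.opchedy.",
    "ysheod.sheo.sheody.qokeody.qoky.chees.",
]
print(prefix_pair_distance(lines, "q"))
```

On three lines the vocabulary is far too small for the "valid word" filter to be meaningful; the idea would be to build the vocabulary from the whole transcription and then compare the averages obtained for several prefixes.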

A problem I see is the massive presence of quasi-reduplication.
okchey.okedy.qokchedy.chedy.
pchedar.opchedy.qokedy.opchedy.
ysheod.sheo.sheody.qokeody.qoky.chees.

These look homogeneous:
okedy q-okchedy
opchedy q-okedy
sheody q-okeody
But couldn't quasi-reduplication be different from conjunction?
(20-09-2017, 09:43 AM)davidjackson Wrote: Let us constrain our thoughts to this one page, taken in isolation from the corpus (f104r), and examine how these words function solely on this page, without regard to the corpus.

We take the most common word, chol <chol>:

Besides the 14 instances of [*chol*], other common glyph groups also exist on page f104r. For instance, [*ol*] occurs 57 times and [*aiin] 60 times on page f104r. [ol] and [aiin] also occur without any prefix or suffix group: [ol] occurs 5 times and [aiin] 7 times. This is comparable to the 6 instances of [chol] and 4 instances of [chey].

[ol] is also part of [chol]. In other words, it is possible to split [chol] into a prefix group [ch] and a suffix or root group [ol]. Moreover, [ol] is also used as a prefix. See for instance [olcheol], [olcheear], [olkeedy], [olkeechey], [olkeeody], [olkchedy], [olkeechy] ...
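Counts of this kind (anywhere in the vord, at the end only, at the start only, or standing alone) can be reproduced with the standard-library `fnmatch` module, which understands the same `*` wildcard once the square brackets are dropped. A minimal sketch, where the helper name `count_matches` is hypothetical and the vord list holds only the examples quoted in this post, not the text of f104r, so the numbers will not match the page counts given above:

```python
from fnmatch import fnmatchcase

def count_matches(vords, pattern):
    """Number of vords matching a shell-style pattern ('*' = any run of letters)."""
    return sum(fnmatchcase(v, pattern) for v in vords)

# Only the examples cited in this post; NOT a transcription of f104r.
vords = ["olcheol", "olcheear", "olkeedy", "olkeechey", "olkeeody",
         "olkchedy", "olkeechy", "chol", "ol", "aiin", "daiin", "chey"]

for pattern in ["*ol*", "ol*", "ol", "*aiin"]:
    print(pattern, count_matches(vords, pattern))
```

Comparing the `*ol*`, `ol*` and bare `ol` counts side by side is one way to see how often a group behaves as a prefix, as a suffix or root, or as a vord of its own.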

It is far from easy to define what a prefix or suffix or root word is for the VMS.
Quote:It would be interesting to make some experiments along those lines, but how do we define "homogeneity"?

I mean homogeneity in terms of context, not in terms of script. If within a given visual context we observe objects of seemingly equal rank, then chances are that they are in a certain way homogenous, and the respective labels can then be pursued. Candidate №1 is the Voynich stars (68r1 and r2): not only are they of equal rank (apart from the alleged "brightness", but one can even break that down by "brightness" if one so wishes), but not a single label repeats within.

The Voynich pipes (69v) are worse in this respect, because a couple of labels do repeat in the diagram.

Some other offhand suggestions would be the Voynich moons of f67r2 (although if I remember correctly, many of those labels are unfortunately unique) or the pipe sections at the top of f77r. Nymph labels in the Zodiac sections can also be considered.

When I wrote about Voynich stars back in 2015, I compiled an Excel sheet (can be found in that blog post) listing all star occurrences in the botanical section, with their line coordinates provided. Unfortunately, only exact matches were considered, and only the botanical section was reviewed for coordinates. Anyway, that listing allows me to provide these quick results. Here are all cases where two or more different Voynich stars (exact match) occur in the botanical section in the same line or at least in adjacent lines. (There are cases when one and the same Voynich star repeats; these are excluded from consideration.)

7v: odaiin and okchor - lines 7 and 8
8r: dchol and chodar - lines 3 and 4
28r: otol and otor - lines 4 and 5
34r: ykchdy and chodar - line 9
40r: otor and okoldy - lines 3 and 4
50v: otor and odaiin - lines 7 and 8
51r: odaiin and otydy - line 8
51v: ytchody and dchol - line 3
53r: otol and ykchdy - lines 5 and 6
55r: ockhy and otol - line 13
90v1: otol and chodar - line 1

The majority of these occurrences are, however, separated by two or more vords from each other.
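A search of this kind (different labels on the same or adjacent lines, plus the number of vords separating them) could be automated along these lines. This is only a sketch under stated assumptions: the function name and the input layout (one list of vords per line of a folio) are hypothetical, matching is exact only, and the toy page below is an invented arrangement of vords, not a transcription of any folio.

```python
from itertools import combinations

def label_cooccurrences(lines, labels, max_line_gap=1):
    """Return (label_a, line_a, label_b, line_b, vords_between) for every pair of
    DIFFERENT labels found on the same line or within `max_line_gap` lines."""
    found = []   # (vord, line number, running position on the page)
    offset = 0
    for line_no, line in enumerate(lines, start=1):
        for pos, vord in enumerate(line):
            if vord in labels:
                found.append((vord, line_no, offset + pos))
        offset += len(line)
    hits = []
    for (a, line_a, idx_a), (b, line_b, idx_b) in combinations(found, 2):
        if a != b and line_b - line_a <= max_line_gap:
            hits.append((a, line_a, b, line_b, idx_b - idx_a - 1))
    return hits

# Invented toy page, for illustration only.
toy_page = [["otol", "daiin", "ockhy"], ["chol", "chodar"], ["shedy"], ["otol"]]
star_labels = {"otol", "ockhy", "chodar"}
for hit in label_cooccurrences(toy_page, star_labels):
    print(hit)
```

A `vords_between` value of 0 corresponds to directly adjacent labels, and 1 to a single intervening vord, which is where a candidate for "and" would have to sit. Partial matches would need a looser test than `vord in labels`.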

One exception is 51v, but here the two stars are directly adjacent, which is of course nice and suggests some sort of listing, but that's the major problem - a listing can do without "and"!

Another exception is 55r, where the two stars are separated by one vord, and that vord is daiin. I think that daiin is unlikely to stand for "and", simply because it occurs as the last vord of a folio, and also there are cases where daiin is repeated sequentially.

If other sections are reviewed in the same way, perhaps something more will reveal itself. And of course, partial matches can be looked at.
This is something that can be easily done with Job's tool, at least for exact matches; one just needs a bit of patience to add 28 (non-unique) stars into the rendering.

I just did the same for six You are not allowed to view links. Register or Login to view. labels (actually, only four of them are not unique), and there are no adjacent occurrences (for exact matches).