The Voynich Ninja

Full Version: Identifying function words
(21-09-2017, 06:03 PM)Emma May Smith Wrote:
(21-09-2017, 05:22 PM)Anton Wrote:
Quote:Well, that suggestion has been around for a long while. But the words which [q] attaches to suggest not. About 80% of words starting [q] see the character followed by either [ot] or [ok]. Almost none are followed by [ch] or [sh], or even [r]. The word 'and' should attach to a class of words, specifically nouns, so we would have to propose that [t] and [k] were noun markers. This would put us on a much different path from a natural language, thus destroying the goal of finding function words, surely?

Hi Emma,

The word "and" may well join adjectives, verbs, or even numerals, but I agree that the major portion should be nouns. I'm not sure why you say that t and k would be noun markers. Would not o be a noun marker in this case? 

I'd say that the task of finding function words extends beyond the natural language approach. It would suit artificial language, and maybe some not very sophisticated ciphers.

Were [o] a noun marker then there would be a high number of words starting [o] which don't take [q] frequently. For example, many words starting [ol] occur but have low rates of [q]. Furthermore, we would still be left with a problem as [ch, sh] don't even take [o] often.
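Distributional claims like the ones in this exchange (for instance, the roughly 80% figure for [ot] and [ok] after an initial [q]) can be checked against any transcription. The sketch below is only an illustration: the function name `q_continuations` and the toy word list are invented here, and it assumes words are plain strings in an EVA-like transliteration where `ch` and `sh` are written as two letters.

```python
from collections import Counter

def q_continuations(words, width=2):
    """Share of q-initial words by the `width` characters that follow the q."""
    q_words = [w for w in words if w.startswith("q") and len(w) > 1]
    counts = Counter(w[1:1 + width] for w in q_words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n / total for seq, n in counts.items()}

# Toy tokens, invented for illustration only (not real transcription data).
toy = ["qokeey", "qotedy", "qokain", "qoly", "otedy", "chedy", "olor", "shey"]
shares = q_continuations(toy)
for seq, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"q + {seq}: {share:.0%}")
```

Run on a full transcription, the combined share of `ot` and `ok` would be the number to compare with the figure quoted above.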

The idea that [q] is a grammatical marker stems from the fact that it appears in the body text and not the labels. Yet I think this is actually a misunderstanding. It seems to not appear in diagrams, including running text which is part of those diagrams. For example, f70r2 has four lines of text running round the outside of the diagram, and there is not one [q] in it (there are instances of [q] in the block alongside).

Curious. But it does in fact occasionally appear in the running text around diagrams, as in these examples (including one on f69v, just two pages earlier):

[link]

[link]

[link]

[link]

But I agree that it does seem rarer in those running texts.

Curiously, I notice that in the last one, f67r1, the example in the running text around the diagram is tantalisingly similar to the example in the text above the diagram

Running text:
[link]

Could be read as: qoikeey ?

Text above diagram:
[link]

Could also be read as: qoikeey ?
(21-09-2017, 06:46 PM)Stephen.Bax Wrote: ...

Curiously, I notice that in the last one, f67r1, the example in the running text around the diagram is tantalisingly similar to the example in the text above the diagram

Running text:
[link]

Could be read as: qoikeey ?

Text above diagram:
[link]

Could also be read as: qoikeey ?


Since I created my own transcript, I've spent a great deal of time staring at the VMS glyphs and trying to work out the ones that are ambiguous (by shape, context, and frequency), and my feeling is that the first one you linked might be qoikeey (the "i" shape does occur directly before gallows characters, both benched and unbenched) but the second one I'm fairly sure is written as quekeey. Of course, what was written and what was intended are not always the same thing.
My readings would be qoikeey and qoekeey, respectively
It is worth noting that qo also occurs as a separate vord (29 matches), which can precede other vords starting with qo-.
(21-09-2017, 06:23 PM)Anton Wrote:
Quote:Were [o] a noun marker then there would be a high number of words starting [o] which don't take [q] frequently. For example, many words starting [ol] occur but have low rates of [q]. Furthermore, we would still be left with a problem as [ch, sh] don't even take [o] often.

Not sure if I understand your point correctly. Do you mean, e.g., that we have olor, but do not have qolor?

While qo- is the overwhelming prefix containing q (count of 5289), next to that are qe and qc (66 and 23, respectively). So maybe that's a set of function elements, with qo- being the most predominant of them.

Anyway, I definitely like that idea about q. The point is not only that it is infrequent in labels, but also that it is much more frequent in Currier B. But that might be explained by the nature of the texts of the biological and recipe sections, which just might require "and" quite often. I wonder whether there is any difference in q frequency between botanical A and botanical B.

Yes, my point is that you can't have words starting [q] without the same words starting [o] (that is, same minus the [q]), but you can have words starting [o] without high numbers of [q] counterparts. The dependency is one way and conditioned significantly by both [o] and the character following [o]. The ratios of [q], [o], and plain forms are broadly predictable.
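The one-way dependency described here can be phrased as two checks on a word list: how often each [o]-initial word also appears with a [q] in front, and whether any [q]-initial word lacks a plain counterpart. The following is a minimal sketch under stated assumptions: the function names and the toy tokens are invented for illustration and are not taken from any real transcription.

```python
from collections import Counter

def q_rates(words):
    """For each o-initial word type: (plain count, q-prefixed count, q share)."""
    counts = Counter(words)
    rates = {}
    for word, plain in counts.items():
        if word.startswith("o"):
            with_q = counts.get("q" + word, 0)
            rates[word] = (plain, with_q, with_q / (plain + with_q))
    return rates

def orphan_q_words(words):
    """q-initial word types whose counterpart without the q never occurs."""
    types = set(words)
    return sorted(
        w for w in types
        if w.startswith("q") and len(w) > 1 and w[1:] not in types
    )

# Toy tokens, invented for illustration only (not real transcription data).
toy = ["okeey", "okeey", "qokeey", "qokeey", "qokeey",
       "olor", "olor", "olor", "qotedy"]
rates = q_rates(toy)
orphans = orphan_q_words(toy)
```

If the dependency holds on real data, `orphan_q_words` should return few or no entries, while `q_rates` should show some [o]-initial types (such as those in [ol]) with a q share near zero.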

While [qe] and [qc] are interesting insights into how [q] works, they are such a tiny fraction (1.7%) that I don't think they represent a set of functions based on [q].
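The 1.7% figure can be reproduced from the counts quoted earlier in the thread (5289 for qo-, 66 for qe, 23 for qc). The snippet below is just that arithmetic; which denominator was intended is an assumption, so both are shown.

```python
# Counts as quoted in the thread.
qo, qe, qc = 5289, 66, 23

minor = qe + qc                              # qe and qc together
share_of_listed = minor / (qo + qe + qc)     # share of the three listed prefixes
ratio_to_qo = minor / qo                     # relative to qo- alone

print(f"{share_of_listed:.1%} {ratio_to_qo:.1%}")  # prints "1.7% 1.7%"
```

Either way the result rounds to 1.7%.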

I'm not really keen on the idea of [q] as 'and', partly because it's just speculation that is presently taken quite seriously. There's a lot we don't understand about the character so it seems premature to assign it meaning. I'm not sure who first proposed it but I wish it was put to rest.
One of the ideas I've been bouncing around for a while - but have made zero progress on - is the idea of trying to map Voynichese onto existing grammar.
Even if it is a highly abbreviated abjad, function words will be in place to denote case, identity and tense.
So - can we take several different language-independent yet common grammatical structures, postulate what a highly abbreviated abjad version of them would look like, and then try to map Voynichese on top?
Grammatical structures can be quite varied and require us to analyze longer chunks of text. It's also harder to know what success would look like. Even if we managed to model the grammar of the text how would we prove that we had the right solution? Knowing the grammar alone would still need a further step.

But your way of thinking is quite right. We can map language universals onto the text and gain some kind of insight. This is what I've been doing looking at the structure of words trying to find out how the apparent structure reflects sounds. The great thing is that we can analyze words alone, have a much tighter understanding of the whole sound system, and the outcome is, potentially, a readable text. If we learnt enough about the different characters, how they're structured, how they interact, and what sounds they can and cannot be, the whole of the text will be open to us and we can simply say 'yes, this is a known language', or 'no, this is gibberish'.

The really, really, great thing is that each step can be built without reference to any particular language. We supply a sound linguistic argument for each conclusion and at no point waste time trying to fit the text to our preferred solution. (Well, excepting of course the assumption of a linguistic solution!)
(21-09-2017, 09:33 PM)Emma May Smith Wrote: Grammatical structures can be quite varied and require us to analyze longer chunks of text. It's also harder to know what success would look like. Even if we managed to model the grammar of the text how would we prove that we had the right solution? Knowing the grammar alone would still need a further step.

But your way of thinking is quite right. We can map language universals onto the text and gain some kind of insight. This is what I've been doing looking at the structure of words trying to find out how the apparent structure reflects sounds. The great thing is that we can analyze words alone, have a much tighter understanding of the whole sound system, and the outcome is, potentially, a readable text. If we learnt enough about the different characters, how they're structured, how they interact, and what sounds they can and cannot be, the whole of the text will be open to us and we can simply say 'yes, this is a known language', or 'no, this is gibberish'.

The really, really, great thing is that each step can be built without reference to any particular language. We supply a sound linguistic argument for each conclusion and at no point waste time trying to fit the text to our preferred solution. (Well, excepting of course the assumption of a linguistic solution!)

Has this approach ever been successful in decoding any previously unknown language or script?
I don't think that's a relevant question unless we can also show that it's been tried and known to fail.

So, has this approach even been tried and shown to fail in decoding any previously unknown language or script?
(22-09-2017, 01:17 AM)Stephen.Bax Wrote:
(21-09-2017, 09:33 PM)Emma May Smith Wrote: Grammatical structures can be quite varied and require us to analyze longer chunks of text. It's also harder to know what success would look like. Even if we managed to model the grammar of the text how would we prove that we had the right solution? Knowing the grammar alone would still need a further step.

But your way of thinking is quite right. We can map language universals onto the text and gain some kind of insight. This is what I've been doing looking at the structure of words trying to find out how the apparent structure reflects sounds. The great thing is that we can analyze words alone, have a much tighter understanding of the whole sound system, and the outcome is, potentially, a readable text. If we learnt enough about the different characters, how they're structured, how they interact, and what sounds they can and cannot be, the whole of the text will be open to us and we can simply say 'yes, this is a known language', or 'no, this is gibberish'.

The really, really, great thing is that each step can be built without reference to any particular language. We supply a sound linguistic argument for each conclusion and at no point waste time trying to fit the text to our preferred solution. (Well, excepting of course the assumption of a linguistic solution!)

Has this approach ever been successful in decoding any previously unknown language or script?

Understanding the nature of the script has always been an important step in deciphering unknown scripts. It is simply not possible to decipher a syllabic script with the idea that it is an alphabetic script or some kind of "picture writing".

Another important step is to understand the relation between words. For instance, the decipherment of cuneiform became possible when Grotefend noticed a recurring pattern in the cuneiform signs. This made it possible for him to identify the word for king in the sentences "Xerxes, great king, king of kings, son of Darius, king of kings" and "Darius, great king, king of kings, son of Hystaspes". With the help of ancient written records he was then able to identify the names of the kings as "Xerxes", "Darius" and "Hystaspes".