The Voynich Ninja - Vord paradigm tool


(14-11-2022, 10:14 PM)Hermes777 Wrote: Here are some lines presented as lists in this manner, selected at random:

47% of Voynich lines start with a non-gallows word: it is quite unlucky that none of the random examples shows such a feature.

Stolfi observed that each of the Gallows / NoGallows and Bench / NoBench classes roughly corresponds to 50% of Voynich tokens and that these classes appear to be independent, i.e. each intersection of these classes (Gallows+NoBench, Gallows+Bench, NoGallows+NoBench, NoGallows+Bench) roughly corresponds to 25% of tokens.

Stolfi Wrote:if we let X(w) stand for the boolean variable `word w has a gallows letter', and Y(w) mean `word w has one or more bench letters', then we find that the variables X and Y have uniform distributions over the text (50% `yes', 50% `no'), and are independent of each other --- even though gallows and benches occur next to each other in Voynichese words.
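For anyone who wants to check the independence claim on their own transliteration, here is a minimal sketch. It assumes plain EVA, takes the gallows to be t, k, p, f and the benches to be the digraphs ch and sh (pedestalled gallows such as ckh are not treated separately), and the function names are my own. It compares the observed share of each of the four classes with the share expected if X and Y were independent.

```python
from collections import Counter

GALLOWS = set("tkpf")      # assumed: the four plain EVA gallows
BENCHES = ("ch", "sh")     # assumed: EVA bench digraphs


def has_gallows(word):
    return any(c in GALLOWS for c in word)


def has_bench(word):
    return any(b in word for b in BENCHES)


def cross_tab(tokens):
    """Map (gallows?, bench?) to (observed share, share expected under independence)."""
    n = len(tokens)
    counts = Counter((has_gallows(w), has_bench(w)) for w in tokens)
    p_g = sum(c for (g, _), c in counts.items() if g) / n
    p_b = sum(c for (_, b), c in counts.items() if b) / n
    rows = {}
    for g in (True, False):
        for b in (True, False):
            observed = counts[(g, b)] / n
            expected = (p_g if g else 1 - p_g) * (p_b if b else 1 - p_b)
            rows[(g, b)] = (observed, expected)
    return rows


# A single line is far too small a sample; feed it the whole token list instead.
line = "solcheey.chol.sain.oral.shey.qokain.sheyky.shoty.oly"
for key, (obs, exp) in cross_tab(line.split(".")).items():
    print(key, round(obs, 2), round(exp, 2))
```

If the two classes are independent and each covers about half the tokens, all four observed shares should sit near 0.25 and near their expected values.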

This certainly is an interesting property, but I don't see how it leads to this "list" model. One could symmetrically say that the list is made of a bench-word followed by 0 or more non-bench words.

e.g.
<f79r.19,+P0> solcheey.chol.sain.oral.shey.qokain.sheyky.shoty.oly

solcheey
chol.sain.oral
shey.qokain
sheyky
shoty.oly
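The split above is mechanical, so the same routine works for any criterion. A minimal sketch, assuming plain EVA with benches taken as the digraphs ch and sh and gallows as t, k, p, f; the helper names are mine and not from any existing tool:

```python
def has_bench(word):
    # Assumption: a "bench word" contains the EVA digraph ch or sh
    return "ch" in word or "sh" in word


def has_gallows(word):
    # Assumption: a "gallows word" contains one of EVA t, k, p, f
    return any(c in "tkpf" for c in word)


def split_before(words, starts_chunk):
    """Start a new chunk at every word for which starts_chunk(word) is true."""
    chunks = []
    for w in words:
        if starts_chunk(w) or not chunks:
            chunks.append([w])      # the first word always opens a chunk
        else:
            chunks[-1].append(w)
    return chunks


words = "solcheey.chol.sain.oral.shey.qokain.sheyky.shoty.oly".split(".")
for chunk in split_before(words, has_bench):
    print(".".join(chunk))
```

Passing `has_gallows` instead of `has_bench` gives the gallows-based "list" reading of the same line, which makes it easy to compare the two side by side.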

And what of the ~25% of the lines that start with a NoBench-NoGallows word?

E.g.
<f6r.4+P0> dar.chos.sheor.cho{ith}y.otcham.<->yaiir.chy
<f37v.10+P0> soiin.{ch'}ey.okoiin.chey.tom
<f89r1.17+P0> qeaiin.cheyl.seey.qotey.qokeeol.daiin.{ykh}edy.daiin.dam
<f112v.34,+P0> saiin.chedaiin.checkhy.lkeedy.qokeedy.chkaiin.checkhol.chdam

I think it should be made clear which properties of Voynichese are explained by this hypothesis. Voynichese lines are made of Gallows words followed by zero or more NoGallows words, but English is made of E-words followed by zero or more NoE-words:

english is
made of
e-words
followed by
zero or
more
noe-words

Showing similar lists in actual medieval manuscripts, or at least actual books of any kind, would clarify the point being made and would let us check if they exhibit the same statistical properties as Voynichese (e.g. patterns like 'daiin.daiin' or 'qokeedy.qokedy.qoteedy.qod').

(14-11-2022, 10:14 PM)Hermes777 Wrote: Recently, for instance, I encountered a study of word pairs by Mark Fincher: Word Pair Permutation Analysis of natural language samples and it’s value for characterizing the ‘Voynich Manuscript’. It is a study of structures between words and Fincher concludes from it:

‘Voynichese’ is not a natural language in it’s own right. If the VMs text is derived from a plaintext in a natural language, it must have undergone some disruption of word order.

But in fact what the study shows is that - in its word order - the text does not behave like running prose. The "disruption of word order" suspected by Fincher might simply be that the plaintext is a set of lists.

I only skimmed through Fincher's paper, but I don't understand the usage of the word "simply" here. Again, without actual examples of lists from readable text, it is hard to understand the argument, but such lists may very well be non-grammatical. If Voynichese is just a long list of words, one should accept Fincher's conclusion that there is no proper underlying linguistic text and this is not a "simple" step: it's the core of a century of discussions about Voynichese.

(14-11-2022, 10:14 PM)Hermes777 Wrote: This is a further attempt to present lines of Voynich text in meaningful or at least suggestive ways. A slight detour from the matter of Vord Paradigms, but part of the same quest.

Any effort to analyze lines based on formal features like this has the potential to reveal interesting patterns, so I'm continuing to enjoy reading about these experiments. But I join Marco in observing that any number of other features could be used to break lines up into similar chunks. So far you've proposed two criteria for defining breaks within lines: (1) transitions between "vowel"-initial and "consonant"-initial vords and (2) words containing gallows. It's worth noting that these two criteria mostly result in different line divisions. Here's one of the lines you cited earlier, divided up according to both of your criteria by way of example:

1. Division by initial "vowel" versus initial "consonant," with vords containing gallows highlighted:

oees.olkeedy.
qockhy.raiin.chol.
okair.oteedy.
qopchedy.
odaiin.ypchedy.ykam-

2. Division by vords containing gallows, with transitions between initial "vowel" and initial "consonant" highlighted:

oees.
olkeedy.
qockhy.raiin.chol.
okair.
oteedy.
qopchedy.odaiin.
ypchedy.
ykam-

The last three vords end up handled the same way, but the first nine vords of the line end up handled quite differently. The "chunks" identified by the first method don't line up with the "chunks" identified by the second method.
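To make the comparison repeatable on other lines, the two criteria can be expressed as sets of break positions and intersected. This is only a sketch under my own assumptions: "vowel" initials are taken to be EVA a, e, i, o, y (treating y as a vowel is what reproduces the first division above), gallows are EVA t, k, p, f, and the function names are hypothetical.

```python
VOWELS = set("aeioy")    # assumed "vowel" initials, including y
GALLOWS = set("tkpf")    # assumed plain EVA gallows


def class_change_breaks(words):
    """Indices where the initial glyph switches between 'vowel' and 'consonant'."""
    return {i for i in range(1, len(words))
            if (words[i][0] in VOWELS) != (words[i - 1][0] in VOWELS)}


def gallows_breaks(words):
    """Indices (after the first vord) of vords containing a gallows glyph."""
    return {i for i in range(1, len(words)) if GALLOWS & set(words[i])}


def chunks(words, breaks):
    """Cut the line at the given break indices."""
    cuts = [0] + sorted(breaks) + [len(words)]
    return [words[a:b] for a, b in zip(cuts, cuts[1:])]


line = "oees.olkeedy.qockhy.raiin.chol.okair.oteedy.qopchedy.odaiin.ypchedy.ykam"
words = line.split(".")
by_initial = class_change_breaks(words)
by_gallows = gallows_breaks(words)
print(chunks(words, by_initial))
print(chunks(words, by_gallows))
print(sorted(by_initial & by_gallows))   # break points the two criteria share
```

Counting shared versus unshared break points over the whole text would give a rough measure of how far the two criteria agree.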

Of course, it's possible that one of the two criteria represents a meaningful break point and the other doesn't. But if we're interested broadly in formal features that seem to be distributed non-arbitrarily within lines (e.g., through clustering, alternation, intrusion, etc.), I suspect that different features will, as a rule, group words differently, into mutually overlapping units, rather than forming consistent, non-overlapping, independent chunks.

If each line were a list, and one type of formal feature (such as a gallows vord) were to mark the start of an entry in the list, it seems that the contents of the entries would still show a lot of patterning *across* entries. I suppose certain kinds of lists could show overlapping groupings and cycles, something like this (which I've modeled after a typical 19th century rural diary):

Tuesday 28, clear and sunny
Wednesday 29, cold, sunny
Thursday 30, cold, rain
Friday 1, cold, rain
Monday 2, cold, rain
Tuesday 3, warm, rain
Wednesday 4, warm but cloudy
Thursday 5, rain
Friday 6, more rain

Still, I'm not sure whether that's a better fit for the rhythms of Voynichese than, say, overlapping categories of case (nominative, accusative, etc.) and part of speech (noun, adjective, etc.).

(13-11-2022, 05:04 PM)MarcoP Wrote:
(13-11-2022, 03:30 PM)pfeaster Wrote: If we don't assume a priori what the glyph categories are, would there be any way to infer them reliably from the line patterns?
This is something that Emma and I investigated in our Cryptologia paper. See in particular p.9 and 10. While we focussed on a specific phenomenon (how the last glyph of words tends to correlate with the first glyph of the following word), it is possible that the issue could be analysed in a more general way, including start-start correlations and end-end correlations. Torsten pointed out a resource that shows an interesting set of tables about this.

The technique used in that Cryptologia paper of comparing actual distributions against the distributions predicted by random shuffling is powerful and can yield extremely interesting results -- as it does there. It can certainly reveal specific preferred and dispreferred pairings: end-start, start-start, end-end, etc.

But I'm not immediately sure how to bring it to bear on what Hermes777 calls "clusters," "alternations," "intrusions," and "bracketing," if those are in fact meaningful distinctions. A "preferred" pairing could be interpreted in multiple ways: a word beginning [Sh] is more likely than average to be followed by a word beginning [q], but that could mean either that [Sh] and [q] are in the same class (and form clusters together) or are in different classes (and frequently alternate or mutually intrude).

Maybe there's a way to attempt this, though. Here are two lines Hermes777 has cited recently:

(1) yees.ykchol.oty.ytor.ytar.ytchor.ytaiin=

(2) shol.chol.shoky.okol.sho.chol.chol.chal-

In (1), [oty] stands out as the only vord in the line that doesn't begin with [y]. In (2), [okol] is the only vord that doesn't begin with [sh] or [ch], which are similar enough for many models to group them together. Both times, the [o]-initial vord appears just to the left of the center of the line.

I guess I could see working out the probability of such lines existing with a random shuffling of vords, and then comparing them against how many such lines actually appear. I've made no effort to check whether they're at all common -- I'm just using this as an example; maybe these are the only two such lines in the whole manuscript. In Hermes777's own analysis, (1) is a "cluster" and (2) is an "intrusion," so from that standpoint, the apparent similarity between these two lines shouldn't turn out to be significant. But the challenge, I suppose, would be to determine whether any patterns of clustering, alternation, intrusion, bracketing, etc. stand out as significantly more common than they "should" be. Clustering is a reasonably expected pattern based on other models. The others, not so much -- so confirming those might be more exciting.
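A rough sketch of what such a shuffle test might look like, purely as an illustration. The definition of a "single intruder" line (all vords but one share an initial class, with [ch] and [sh] grouped together) is my own assumption, as are the function names; shuffling here pools all vords and re-cuts them into lines of the original lengths.

```python
import random
from collections import Counter


def initial_class(word):
    # Assumption: [ch] and [sh] initials are grouped as one class "B"
    return "B" if word.startswith(("ch", "sh")) else word[0]


def has_single_intruder(words):
    """True if all vords but exactly one share the same initial class."""
    counts = Counter(initial_class(w) for w in words)
    return len(words) >= 3 and len(counts) == 2 and min(counts.values()) == 1


def count_intruder_lines(lines):
    return sum(has_single_intruder(line) for line in lines)


def shuffle_test(lines, trials=1000, seed=0):
    """Return (observed count, share of shuffles with a count at least as high)."""
    rng = random.Random(seed)
    observed = count_intruder_lines(lines)
    pool = [w for line in lines for w in line]
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        it = iter(pool)
        # Re-cut the shuffled pool into lines of the original lengths
        shuffled = [[next(it) for _ in line] for line in lines]
        if count_intruder_lines(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (trials + 1)


lines = [
    "yees.ykchol.oty.ytor.ytar.ytchor.ytaiin".split("."),
    "shol.chol.shoky.okol.sho.chol.chol.chal".split("."),
]
print(shuffle_test(lines))
```

Two lines prove nothing, of course; the test only becomes meaningful when `lines` holds the whole manuscript, or a section of it, and the same approach could be adapted to whatever definitions of clustering, alternation or bracketing one settles on.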

I appreciate all comments, suggestions and critiques. At every step there is an awful lot to consider. I work through it slowly and methodically, but necessarily proceeding on hunches and clues in the work of others.

Regarding the ‘parsing’ – or chopping up – of lines of text, it could of course be done on almost any basis. Since our text is not a random mash of glyphs but has structure and design, different divisions are likely to reveal patterns of some sort. If you cut up a Persian rug, each piece will retain some suggestion of the design.

But some divisions are surely more warranted than others. To continue the rug analogy, we can cut with the thread or against it, or follow the lines of the design, or ignore them.

We can, as Marco says, chop up English at every occurrence of the letter E. But why would we? What about the language suggests to do this? Why would we think the letter E has some special status and that breaking the text up accordingly might be revealing?

Whereas in Voynichese we have a very conspicuous set of four glyphs that by various criteria stand out from the others. This is clearly by design. English has no such group begging for our attention. Not by design.

If nothing else, the gallows letters stand out because they stand above. Put simply, they are taller than the other glyphs. They are elevated, on legs, on stilts. They stand apart in a vertical decomposition of the text.

I see three distinct levels in the text which can be described in terms of height, elevation or altitude.

Here I apply a mountain metaphor to it as illustrated by these examples:

[attachment=6969]

[attachment=6970]

[attachment=6971]

[attachment=6972]

It is then a natural observation that some vords have peaks and some do not.

In other words, the breaking of the text I made at every vord with a gallows glyph follows – finally - a visual signal in the morphology of the text itself. That is its justification. The tallness of the four gallows glyphs is what privileges them over other signals.

* * *

Dividing lines on the basis of consonant/vowels is also invited by the text. It follows from other studies that suggest there is a default binary pattern running through the text that seems to conform to the familiar consonant/vowel distinction in natural languages. We describe it, in any case, as consonants and vowels, ever aware that this may not be the right binary. Qonsonants and voyels, indeed.

Actually, this suspected consonant/vowel distinction is embodied in the morphology of the glyphs too. The vowels are confined to the ground level. The consonants – aside from the gallows – have benches or ligatures that mark or intrude into the level I have called the plateau.

An exception is [l]. As an aside: The letter [l] is an unhappy choice in EVA. L is a tall letter whereas the gesture of the Voynich glyph [l] specifically denies any elevated extension. Its gesture, like that of [y], is to go to ground.

Here, to demonstrate, are several lines I used as examples previously. The visual triggers for the breaks are alternations of ground level initial glyphs (vowels) and plateau level (or higher) initial glyphs (consonants).

[attachment=6975]

[attachment=6974]

We can make distinctions and divisions, perhaps, according to certain horizontal criteria - such as the flow of c-curves and backslashes (minims) in Brian Cham's Curve and Line system - but there are meaningful distinctions to be made from the vertical organisation of the glyphs.

[attachment=6976]
Actually, there are 6 different gallows signs. Unless you think that all writers have the same writing tolerance.

(16-11-2022, 12:15 AM)Aga Tentakulus Wrote: Actually, there are 6 different gallows signs

Please note that Glen Claston also came up with 6 different gallows signs. However, he did not make the above distinction, but considered that if the flourish of the Eva-p and Eva-f (going to the left) made a turn back to the right, then these are different characters.

This just shows that all of this is subjective.

(16-11-2022, 12:21 AM)ReneZ Wrote: Please note that Glen Claston also came up with 6 different gallows signs.

12 and I'm not counting the pedestalled ones:
EVA f: f u w
EVA k: h W û
EVA p: g j é
EVA t: k ò ô

[attachment=6977]
@Rene
I know what you mean.
When Lisa Fagin Davis presented her work on the different scribes, I presented the possibility that these variants have a meaning and are probably based on the encryption technique of the combination.
So there would be a myriad of possibilities which makes the text much more complex than it looks at first glance.

(16-11-2022, 12:38 AM)nablator Wrote:
(16-11-2022, 12:21 AM)ReneZ Wrote: Please note that Glen Claston also came up with 6 different gallows signs.

12 and I'm not counting the pedestalled ones:
EVA f: f u w
EVA k: h W û
EVA p: g j é
EVA t: k ò ô

Indeed, and extended Eva has even more, but only six of the ones of GC appear at moderate to high frequency.

Finer distinctions of the gallows glyphs are irrelevant to the argument I presented. The distinction I made was based on the height of glyphs, and by any count the gallows glyphs - by definition - are taller than the others. It makes no difference whether we count 4, 6, 8, 12 or a hundred different variants, they are all taller than the other glyphs, a fact we can measure in millimetres if so required. There is nothing subjective about it. There is no cause to make the glyph set problematic on this criterion and it makes no difference to the vertical distinctions I was making. Have as many different gallows glyphs as you like!

It's not about millimetres, it's about differences like (V + U) + (J + L).

Not wanting to see something is easier than revising it.
