The Voynich Ninja

Full Version: Glyph counts between Gallows
Pages: 1 2 3 4
I looked at the count of glyphs that appear between the gallows glyphs, benched and un-benched, on all lines of the text, and ignoring spaces between words. 

One goal was to see whether the counts supported the idea that the benched gallows are just another way of writing their un-benched versions: the evidence I found doesn't strongly support that. 

Another goal was to see whether the distributions of the number of glyphs following EVA p, k, f and t differ. It turns out that statistically there is a difference: EVA t, k tend to be followed by 5 glyphs before the next gallows is written, and EVA f, p tend to be followed by 6 or 7 glyphs. For the benched gallows, the statistics are poorer and less compelling.

Is this difference between counts for EVA t, k  and EVA f, p somehow related to the extra flourish that EVA f, p have - I call it the "curlicrossbar" (what is the correct terminology?)? 

Another feature that is revealed is the perhaps well known paucity of occurrences of two gallows next to one another.

Here's the post about it: [link].

Your thoughts on this and suggestions for further investigations along these lines would be most welcome!
Posted on your site a couple of avenues to follow for potential explanations. Let us know if any bring a solution or if the problem remains.
Hi Julian, great to see you posting again!

Did you use unmodified Eva? Do you think certain different parsing choices would impact the results? For example ch, sh, iin... could be meant as single glyphs.

Obviously the overall numbers would be lower if things like the bench are treated as one glyph instead of two. But might this also change the relation between the different gallows?
Hi Julian. I wasn't involved with the VMs when you were last active, but I've read your whole blog through a couple of times and am a big fan of your work. Let's just say I have a very good feeling about that recent paper by Rene Zandbergen. I'm not much one for cargo cults, but it seems like this paper is attracting a number of serious veteran VMs researchers back to the discussion table. I'm half expecting Prof Jorge Stolfi to show up and comment on it. If he does, I'd feel comfortable saying a breakthrough is not far away.

I lied. I love a good cargo cult. If everybody could help me out by saying his name in their comments, it makes the mojo stronger, and he's more likely to find this thread and show up, by the power of SEO.

julian's blog Wrote: The number of glyphs that follow a benched Gallows is typically 3, before the next Gallows is encountered. For un-benched Gallows, this number is typically 5 to 7. This does not enthusiastically support the hypothesis that the benched Gallows are simply the same as un-benched gallows with the two adjoining glyphs written separately. For example, take a common sequence, with 7 intervening glyphs:
pxxxxxxxp

is clearly not equivalent to the common sequence for benched of:

PxxxP

since PxxxP written as un-benched would be

cphxxxcph
which only has 5 intervening glyphs.
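The quoted example can be checked mechanically. This is only a minimal sketch: it assumes, as the quote does, that a capital letter stands for a benched gallows and that unbenching means writing c + gallows + h. The function names `unbench` and `gaps` are made up for illustration and are not Julian's code.

```python
GALLOWS = set("ktpf")  # un-benched gallows in EVA

def unbench(text):
    """Expand each capital gallows (stand-in for a benched gallows) to c + gallows + h."""
    return "".join(f"c{ch.lower()}h" if ch in "KTPF" else ch for ch in text)

def gaps(text):
    """Return (gallows, number of intervening glyphs) for each pair of consecutive gallows."""
    positions = [i for i, ch in enumerate(text) if ch in GALLOWS]
    return [(text[a], b - a - 1) for a, b in zip(positions, positions[1:])]

seven = "p" + "x" * 7 + "p"      # the un-benched example with 7 intervening glyphs
print(gaps(seven))               # [('p', 7)]
print(unbench("PxxxP"))          # cphxxxcph
print(gaps(unbench("PxxxP")))    # [('p', 5)]
```

The 5 comes from the three x glyphs plus the trailing h of the first bench and the leading c of the second.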


First of all, my working definition of unbenching: Finding the ngram that's likely a benchless equivalent of a benched gallows glyph. By way of analogy, if my goal was to make a German text diacritic-free, I would change every occurrence of Bär to Baer.

I think there are different possible ways unbenching could be done, that may be a bit more promising. These days, in reference to Koen's comment, I'm inclined to treat EVA ch as a single glyph. In fact, I'll go out on a limb and propose that if it weren't for benched gallows, EVA ch being one inseparable entity would be a far less controversial idea. Therefore, I'm inclined to explore a method of unbenching which transforms cGh to Gch. I'm planning on running an experiment to test this possibility soon, using the statistics from voynichese.com. My working hypothesis is that if cGh is the equivalent of Gch, then the count ratios of cKh:Kch, cTh:Tch, cPh:Pch, and cFh:Fch should be roughly equal.

Results:
cKh 907 : Kch 1093 = 0.8298
cTh 945 : Tch 992 = 0.9526
cPh 217 : Pch 752 = 0.2886
cFh 74 : Fch 193 = 0.3834

Hmm. Not quite as promising as I'd hoped. But I'm still intrigued by the fact that both of the two-legged gallows give a similar ratio to each other, as do both of the one-legged gallows.
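For anyone who wants to re-run the arithmetic, the ratios above can be recomputed from the posted counts. This is just a sketch; the dictionary layout is an arbitrary choice, and the counts are copied from the results list, not re-derived from voynichese.com.

```python
# (benched count cGh, un-benched count Gch), as posted above
counts = {
    "K": (907, 1093),
    "T": (945, 992),
    "P": (217, 752),
    "F": (74, 193),
}

ratios = {g: benched / plain for g, (benched, plain) in counts.items()}

for g, r in ratios.items():
    print(f"c{g}h : {g}ch = {r:.4f}")
```

The two-legged gallows (K, T) land between 0.83 and 0.95, while the one-legged gallows (P, F) land between 0.29 and 0.38.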

Of course there's the old chestnut of why one-legged gallows prefer top lines of paragraphs so strongly. The idea I've been most given to entertaining recently is Nick Pelling's idea that P = Te and F = Ke. And then there's Brian Cham and Patrick Feaster's independent observation that a vord's curve glyphs (ch, sh, ee, and e) almost always occur together in the first half of a vord, with no glyphs interrupting the series except maybe a gallows. Connected curves (ch or sh) in a vord's series almost always precede unconnected curves (ee and e). A gallows, which occurs in surprisingly close to exactly half of all vords, typically either precedes the entire series of curves, or interrupts it near its beginning, where the connected curves are. This "interruption" can take two forms: either between two ch's (or a ch and a sh), or benching one of the ch's.

Taking all this into account, I'll stand on a very skinny limb and propose that cPh = Tche, and cFh = Kche. I'll crunch these numbers later and see how they compare to the figures above.

Another area for improvement is to exclude Grove words from all the counts. I don't think the gallows that make Grove words have anything to do with the gallows that occur anywhere else in the manuscript.
(30-07-2021, 09:23 PM)Koen G Wrote: Hi Julian, great to see you posting again!

Did you use unmodified Eva? Do you think certain different parsing choices would impact the results? For example ch, sh, iin... could be meant as single glyphs.

Obviously the overall numbers would be lower if things like the bench are treated as one glyph instead of two. But might this also change the relation between the different gallows?

Hi Koen: thanks. I used Takeshi. It's true that the counts would go down if some of those glyph sequences you mention are treated as single glyphs, and it's possible that their probability of occurring is different depending on the preceding gallows. I haven't looked at that, preferring to leave the decision on what is a different glyph to Takeshi-san (a cop-out, I know).
(30-07-2021, 11:31 PM)RenegadeHealer Wrote: Hi Julian. I wasn't involved with the VMs when you were last active, but I've read your whole blog through a couple of times and am a big fan of your work. Let's just say I have a very good feeling about that recent paper by Rene Zandbergen. I'm not much one for cargo cults, but it seems like this paper is attracting a number of serious veteran VMs researchers back to the discussion table. I'm half expecting Prof Jorge Stolfi to show up and comment on it. If he does, I'd feel comfortable saying a breakthrough is not far away.

I lied. I love a good cargo cult. If everybody could help me out by saying his name in their comments, it makes the mojo stronger, and he's more likely to find this thread and show up, by the power of SEO.

(Interesting analysis clipped)

Another area for improvement is to exclude Grove words from all the counts. I don't think the gallows that make Grove words have anything to do with the gallows that occur anywhere else in the manuscript.

Hi RenegadeHealer,

Thanks for this very interesting reply, and for mentioning STOLFI! 

As mentioned in a comment to Emma on the blog, the motivation for looking at counts of intervening glyphs between gallows was the pipe-dream that the gallows' appearances are somehow connected to the positions of a cipher wheel as it is rotated to encode a new plaintext word. So, Rene's arXiv paper about Rugg's grilles and cipher wheels resonates strongly with me - it's very exciting! 

Emma also suggested removing Grove words (this refers to words which begin with a gallows glyph and are at the start of a page, and which were first identified as curious by a guy called Grove, if I'm not mistaken?).

It may be that the gallows glyphs in the Grove words are related to (or maybe even define) the starting position(s) of the cipher wheel(s) that are going to be used for encoding that folio.

Rene has shown that the grille/wheel method, with carefully positioned glyphs on the wheel, can produce VMS words that look right i.e. have the construction identified by Stolfi. The big question in my mind is how the apparatus would actually be used to cipher a Latin plaintext :-)
Hi Julian,
I am trying to reproduce your results as a first step to further explore the interesting observations you made.

I am getting slightly higher counts than you. This is what I meant to do (as always, I may have made errors):
  • I processed [link].
  • I kept all lines (labels included).
  • Lines were not merged.
  • I treated the unreadable character '?' as an ordinary character.
  • I counted overlapping matches (kalshedykchedychT results in two counts for k:7).

The maximum values I get are: f:6(53) p:6(168) k:5(1352) t:5(757)
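The counting procedure in the list above can be sketched roughly as follows. This is not the attached script, only an illustration under stated assumptions: words are separated by spaces or dots, benched gallows are already written as single capital letters (as in the kalshedykchedychT example), and the name `gap_histogram` is invented here.

```python
from collections import Counter

def gap_histogram(lines, gallows="ktpfKTPF", separators=" ."):
    """Count (gallows, intervening glyphs) pairs, line by line.

    Lines are never merged, word separators are ignored, '?' counts as an
    ordinary glyph, and a gallows that ends one gap also starts the next
    (the overlapping matches mentioned above).
    """
    hist = Counter()
    for line in lines:
        glyphs = "".join(ch for ch in line if ch not in separators)
        prev = None  # position of the previous gallows on this line
        for i, ch in enumerate(glyphs):
            if ch in gallows:
                if prev is not None:
                    hist[(glyphs[prev], i - prev - 1)] += 1
                prev = i
    return hist

hist = gap_histogram(["kalshedykchedychT"])
print(hist[("k", 7)])  # 2
```

The maximum for each gallows is then the distance with the highest count among the keys starting with that gallows.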

I attach the linux scripts I used and the output csv.
(30-07-2021, 06:16 PM)julian Wrote: One goal was to see whether the counts supported the idea that the benched gallows are just another way of writing their un-benched versions: the evidence I found doesn't strongly support that.

I've tried several ways of moving the gallows (unbenching being a particular case) then looking for improvement on word pair statistics. No good result so far.

Quote: Another goal was to see whether there are differences in the distributions of number of glyphs following EVA p, k, f and t. It turns out that statistically there is a difference: EVA t, k tend to be followed by 5 glyphs before the next gallows is written, and EVA f, p tend to be followed by 6 or 7 glyphs.

That's probably because words are longer (or p/f are inserted) on the first lines of paragraphs.
(31-07-2021, 11:53 AM)nablator Wrote:
julian Wrote: Another goal was to see whether there are differences in the distributions of number of glyphs following EVA p, k, f and t. It turns out that statistically there is a difference: EVA t, k tend to be followed by 5 glyphs before the next gallows is written, and EVA f, p tend to be followed by 6 or 7 glyphs.

That's probably because words are longer (or p/f are inserted) on the first lines of paragraphs.

It seems to me that p/f words are longer because they include a bench much more often than k/t words. If benches are represented as single characters (e.g. C and S) instead of two characters (ch and sh), the next-gallows-distance has a maximum at 5 characters for all gallows.

EDIT: if these figures are correct, they confirm Koen's idea about the impact of parsing.
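The re-parsing described above amounts to a simple substitution before counting. A minimal sketch, assuming plain EVA text where ch and sh are two characters each; the sample string is made up to show the mechanics and is not manuscript data.

```python
def collapse_benches(text):
    """Rewrite the two-character benches ch and sh as single characters C and S."""
    return text.replace("ch", "C").replace("sh", "S")

def gap_lengths(text, gallows="ktpf"):
    """Number of glyphs between each pair of consecutive gallows."""
    pos = [i for i, ch in enumerate(text) if ch in gallows]
    return [b - a - 1 for a, b in zip(pos, pos[1:])]

sample = "pchedyshedyk"                       # made-up sequence with two benches
print(gap_lengths(sample))                    # [10]
print(collapse_benches(sample))               # pCedySedyk
print(gap_lengths(collapse_benches(sample)))  # [8]
```

Each bench between two gallows shortens the distance by one, so gallows that are followed by benches more often are shifted more by this choice of parsing.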
(31-07-2021, 12:13 AM)Renegade Healer Wrote: Another area for improvement is to exclude Grove words from all the counts. I don't think the gallows that make Grove words have anything to do with the gallows that occur anywhere else in the manuscript.



Hi, Julian:

It's great to see you with a new posting.  I continue to mull over what your results could mean.

One thing I would be curious about is whether there is a difference in your numbers between Currier A and Currier B.  Of course, breaking it up could reduce the numbers such that the results are less reliable.  Apologies if Marco did this -- I have not reviewed his data yet.

Also, in a more general sense -- does the distance between occurrences of any letter in -- say English or Latin -- show the same sort of peak around 3 or 5 with a long tail?

I assume this would be connected to frequency.  If so, what frequency in English or Latin is needed to exhibit a similar behavior? 

Or is this gallows glyph behavior one of the "non-language" characteristics that is not mimicked by any particular single letters in, say -- English or Latin?  Is this kind of distance measurement so similar to secondary entropy that it is particular to Voynichese (or other languages with low entropy --- etc., etc. -- no need to test)? 

As for Grove words -- my understanding of their definition is that it is all words that have a gallows glyph as the first(?) glyph and if that glyph is removed, a "valid" (e.g. seen elsewhere in the manuscript) word remains.  I don't think word position in the paragraph is involved, but I could be wrong.
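Under the definition as I understand it above (which may not match Grove's original one), candidate Grove words could be flagged like this. It is a sketch only: `grove_candidates` is a made-up name, and the word list is a toy example, not counts from the manuscript.

```python
def grove_candidates(words, gallows="ktpf"):
    """Words that start with a gallows and are still attested words without it."""
    vocabulary = set(words)
    return sorted(
        w for w in vocabulary
        if len(w) > 1 and w[0] in gallows and w[1:] in vocabulary
    )

# Toy vocabulary for illustration only
words = ["pchedy", "chedy", "kaiin", "daiin", "tol", "ol"]
print(grove_candidates(words))  # ['pchedy', 'tol']
```

Here kaiin is not flagged because aiin is absent from the toy list. Adding a position test (for example, first word of a paragraph or page) would be a separate filter on top of this.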

I agree it would be interesting to see how the graphs change when such examples of gallows use are removed from your numbers.

The theory is that these kinds of gallows glyphs are more likely to serve some other function than "the same" substitution proposed by that same gallows glyph use in other word environments.  This is because in "non-Grove" words the glyph use is what turns the word into a "valid" word and therefore is more likely to be representing some letter or group of letters in the plaintext.

But "Grove word" gallows glyphs theoretically have a greater chance of having some sort of non-letter function (paragraph signal/topic signal/item signal/punctuation-like), or maybe of being a signal to interpret the substitution going forward in some particular way.

But this is just my impression from conversations on the board and elsewhere and I could have misinterpreted.

Thanks for considering doing some of these additional analyses and I will be reviewing your and Marco's data with great interest.

Michelle