The Voynich Ninja
Character entropy of Voynichese - Printable Version



RE: Character entropy of Voynichese - Torsten - 16-12-2017

(16-12-2017, 10:39 PM)Anton Wrote: Hi Torsten, what I'm speaking of is not gallows characters as such, but the "gallows coverage": the phenomenon whereby the gallows' loop covers the subsequent characters. It's discussed in [link] and subsequent posts. I'm not aware of anybody having systematically investigated it to see whether it follows any rules.

This is an embellishment for initial gallows glyphs.


RE: Character entropy of Voynichese - Anton - 17-12-2017

Nope.

It's true that gallows coverage is most often seen in line-initial vords, but:

1) The gallows exhibiting coverage is not always the first character of the vord. See e.g. f30r, 2nd paragraph.
2) There are cases where the initial gallows is drawn in such a way as to show the absence of coverage. See e.g. f32r, 2nd paragraph, with an unnaturally narrow loop of f in fcho.
3) There are cases where gallows coverage is exhibited in non-line-initial vords. Loops covering one character can be found quite frequently, but there are also loops covering many characters. See e.g. f87r, line 1.
4) The gallows coverage is also found in labels, at least for t. See f68r1.


RE: Character entropy of Voynichese - -JKP- - 17-12-2017

(16-12-2017, 10:39 PM)Anton Wrote: Hi Torsten, what I'm speaking of is not gallows characters as such, but the "gallows coverage": the phenomenon whereby the gallows' loop covers the subsequent characters. It's discussed in [link] and subsequent posts. I'm not aware of anybody having systematically investigated it to see whether it follows any rules.

I've been working on it for years and the rewards are small.

The problem with these patterns in the VMS (and there are quite a few of them that I've been systematically working through) is that one can spend weeks, sometimes months, looking into each one and come to the conclusion that a specific variation is probably not meaningful, and then one starts on another and gets the same result.

For example, I spent quite a long time trying to determine if the length of the tail or its angle was meaningful for the EVA-y character and it APPEARS that it is not, but... I'm only about 75% sure.


RE: Character entropy of Voynichese - Anton - 17-12-2017

I'm speaking not of any explanation of the patterns (which would be nice of course), but of systematically describing them, to begin with.


RE: Character entropy of Voynichese - Torsten - 17-12-2017

(17-12-2017, 02:05 PM)Anton Wrote: I'm speaking not of any explanation of the patterns (which would be nice of course), but of systematically describing them, to begin with.

Unique and ambiguous glyphs are just normal for the VMS. On nearly every page one or two weird glyphs exist. See for instance this German page: [link].

It is hard to find a pattern behind unique elements. It seems as if the scribe was sometimes testing different design variants for the glyphs used. Sometimes weird glyphs are used to fill gaps. It seems as if these glyphs are used for layout reasons. The weird glyphs are one of the reasons for the idea that the glyphs are used to draw an image of a text.


RE: Character entropy of Voynichese - Anton - 17-12-2017

Yes, we have a whole thread about unusual glyphs with many examples by Wladimir: [link]

To find a pattern behind that, one should think over the way the Voynich glyphs are designed and why they are designed in that very manner. There are no definite answers to these questions at present, but some discussion is here: [link]

But the gallows coverage is not something "unusual" from this perspective. Except for rare "irregularities", such as additional loops or dots inside loops, it is, on the contrary, quite usual: the same glyphs just exhibit a different reach of the loop. So it makes sense, in the first place, to record their usage and see if there are any patterns. For example, it looks as if (labels apart) multi-character gallows coverage is exhibited exclusively in the first lines of paragraphs.


RE: Character entropy of Voynichese - Torsten - 17-12-2017

For the VMS you can find on every level familiar elements that can change into something uncommon and therefore unexpected. That is a pattern for the VMS.

On glyph level there are rarely used variants for which it is unclear if they are new glyphs or variants of commonly used glyphs. On word level there are rarely used words which look quite similar to frequently used ones. For instance the word daiiny occurs only once. At first sight it seems unclear if it is a misspelled variant of the more common word daiin or a new word. Even on page level you can probably describe for each page something typical only for this particular page. And on language level we have the shift from Currier A to Currier B.

It seems that the existence of exceptions is a rule for the VMS.


RE: Character entropy of Voynichese - Anton - 03-06-2018

To comment upon Helmut's [link], but without introducing off-topic into that thread.

There are some considerations against Voynichese being just a highly abbreviated plain text (Latin or otherwise), namely from the perspective of its statistical properties.

Let's consider an example with the declaration of the universally neglected human rights. Here's its text in Latin normalized as follows: all headers removed, all characters converted to lower case, all punctuation removed, all "v" characters substituted with "u" (so as to bring the text closer to the medieval practise), and also there are no "j" or "w" characters and no digits.
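The normalization described above could be sketched in Python roughly as follows. This is only an illustration: the function name `normalize` is made up here, header removal is assumed to have been done by hand beforehand, and collapsing runs of whitespace into single spaces is an assumption that the post does not state.

```python
import re

def normalize(text: str) -> str:
    """Hypothetical helper: lower-case, map v -> u, drop punctuation and digits."""
    text = text.lower().replace("v", "u")
    # Keep only the letters a-z and whitespace; punctuation and digits are dropped.
    # (Accented or other non-ASCII letters would be dropped too.)
    text = re.sub(r"[^a-z\s]", "", text)
    # Assumption: runs of whitespace, including line breaks, become one space.
    return " ".join(text.split())

print(normalize("Veni, vidi, vici! 47"))
```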

[link]
The calculated entropies for this sample are: h1 = 3.95, h2 = 3.16 (both of which are notably higher than for Voynich).
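For reference, here is a minimal Python sketch of how figures of this kind can be computed. It assumes that h1 is the single-character entropy in bits and that h2 is the conditional entropy of a character given the one before it (pair entropy minus single-character entropy). The post does not spell out the definition or the tool used, so this is an assumption and the numbers need not match exactly. Spaces are counted as ordinary characters.

```python
from collections import Counter
from math import log2

def entropy(counts: Counter) -> float:
    """Shannon entropy in bits of the distribution given by raw counts."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def h1(text: str) -> float:
    """Single-character entropy."""
    return entropy(Counter(text))

def h2(text: str) -> float:
    """Conditional entropy: H(next | previous) = H(pair) - H(previous)."""
    pairs = Counter(zip(text, text[1:]))   # adjacent character pairs
    firsts = Counter(text[:-1])            # first character of each pair
    return entropy(pairs) - entropy(firsts)

sample = "abababab"
print(round(h1(sample), 2), round(h2(sample), 2))
```

In the toy sample each character fully determines the next one, so the conditional entropy is zero even though the single-character entropy is one bit.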

Let's now introduce some abbreviation to see in which direction it moves the figures. Let's replace every word-initial "con" with "9". There are only 37 occurrences, so the influence will not be dramatic, but we'll observe the direction.

The results are: h1 = 3.96, h2 = 3.18.

Let's add more abbreviation and replace every word-final "us" with "9" (which I think is another common Latin abbreviation). There are 100 occurrences.

The results would be: h1 = 3.99, h2 = 3.20.

So the tendency is that introducing abbreviation increases both h1 and h2, rather than decreasing them as we would like.
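The two abbreviation steps above can be sketched with regular expressions. The function name `abbreviate` is hypothetical, and the use of `\b` assumes that words are delimited by spaces or the text boundaries, as in the normalized sample.

```python
import re

def abbreviate(text: str) -> str:
    """Replace word-initial 'con' and word-final 'us' with the sign '9'."""
    text = re.sub(r"\bcon", "9", text)   # word-initial "con"
    text = re.sub(r"us\b", "9", text)    # word-final "us"
    return text

print(abbreviate("consensus populus"))
```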

What would decrease h1 is, e.g. expanding certain glyphs into sequences of other glyphs, which could still be visually detected and discerned from one another by a reader proficient in Latin. Returning to our original sample, let's expand all instances of "a" into "ci". There are 667 occurrences of "a", so we hope to see some effect.

The results are: h1 = 3.79, h2 = 2.99.

So we have a decrease in both. Let's add some more expansion and replace "n" with "ii". I think that anyone who knows the trick will still be able to read the original text from this. This is actually the same as the problem with interpreting Voynichese glyphs, i.e. is iin one glyph or three glyphs in succession?

Results: h1 = 3.54, h2 = 2.93.
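The expansion step is a plain string substitution. A minimal sketch, with the hypothetical name `expand`:

```python
def expand(text: str) -> str:
    """Expand 'a' into 'ci' and 'n' into 'ii'.

    Neither replacement produces an 'a' or an 'n', so the order of the
    two steps does not matter.
    """
    return text.replace("a", "ci").replace("n", "ii")

print(expand("anima"))
```

The expanded text uses fewer distinct characters and piles more occurrences onto "c" and "i", which is one plausible reason why h1 drops.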

So while this expansion has a distinct effect on h1 (which is now lower than for Voynich), h2 still behaves more sluggishly. What would significantly reduce h2 is, of course, the introduction of nulls. Let's add, in the current sample, a "0" after each "e". There are 1018 occurrences, so the result should be immediate.

Results: h1 = 3.65, h2 = 2.66.

Let's proceed further and add a null character "0" after each "t" in addition.

Results: h1 = 3.62, h2 = 2.57.

This indicates what might be the trick with Voynichese - spaces may be effectively "nulls" (should be ignored), plus some nulls are added after most frequent letters (or, probably, since the concept of letter frequency was not known back then, just a good deal of nulls is added without relation to a particular letter, but instead according to some pre-defined rule).
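The null insertion used in this experiment can be sketched as follows. The function name `add_nulls` and its parameters are made up for illustration; the rule "a null after each given letter" is the one from the post, not a claim about how Voynichese was actually produced.

```python
def add_nulls(text: str, letters: str, null: str = "0") -> str:
    """Insert the null sign after every occurrence of each listed letter."""
    for ch in letters:
        text = text.replace(ch, ch + null)
    return text

# First after each "e", then additionally after each "t".
print(add_nulls("et te", "e"))
print(add_nulls("et te", "et"))
```

After this step every "e" (and "t") is followed by the same sign, so the next character is fully predictable at those positions, which pulls h2 down.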


RE: Character entropy of Voynichese - Anton - 03-06-2018

As a further example, I just found the "Bennett" transcription of f1r. It is not a literal transcription by Bennett; it is a transcription of mine, but using the alphabet that Bennett adopted for his studies. This was part of the work that I began but then neglected.

[link]
After normalization (remove linebreaks), we have h1 = 3.84, h2 = 2.14, h1-h2 = 1.7.

If we remove all spaces, we get h1 = 3.85, h2 = 2.34, h1-h2 = 1.51.
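A sketch of this comparison in Python is given below. It assumes that "remove linebreaks" means joining lines with a single space, and that h2 is the conditional entropy of a character given its predecessor; both are assumptions, and the helper names `flatten` and `stats` are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(counts: Counter) -> float:
    """Shannon entropy in bits of the distribution given by raw counts."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def stats(text: str) -> tuple:
    """Return (h1, h2, h1 - h2) for the text."""
    h1 = entropy(Counter(text))
    # Conditional entropy: H(pair) - H(first character of the pair).
    h2 = entropy(Counter(zip(text, text[1:]))) - entropy(Counter(text[:-1]))
    return h1, h2, h1 - h2

def flatten(text: str, keep_spaces: bool = True) -> str:
    """Join words across line breaks, with or without spaces between them."""
    words = text.split()   # splits on spaces and line breaks alike
    return (" " if keep_spaces else "").join(words)

transcription = "ab cd\nef ab"   # placeholder, not real transcription data
print(stats(flatten(transcription)))
print(stats(flatten(transcription, keep_spaces=False)))
```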


RE: Character entropy of Voynichese - Anton - 03-06-2018

If we additionally remove all o, which is sometimes suggested to be a null, then we have h1 = 3.79, h2 = 2.43, h1-h2 = 1.36. This is slightly better, but still far from good compared with natural languages (the Latin sample above gives h1-h2 = 3.95 - 3.16 = 0.79).

Technically, what one could do in this direction is to remove different characters one at a time and see how each removal affects h1-h2, in an attempt to detect a possible null. (It's unlikely that a medieval scholar would use more than one character as a null.) But the results will of course depend on the transcription, so different transcriptions need to be considered.
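Such a scan might look like the sketch below. It removes each distinct character in turn and ranks the candidates by the resulting h1-h2, smallest first, on the assumption (taken from the comparison above) that a smaller gap is closer to natural-language behaviour. The function names are hypothetical and h2 is again assumed to be the conditional entropy of a character given its predecessor.

```python
from collections import Counter
from math import log2

def entropy(counts: Counter) -> float:
    """Shannon entropy in bits of the distribution given by raw counts."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def gap(text: str) -> float:
    """h1 - h2, with h2 = H(pair) - H(first character of the pair)."""
    h1 = entropy(Counter(text))
    h2 = entropy(Counter(zip(text, text[1:]))) - entropy(Counter(text[:-1]))
    return h1 - h2

def rank_null_candidates(text: str) -> list:
    """Remove each distinct character in turn; return (char, gap) pairs,
    sorted so that the smallest remaining gap comes first."""
    results = []
    for ch in sorted(set(text)):
        reduced = text.replace(ch, "")
        if len(reduced) > 1:   # need at least one character pair
            results.append((ch, gap(reduced)))
    return sorted(results, key=lambda item: item[1])

for ch, g in rank_null_candidates("a0b0a0b0"):   # toy input, not real data
    print(repr(ch), round(g, 2))
```

Running this over several transcriptions of the same folio and comparing the rankings would address the dependence on transcription mentioned above.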