The Voynich Ninja
Character entropy of Voynichese - Printable Version




RE: Character entropy of Voynichese - Anton - 03-11-2016

I planned to write a high-level tutorial post on this back in spring and approximately half of it is already there in my drafts.

Hope I'll force myself to finish and post it :)

That's not just maths but specifically a piece of information theory.


RE: Character entropy of Voynichese - Koen G - 03-11-2016

That would be highly appreciated since my brain shuts down when it sees a logarithm but this seems like something crucial to fully understand when discussing Voynichese language.


RE: Character entropy of Voynichese - Anton - 03-11-2016

It just occurred to me that it might be useful if I post at once the portion that is written already. It does not come to entropy yet, but it deals with some preliminary issues (such as logarithms ;) ). When the rest is written, I'll just add it. Thx for reminding me :)


RE: Character entropy of Voynichese - Davidsch - 03-11-2016

Very good analytical posting Anton.

Numbers
A standard book always contains many numbers, like chapter numbers, line numbers, references, page numbers and other counts for measurements like weight, age, number of trees, number of stars, cost of things, etc. In my research I always remove any reference to a number that is written in digits, because numbers are a pain in the ass when you want to research text: how to interpret them? If you write numbers as text, like twelve hundred, it is still text and poses no problem.

The whole entropy discussion is really an old and only very rough indication by log(n). In number theory, for example, if you used entropy you would be the laughing stock of the community. It is something like using an old mercury thermometer to measure temperature. But the indication itself is nice for establishing a definition of warm and cold.

The entropy itself can be made more complex by adding more variables from outside. But that makes it even more useless as a general indication: the more variables you add, the more specific it becomes, and the worse it becomes as a general measure.

For the Voynich manuscript, clinging to old theories, old methods and old ideas has never proven to be fruitful. My system is far better, simpler and more advanced than the entropy comparison. But of course, I do not have the stamp of "renowned scientist", and therefore the chances that it will be picked up are incredibly slim. For me that is not important, but I would really like it if people would remain equally critical of these so-called (old) scientists and their research.

For example, present-day journalists and editors are not critical: 95% of all text is just copied and printed without a single effort of research.

Anyway, I'm looking forward to your piece, Anton


RE: Character entropy of Voynichese - Anton - 03-11-2016

Here it is: [link]


RE: Character entropy of Voynichese - Emma May Smith - 03-11-2016

(03-11-2016, 04:02 PM)Anton Wrote: My further thought, a trivial one, is that (as already expressed above) even within the same language the character entropy will be alphabet-dependent, i.e. dependent on how the alphabet is constructed. So what one needs is to check variations in Voynichese entropies with different approaches to constructing the Voynichese alphabet. Actually, if one manages to "construct" a Voynichese alphabet such that the differential entropies would be close to those of natural languages, that would be a huge step toward decrypting Voynichese.

This is a very interesting statement. I'm not certain if it is true but the possibility is intriguing. I would only add that we must have some reasons for modifying the alphabet in order to achieve a better set of entropies.

I know Rene has spoken in the past about how different inputs have changed the entropy measurements of the Voynich text, though even the 'best' was still outlying to natural languages.
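
The alphabet dependence discussed here can be illustrated with a minimal Python sketch. It assumes a made-up, loosely EVA-like sample string (not a real transcription) and an arbitrary choice of glyph groups to merge; `shannon_entropy` and `tokenize` are illustrative names, not anything used by the posters.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """First-order Shannon entropy, in bits per token."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def tokenize(text, groups=()):
    """Split text into tokens, treating each string in `groups` as one character.

    Longer groups are tried first; anything unmatched becomes a
    single-character token.
    """
    ordered = sorted(groups, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(text):
        for g in ordered:
            if text.startswith(g, i):
                tokens.append(g)
                i += len(g)
                break
        else:
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical sample, loosely EVA-like; NOT a real transcription.
sample = "qokeedy.qokedy.chedy.shedy.daiin.okaiin.chol.shol.qokain"

plain = tokenize(sample)
# Assumed groupings, chosen only for illustration.
merged = tokenize(sample, groups=("ch", "sh", "qo", "ii"))

# Alphabet size and entropy per token under each alphabet.
print(len(set(plain)), round(shannon_entropy(plain), 3))
print(len(set(merged)), round(shannon_entropy(merged), 3))
```

The same text yields different alphabet sizes and different entropy values depending on how glyphs are grouped, which is the effect that would need checking against a real transcription.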


RE: Character entropy of Voynichese - -JKP- - 04-11-2016

(03-11-2016, 04:27 PM)Anton Wrote: By the way, neither of Bennett's calculations for natural languages incorporates digits, which would add ten additional characters to the alphabet. Of course, for texts such as "Hamlet" this is not crucial, since digits, if they occur at all, are very rare there and do not influence the end result. But what about, e.g., an apothecary's notebook?


Yes, all good points.

On the latter point, most apothecarial manuscripts have numbers, as do many with remedies and healing charms. For what it's worth, most of the ones I've seen are single digits (when expressed with Arabic numbers).
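
To make the point about digits concrete, here is a rough Python sketch that compares alphabet size and first-order entropy with and without digits. The sample line is invented for illustration and `char_stats` is a hypothetical helper; the only fixed facts are the upper bounds log2(26) and log2(36), roughly 4.70 and 5.17 bits.

```python
import math
from collections import Counter


def char_stats(text, keep_digits=True):
    """Return (alphabet size, first-order entropy in bits) over letters and, optionally, digits."""
    chars = [c for c in text.lower() if c.isalpha() or (keep_digits and c.isdigit())]
    counts = Counter(chars)
    total = len(chars)
    h = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return len(counts), h


# Hypothetical recipe-like line, invented for illustration.
line = "take 3 drams of rue, 12 leaves of sage and 4 parts of honey"

print(char_stats(line, keep_digits=True))
print(char_stats(line, keep_digits=False))

# Upper bound on first-order entropy: 26 letters vs 26 letters plus 10 digits.
print(round(math.log2(26), 2), round(math.log2(36), 2))
```

How much the measured entropy actually shifts depends on how frequent digits are in the text, which is why it matters little for "Hamlet" but could matter for a notebook full of quantities.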


RE: Character entropy of Voynichese - stellar - 04-11-2016

(25-01-2016, 09:28 AM)ReneZ Wrote: The entropy is anomalously low, independent of the transcription alphabet. Indeed, there are different results depending on which alphabet is used, but these differences are significantly smaller than the anomaly observed.
Bennett in the 1970s (who first noted the anomalous values) used something similar to Currier's alphabet. Certainly, Eva should not be used for this type of analysis.


Indeed, the way to arrive at something similar to a 'normal' language is by compressing the Voynich text, i.e. combining characters.


However, this is just part of the picture.
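
One common reading of "entropy" in this context is the conditional (second-order) character entropy: how uncertain the next character is once the current one is known. A minimal Python sketch of that quantity follows, assuming this reading applies here; the toy strings are invented and `conditional_entropy` is an illustrative name.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a Counter of observed frequencies."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conditional_entropy(text):
    """Estimate H(next char | current char) = H(pairs) - H(first chars of pairs)."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return entropy(pairs) - entropy(firsts)


# Hypothetical toy strings, not real transcriptions.
rigid = "abababababab"        # each character fully determines the next
looser = "abacabadabacabae"   # several different characters can follow 'a'

print(round(conditional_entropy(rigid), 3))
print(round(conditional_entropy(looser), 3))
```

A text whose characters strongly predict their successors scores low on this measure, and merging such predictable sequences into single characters is one way the value can move toward that of ordinary languages.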

I have to agree with Rene that the VMS does suffer from low entropy. What I noticed when I enciphered a paragraph from JKP's site using a number system is that when I hit the picture I could simply lower the character count, yet it would still equal the vord number. Also notice that when I used the Pythagorean number system table, I managed to make it look like a language in just one paragraph, thus having a frequency set.

Using this system it is quite easy to shorten long words and still maintain the look and feel of a language. The system I use is kind of a grille, like Gordon Rugg suggests.

An interesting comparison would be to take the average length of a Voynich vord and compare it with my little paragraph, to see if I'm on to something. Another fact is that the Voynich Manuscript could be a copy of another book, as it was easy for me to just make up glyphs and use the Pythagorean table.

Just looking at my paragraph, it would seem that I have almost an exact match for the 5.5 average.
Let me do some math :)
Here is my calculation for the paragraph.

1 (8-letter word), 1 (7-letter word), 7 (6-letter words), 6 (5-letter words), 8 (4-letter words), 9 (3-letter words), 13 (2-letter words), 7 (1-letter words)

52 total vords

(52/1 + 52/1 + 52/7 + 52/6 + 52/8 + 52/9 + 52/13 + 52/7) / 8 = 5.22 average

As you can see just from a simple sample, I'm almost dead on to the Voynich Manuscript!  Maybe there is something to be said about math and language combined and some sort of universal thinking using the two.
Quote: The average word length for voynich glyph's is 5.5.


[Image: CwUoCi4VIAAKmHH.jpg]
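
For reference, the usual way to get an average word length from a tally like the one in this post is a weighted mean: total letters divided by total words. A minimal Python sketch, assuming the tally as listed is complete; the variable names are illustrative.

```python
# Word-length tally as listed in the post: {word length: number of words}.
tally = {8: 1, 7: 1, 6: 7, 5: 6, 4: 8, 3: 9, 2: 13, 1: 7}

total_words = sum(tally.values())
total_letters = sum(length * count for length, count in tally.items())
mean_length = total_letters / total_words

print(total_words)            # 52
print(total_letters)          # 179
print(round(mean_length, 2))  # 3.44
```

On these counts the weighted mean comes out near 3.44 letters per word rather than 5.22, so the comparison with the quoted 5.5 figure may be worth rechecking.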


RE: Character entropy of Voynichese - Anton - 05-11-2016

Quote:This is a very interesting statement. I'm not certain if it is true but the possibility is intriguing.

The problem is whether a positive result will be achieved :) But anyway, the straightforward check is to make the calculation and see what comes of that.

Quote:I know Rene has spoken in the past about how different inputs have changed the entropy measurements of the Voynich text, though even the 'best' was still outlying to natural languages.

Yes, and I believe there are even comparisons on voynich.nu... I'm not sure that the checks I propose were made, though. And, once again, it matters what we compare with. If we assume that the VMS is not a Hamlet or a Bible, then comparisons with Hamlet or the Bible are not decisive.

In any case, to make it clear, I don't believe that by manipulation of the alphabet we may come to a simple substitution cipher - other circumstances, such as weird "grammar", prohibit that possibility.


RE: Character entropy of Voynichese - Sam G - 06-11-2016

(02-11-2016, 09:14 PM)Anton Wrote: Only one section is dedicated to the VMS, that is, the last section 4.22 of the chapter. It is only nine pages long, with one page dedicated to problems for students and two pages to scans of two folios of the VMS. That is not what I expected from what I read about this book on the Internet: I expected much more volume to be dedicated to the VMS. However, the actual state of things is reasonable: the whole book is dedicated to solving tasks with a computer, so the VMS is just one interesting illustration or application. It is not the subject of any dedicated focus, either in the book as a whole or even in chapter 4.

That said, four pages are dedicated to the brief history of the VMS (with focus on the names of Dee and R. Bacon) and the attempts to analyse it (Newbold, Brumbaugh). The names of Yardley, Friedman and Tiltman are mentioned, as well as articles of Oneil (sic!), Friedman and Tiltman.

So only three pages are left for the discussion of the statistical properties of the VMS, which is much less than I expected.

Any chance you could scan these pages (or at least the three most relevant ones) and upload them somewhere?  I know they're probably still copyrighted but I think you could call it "fair use".