Character entropy of Voynichese - Printable Version
+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Analysis of the text (https://www.voynich.ninja/forum-41.html)
+--- Thread: Character entropy of Voynichese (/thread-148.html)

RE: Character entropy of Voynichese - Anton - 16-03-2016

Actually, what I meant here is character entropy as a general parameter, not related to any specific language, but to natural language in general. There might have been extinct languages, and they may have used fewer letters in the alphabet. Or this might have been an invented script for a language unknown to the scribe. I simply leave the validity of these assumptions out of scope.

The whole entropy discourse with respect to the VMS, to the extent that I am acquainted with it, is a bit fuzzy and, I am afraid, this confuses many researchers who are not acquainted with information theory.

To the consideration that I explained above, I would like to add (it seems that this fact is not well understood) that direct comparison of the character entropy between languages with different numbers of letters in the alphabet makes little sense, because entropy depends on the number of letters in the alphabet. In other words, the character entropy of English and the character entropy of Hawaiian are not directly comparable, so no conclusions can be drawn from comparing the raw values.

I guess what should be compared instead is the degree (e.g. expressed as a percentage) to which the character entropy approaches the maximum possible entropy. The maximum possible entropy is observed when all characters are equally probable; hence, it depends exclusively on the size of the alphabet.

I have no access to Bennett's book (maybe someone could provide a scan of the respective chapter if copyright allows) and don't know what comparisons were made there.

As to the [link], I would say that it is a bit wandering and, besides, I haven't had time to examine it in detail, but he (albeit speaking of the 2nd order entropy) makes a strange statement (emphasis is mine):

Quote: However, the H2(max) depends tremendously on m, the size of the character set chosen.
For Voynich text, Currier has 36 characters and Basic Frogguy has 23 characters. Characters that are hardly ever used have little effect on h1 and h2, but could make a tremendous difference in H2(max). Therefore, this measure was not used.

From this explanation, I fail to understand why this measure was not used, and it also seems that it was not used for first order entropy either.

RE: Character entropy of Voynichese - Davidsch - 18-03-2016

After that comment I decided to re-read what entropy actually means ([link]), and I came to the conclusion that this is not an exact science but more of a relational question: what do you measure in relation to what? Research focused on entropy only has practical added value if you are planning to write a paper. For solving any mystery in the VMS I think this has no practical use, does it?

RE: Character entropy of Voynichese - Anton - 18-03-2016

It is an exact science, and it is a useful measure, but like any measure it has to be used properly in comparisons. I think I will write a forum post explaining the notions of the various entropies that have been used in Voynich studies when I have time (probably next week).

RE: Character entropy of Voynichese - ReneZ - 18-03-2016

It has at least one very practical use: in case a text is encrypted using a simple substitution cipher, the entropy values (all of them) do not change in the process. As a side note, entropy values are also not changed by writing backwards.

Given that (some of) the entropy values in the Voynich MS text are anomalously low, either the source text has an anomalously low entropy, or the process to convert it to 'Voynichese' reduced it significantly. This puts considerable constraints on all proposed solutions.

It is correct that the values are in a way relative, and should be interpreted as such.
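
The two points made so far (entropy judged relative to its maximum, and entropy being unchanged by simple substitution or by writing backwards) can be checked numerically. What follows is a minimal Python sketch and not a method used by anyone in this thread: the function names, the toy sample string and the random substitution key are all invented for illustration, and such a short toy string says nothing about real Voynich transcriptions.

```python
import math
import random
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def h1(text):
    """First-order (single character) entropy."""
    return entropy(Counter(text))


def h_pairs(text):
    """Entropy of adjacent character pairs (overlapping bigrams)."""
    return entropy(Counter(zip(text, text[1:])))


def relative_h1(text):
    """h1 as a fraction of its maximum, log2(m), for the m characters used."""
    m = len(set(text))
    if m < 2:
        return 0.0  # a one-symbol alphabet has maximum entropy 0
    return h1(text) / math.log2(m)


# Hypothetical toy sample, NOT a Voynich transcription.
sample = "qokeedy qokedy shedy qokain chedy daiin okaiin shey"

# Simple substitution: a random one-to-one mapping of the characters used.
rng = random.Random(0)
alphabet = sorted(set(sample))
shuffled = alphabet[:]
rng.shuffle(shuffled)
key = dict(zip(alphabet, shuffled))
enciphered = "".join(key[ch] for ch in sample)
backwards = sample[::-1]

for label, text in [("plain", sample),
                    ("substituted", enciphered),
                    ("reversed", backwards)]:
    print(f"{label:12s} h1={h1(text):.4f} "
          f"pairs={h_pairs(text):.4f} h1/max={relative_h1(text):.4f}")
```

All three rows print the same values, because a one-to-one substitution and a reversal only relabel or mirror the character and pair counts. The pair value here is the joint entropy of overlapping bigrams; a conditional second-order entropy is commonly obtained as this value minus h1, so it is unchanged as well. The `h1/max` column is one possible reading of the "percentage of maximum" idea, taking m to be the number of distinct characters that actually occur.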
The single character entropy is somewhat low, but the character pair entropy is the most significant anomaly, especially in comparison to this single character entropy. This is not simply a problem of Eva. Eva was introduced two decades after Bennett pointed out the low entropy.

RE: Character entropy of Voynichese - Davidsch - 18-03-2016

Okidoki, I have developed a similar method; the problem is that it was developed for visual comparisons. If I want to compare changes in the text, for example writing it backwards, I have to make a visual chart and compare it. Although I can make such a chart and view it within a few minutes, it is still not convenient when you want to look at many possible text modifications. In a scenario where you want to automate possible changes (for example comparing Bacon cipher possibilities, see my other thread) and see whether any of those changes result in anything worthwhile, this method could be invaluable.

But didn't anybody perform such a task before? It seems to me that the NSA would try this immediately on the VMS text.

RE: Character entropy of Voynichese - Sam G - 19-03-2016

Basically the low entropy rules out the possibility that the VMS is a cipher, because very few kinds of ciphers can lower entropy, and these can all be excluded for other reasons.

Here's an interesting bit from the list archives. Jacques Guy says: [link]

Quote: There aren't many weapons in the arsenal.
If the VMs is a cipher, then

Jim Gillogly's response:

Quote: Besides these cases, some ambiguous ciphers such as the Keyphrase

In the case of the VMS, the keyphrase thing would've been cracked already, as it is basically a form of simple substitution cipher (and probably wouldn't lower the entropy enough anyway), and verbose ciphering (or the addition of verbose nulls) can be ruled out because the words in the VMS simply aren't long enough, and there aren't enough repeated 2+ word phrases to consider single plaintext words enciphered into multiple ciphertext words. The presence of single-word labels also poses a major problem for any scenario that involves treating the VMS words as anything other than words, or at least stand-alone "units" capable of bearing meaning (as opposed to merely parts of verbosely enciphered words).

Really, there is no way of escaping the conclusion that the low entropy is simply an intrinsic part of the content, not the result of some "cipher mechanism". The most likely explanation for that is that the VMS is written in a language which has low entropy, i.e. that has a relatively rigid phonotactic structure. From that point of view, there is nothing "anomalously low" about the entropy at all, since there exist languages with lower entropy, i.e. with a more rigid phonotactic structure than we find in the VMS text.

RE: Character entropy of Voynichese - Anton - 19-03-2016

Even leaving aside the fact that the "anomalously low entropy" of Voynichese is something at least not consistently proven, from the viewpoint of formal logic the entropy's value, be it high or low, says nothing about whether the text is enciphered or not, simply because a message in a language which has low entropy may itself have been enciphered.

RE: Character entropy of Voynichese - Torsten - 19-03-2016

(19-03-2016, 05:44 PM) Sam G Wrote: Really, there is no way of escaping the conclusion that the low entropy is simply an intrinsic part of the content, not the result of some "cipher mechanism". The most likely explanation for that is that the VMS is written in a language which has low entropy, i.e. that has a relatively rigid phonotactic structure. From that point of view, there is nothing "anomalously low" about the entropy at all, since there exist languages with lower entropy, i.e. with a more rigid phonotactic structure than we find in the VMS text.

The weak word order can only mean that the words are not ordered by a grammar of the kind we know from natural languages. Therefore you would have to assume a language without grammar.

RE: Character entropy of Voynichese - Sam G - 19-03-2016

(19-03-2016, 06:19 PM) Anton Wrote: Even leaving aside the fact that the "anomalously low entropy" of Voynichese is something at least not consistently proven, from the viewpoint of formal logic the entropy's value, be it high or low, says nothing about whether the text is enciphered or not.

Well, it can't be a ciphertext of any language which has a higher entropy than the VMS text (which rules out all European languages and many others), and it places very strong constraints on ciphers that could have been used on low entropy languages.

(19-03-2016, 06:37 PM) Torsten Wrote: (19-03-2016, 05:44 PM) Sam G Wrote: Really, there is no way of escaping the conclusion that the low entropy is simply an intrinsic part of the content, not the result of some "cipher mechanism". The most likely explanation for that is that the VMS is written in a language which has low entropy, i.e. that has a relatively rigid phonotactic structure.
From that point of view, there is nothing "anomalously low" about the entropy at all, since there exist languages with lower entropy, i.e. with a more rigid phonotactic structure than we find in the VMS text.

I've responded to this point of yours before: [link]

Quote: I suspect that there are rules we are not aware of governing word order in the VMS, so that (for example) in some cases we might see a pair of words A B, but in a different context we will see the same pair as B A. German, for instance, has some rules like this that we don't have in English, but it would be incorrect to say that German has "weak word order". So the "weak word order" in the VMS might also be only apparent. The problem is that we don't know what the rules are.

RE: Character entropy of Voynichese - Torsten - 19-03-2016

Quote: I've responded to this point of yours before:

I've responded to your point that there is some word order in the VMS (see [link]):

Quote: "There are only 35 word sequences which use at least three words and appear at least three times. Only for five of these sequences is the word order unchanged for the whole manuscript, whereas for 30 out of 35 phrases the word order does change." (see [link] as Timm 2014: p. 3)
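
A count of this kind can be made concrete along the following lines. This is only an illustrative Python sketch under stated assumptions and not Timm's procedure, which is not described in this thread: it looks only at sequences of exactly three words, and it reads "the word order does change" as "some other ordering of the same three words also occurs somewhere in the text". The function name `repeated_sequences` and the toy word list are invented for the example.

```python
from collections import Counter
from itertools import permutations


def repeated_sequences(words, n=3, min_count=3):
    """Map each n-word sequence seen at least min_count times to the
    other orderings of the same words that also occur in the text."""
    ngrams = Counter(tuple(words[i:i + n])
                     for i in range(len(words) - n + 1))
    result = {}
    for seq, count in ngrams.items():
        if count < min_count:
            continue
        # Orderings of the same words, other than seq itself, that occur.
        others = {p for p in permutations(seq) if p != seq and p in ngrams}
        result[seq] = others
    return result


# Hypothetical toy text, NOT taken from the manuscript.
toy = ("a b c x a b c y a b c z b a c "
       "d e f x d e f y d e f").split()

report = repeated_sequences(toy)
for seq, others in report.items():
    status = "order varies" if others else "order fixed"
    print(" ".join(seq), "->", status)
```

In the toy text, "a b c" occurs three times and also appears once reordered as "b a c", while "d e f" occurs three times with no reordering. On a real transcription the result would depend on the transcription alphabet, on how words are split, and on how line and paragraph breaks are treated, none of which this sketch addresses.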