Anton > 16-03-2016, 02:37 PM
Quote:However, the H2(max) depends tremendously on m, the size of the character set chosen. For Voynich text, Currier has 36 characters and Basic Frogguy has 23 characters. Characters that are hardly ever used have little effect on h1 and h2, but could make a tremendous difference in H2(max). Therefore, this measure was not used.
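The point in the quote above, that rarely used characters barely move the measured entropy but do raise the theoretical maximum, can be illustrated with a small sketch. The function names below are hypothetical, and reading H2(max) as the maximum entropy of a character pair, 2*log2(m), is an assumption rather than something the quoted source defines.

```python
import math
from collections import Counter

def first_order_entropy(text):
    """Single-character entropy (h1) in bits per character."""
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def max_single_entropy(text):
    """Upper bound log2(m), where m is the number of distinct characters."""
    return math.log2(len(set(text)))

def max_pair_entropy(text):
    """Assumed meaning of H2(max): the upper bound for a character pair."""
    return 2 * max_single_entropy(text)

# A toy two-character "text", and the same text with one rare extra character.
base = "ab" * 500
with_rare = base + "z"

for label, sample in (("base", base), ("with rare char", with_rare)):
    print(label,
          round(first_order_entropy(sample), 4),
          round(max_single_entropy(sample), 4),
          round(max_pair_entropy(sample), 4))
```

In this toy case a single extra character changes h1 by about 0.01 bit, while the maximum jumps from log2(2) to log2(3), which is why a ratio against the maximum is so sensitive to the transcription alphabet.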
Davidsch > 18-03-2016, 11:30 AM
Anton > 18-03-2016, 12:15 PM
ReneZ > 18-03-2016, 12:19 PM
Davidsch > 18-03-2016, 04:52 PM
Sam G > 19-03-2016, 05:44 PM
Quote:There aren't many weapons in the arsenal. If the VMs is a cipher, then
it is a cipher which _lowers_ the entropy of the text. The only cipher
that can do that (Jim Gillogly, please correct me) is a cipher that
encodes single characters of the plain texts as sequences of characters,
or whole words and sentences. E.g. cat -> cloakarmtower (c > cloak, etc.)
Even so, the cipher has to be a bijection: c becomes cloak only and not other
words. There is another possibility: lots of nulls. It is much like
cat -> cloakarmtower, except that you are allowed any number of words
to encipher each letter _provided that_ they are built on strict, narrow
patterns. I'll take French "javanais" to illustrate this. The rule is:
insert "av" before the first vowel of every syllable. Thus: "bonjour"
-> "bavonjavour". Modify the rule to: insert "av", or "ov", or "ugl"
and you get a cipher text with a lower entropy than the plain text.
In that case, the VMs is very short, and the labels must be ignored.
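The "javanais" rule from the quote can be sketched as follows. This is a rough approximation only: it treats every run of consecutive vowels as the vowel of one syllable, which is an assumption that happens to work for "bonjour" but is not a real French syllabifier. The function name and the `infix` parameter are hypothetical.

```python
import re

# Assumption: a run of consecutive vowels stands in for "the first vowel
# of a syllable"; real syllabification is more involved.
VOWEL_RUN = re.compile(r"[aeiouy]+")

def javanais(word, infix="av"):
    """Insert `infix` before each vowel run of a lowercase word."""
    return VOWEL_RUN.sub(lambda m: infix + m.group(0), word)

print(javanais("bonjour"))  # bavonjavour
```

Passing a different `infix` ("ov", "ugl") gives the variant rule mentioned in the quote; the inserted material is highly predictable, which is what pulls the entropy per character down.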
Quote:Besides these cases, some ambiguous ciphers such as the Keyphrase
can lower the entropy. As an English example, you can have:
plaintext: abcdefghijklmnopqrstuvwxyz
ciphertext: THEPRESIDENTSPEAKSNONSENSE
In this case "my hovercraft is full of eels"
becomes "SS IESRSESTEO DN ENTT EE RRTN"
Note the excess of S's. The Keyphrase will frequently produce
runs of three or four of the same letter, and occasionally
adjacent ciphertext words which represent different plaintext
words. However, the Keyphrase and other entropy-reducing
ciphers I know (except those that use lots of nulls, as Jacques
points out) also reduce the size of the alphabet.
Unless one is unlucky, the recipient will be able to get most
of the words right given the mapping, as will the cryptanalyst
given enough material.
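The Keyphrase example in the quote can be reproduced with a short sketch, using the plaintext alphabet and key phrase given there. The function names are hypothetical, and the entropy comparison is only a single-character count on one short sentence, so it illustrates the direction of the effect rather than measuring anything about the VMS.

```python
import math
from collections import Counter

PLAIN = "abcdefghijklmnopqrstuvwxyz"
KEYPHRASE = "THEPRESIDENTSPEAKSNONSENSE"  # from the quoted example
TABLE = str.maketrans(PLAIN, KEYPHRASE)

def keyphrase_encipher(text):
    """Map each plaintext letter to the key-phrase letter in the same position."""
    return text.lower().translate(TABLE)

def letter_entropy(text):
    """Single-character entropy in bits, counting letters only."""
    letters = [ch for ch in text if ch.isalpha()]
    counts = Counter(letters)
    n = len(letters)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

plaintext = "my hovercraft is full of eels"
ciphertext = keyphrase_encipher(plaintext)

print(ciphertext)                       # SS IESRSESTEO DN ENTT EE RRTN
print(len(set(KEYPHRASE)))              # 12 distinct ciphertext letters
print(letter_entropy(plaintext) > letter_entropy(ciphertext))  # True
```

Because several plaintext letters share one ciphertext letter, the mapping is many-to-one: the ciphertext alphabet shrinks from 26 to 12 letters and the letter entropy can only go down, at the cost of ambiguity for the recipient.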
Anton > 19-03-2016, 06:19 PM
Torsten > 19-03-2016, 06:37 PM
(19-03-2016, 05:44 PM)Sam G Wrote: Really, there is no way of escaping the conclusion that the low entropy is simply an intrinsic part of the content, not the result of some "cipher mechanism". The most likely explanation for that is that the VMS is written in a language which has low entropy, i.e. that has a relatively rigid phonotactic structure. From that point of view, there is nothing "anomalously low" about the entropy at all, since there exist languages with lower entropy, i.e. with a more rigid phonotactic structure than we find in the VMS text.
Sam G > 19-03-2016, 07:05 PM
(19-03-2016, 06:19 PM)Anton Wrote: Even leaving aside the fact that the "anomalously low entropy" of Voynichese is something at least not consistently proven, from the viewpoint of formal logic the entropy's value, be it high or low, says nothing about whether the text is enciphered or not.
This is simply because a message written in a language which has low entropy may itself have been enciphered.
(19-03-2016, 06:37 PM)Torsten Wrote: (19-03-2016, 05:44 PM)Sam G Wrote: Really, there is no way of escaping the conclusion that the low entropy is simply an intrinsic part of the content, not the result of some "cipher mechanism". The most likely explanation for that is that the VMS is written in a language which has low entropy, i.e. that has a relatively rigid phonotactic structure. From that point of view, there is nothing "anomalously low" about the entropy at all, since there exist languages with lower entropy, i.e. with a more rigid phonotactic structure than we find in the VMS text.
The weak word order can only mean that the words are not ordered by a grammar as we know it from natural languages. Therefore you would have to assume a language without grammar.
Quote:I suspect that there are rules we are not aware of governing word order in the VMS, so that (for example) in some cases we might see a pair of words A B, but in a different context we will see the same pair as B A. German, for instance, has some rules like this that we don't have in English, but it would be incorrect to say that German has "weak word order". So the "weak word order" in the VMS might also be only apparent. The problem is that we don't know what the rules are.
I think the "writing style" of the VMS also probably plays a role here. Many people tacitly assume that a prose style similar to that which we use to write English must be employed in the VMS, but this isn't the only possibility. The writing style is probably "weird", too. This affects a number of considerations about word order, repetitiveness, lack of repeated long phrases, etc.
Torsten > 19-03-2016, 07:32 PM