(24-10-2023, 01:22 PM)MarcoP Wrote: The 7 characters example does not show how the whole cipher is supposed to work. Also, T is encoded as yqo1, E is encoded as dqo1. But while y.qo appears consecutively in the cipher text, E is rendered as dy.qo, with the additional symbol y inserted in the sequence. I must be misunderstanding something.
For this particular scheme, there are two groups of characters: (o, q, y) and (k). Starting from the leftmost character in the message, you proceed as follows: find the next character of the same group, assess the distance between the two characters (1, 2 or 3, corresponding to roughly 2, 4 or 6 characters apart), and use the distance and the pair to identify the plaintext character. Discard both characters and proceed. Since the minimum distance for a pair is about 2 characters wide, immediately adjacent characters never form a pair: two characters with no other character or space between them never pair up, and there is no such thing as distance 0 in the cipher. dy.qo is parsed as dqo1 because y comes immediately after d and so can't belong to the same pair. I think I wrote something about this in the original post.
Quote:Also, I don't understand how the sequence A S T should be encoded. I am leaving out nulls and representing 'qo' as 'q' (I understand that qo is treated as a single symbol?).
Many possible ways.
1) Obvious, not very efficient: k...kk.....ky.qo, or k...k.........k.....k.....y.qo or any other way where sequences are just spaced out.
2) A bit more efficient, k...kky.qok
3) Using the property of adjacent characters never pairing up: kk.ykqok
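A minimal Python sketch of the pairing procedure described above, under several assumptions that are not from the posts: 'qo' is tokenized as one symbol occupying two positions, '.' and 'e' are treated as nulls, start-to-start offsets of exactly 2, 4 and 6 positions map to distances 1, 2 and 3, and the letter table (kk2 = A, kk3 = S, yqo1 = T) is only inferred from the spaced-out example. All names here are hypothetical. The approximate ("~") spacing of handwriting is not modeled, so compact variants whose offsets fall between those values would need a more tolerant table.

```python
# Assumptions (not from the original posts): symbol groups, null set,
# exact offset table, and the tiny letter table inferred from the A S T example.
GROUPS = [{"k"}, {"qo", "y"}]
OFFSET_TO_DISTANCE = {2: 1, 4: 2, 6: 3}
PAIR_TO_LETTER = {("k", "k", 2): "A", ("k", "k", 3): "S", ("y", "qo", 1): "T"}


def tokenize(text):
    """Split text into (symbol, start, width) tuples; 'qo' is one symbol."""
    tokens, i = [], 0
    while i < len(text):
        if text.startswith("qo", i):
            tokens.append(("qo", i, 2))
            i += 2
        else:
            tokens.append((text[i], i, 1))
            i += 1
    return tokens


def group_of(symbol):
    """Return the index of the symbol's group, or None for nulls/unknowns."""
    for index, group in enumerate(GROUPS):
        if symbol in group:
            return index
    return None


def find_pairs(text):
    """Pair each unused symbol with the next non-adjacent one of its group."""
    tokens = tokenize(text)
    used = [False] * len(tokens)
    pairs = []
    for i, (sym, start, width) in enumerate(tokens):
        group = group_of(sym)
        if used[i] or group is None:
            continue
        for j in range(i + 1, len(tokens)):
            sym2, start2, _ = tokens[j]
            if used[j] or group_of(sym2) != group:
                continue
            if start2 == start + width:
                continue  # immediately adjacent symbols never pair up
            distance = OFFSET_TO_DISTANCE.get(start2 - start)
            if distance is not None:
                pairs.append((sym, sym2, distance))
                used[i] = used[j] = True  # discard both characters
            break  # an offset outside the table leaves the opener unpaired
    return pairs


def decode(text):
    """Map each recognized pair to a letter; unknown pairs become '?'."""
    return "".join(PAIR_TO_LETTER.get(pair, "?") for pair in find_pairs(text))


print(decode("k...kk.....ky.qo"))  # spaced-out encoding 1)
print(decode("kk.ykqok"))          # interleaved encoding 3)
```

Under these assumptions both encoding 1) and encoding 3) yield the same three pairs in the same order, because pairs are emitted in the order of their opening characters.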
Note that, for comfortable reading of a handwritten script, some stylistic variations (round/angled/slanted body, longer/shorter extending elements) can be used to let the reader skip terminating characters without keeping track of them, or to provide visual cues for matching up the characters via specific angles, lengths or character elements. E.g., using italics to mark the terminating characters (shown here between underscores). Repeating the encodings above:
1) Obvious, not very efficient: k..._k_k....._k_y._qo_, or k..._k_.........k....._k_.....y._qo_ or any other way where sequences are just spaced out.
2) A bit more efficient, k..._k_ky._qok_
3) Using the property of adjacent characters never pairing up: kk.y_kqok_
Or using extra cues (a cedilla here) to mark matching pairs of characters:
1) Obvious, not very efficient: ķ...ķk.....ky.qo, or ķ...ķ.........k.....k.....y.qo or any other way where sequences are just spaced out.
2) A bit more efficient, ķ...ķky.qok
3) Using the property of adjacent characters never pairing up: ķk.yķqok
Note that these adjustments are not required to properly read the cipher; they just help to read it much faster and write it with fewer errors.
We managed to encode 3 letters using only two different pairings (we could replace 'qo' with 'y' in all examples, getting kk.yk.yk for the last encoding) and 8 loci. Generally, this cipher roughly preserves the length of the plaintext in the ciphertext when using the same number of pair marks as the plaintext alphabet size.
Quote:In general, this system seems to be quite complex to encode and decode. Anyway, I don't think that distance encoding decreases entropy or makes repeating words more likely. Of course, nulls (if added with fixed criteria, rather than randomly as effective cryptography requires) and even more so verbose elements (e.g. "qo" as a single symbol) do lower entropy. A cipher that makes large use of distance 1 sequences basically is a verbose cipher (e.g. encoding S as y.qo) and typically would reduce entropy, but I don't see the added value of the complexity of a proper distance cipher like this.
I cannot comment on its complexity, but I expect that reading off the page at a speed of 2-3 characters per second should be possible after some training. Since the maximum pair distance is about 6 characters wide, it is short enough not to require saccades while reading, so a pair can basically be perceived as a single entity.
This encoding does decrease observed character-based entropy, especially if distances (space counts) are not preserved in the transliterations. Compare kk.yk.yk and AST. It is verbose in the sense that at least 2 characters are required to encode one source entity. It is not verbose from a purely information-theoretical standpoint, since its density of encoding is comparable to the plaintext. The 9 codes x 3 distances version that I show in my article can encode one letter of a 27-character source alphabet per code pair. With a script like Voynichese, where characters are easily split into basic elements that support a large proportion of all possible combinations, it could be possible to literally stack pairs on top of each other and produce approximately 1 character of ciphertext per character of Latin plaintext on average.
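To make the kk.yk.yk versus AST comparison concrete, here is a small Python sketch that computes first-order (single-character) Shannon entropy for the two strings, and for the ciphertext with the spacing dots dropped, as a transliteration that ignores distances would do. The function name is hypothetical, and strings this short are far too small to say anything statistical; the point is only the direction of the effect.

```python
from collections import Counter
from math import log2


def char_entropy(text):
    """First-order Shannon entropy of a string, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * log2(n / total) for n in counts.values())


plain = "AST"
cipher = "kk.yk.yk"
cipher_no_spacing = cipher.replace(".", "")  # distances not transliterated

for label, text in [
    ("plaintext", plain),
    ("ciphertext", cipher),
    ("ciphertext without spacing", cipher_no_spacing),
]:
    print(f"{label:28s} {char_entropy(text):.3f} bits/char")
```

For these toy strings the values come out at about 1.585, 1.500 and 0.918 bits per character respectively, so dropping the spacing lowers the observed entropy the most.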
(24-10-2023, 03:38 PM)MarcoP Wrote: Thank you, interesting idea. I thought that nulls (e-sequences in this case) carried no information, but I am probably wrong.
If you mean the picture I posted for the qokeedy.qokeedy interpretation, then nulls are nulls. They don't carry any information.
(24-10-2023, 04:53 PM)nablator Wrote: The longest chain of qoke+dy, f75r.38:
Code:
qokeedy.qokeedy.qokedy.qokedy.qokeedy
+----+ qod3
  +-------+ kk3
      +--+ yqo1
             +---+ dqo1
              +-----+ yd1
                  +------+ kk2
                     +--+ yqo1
                           +---+ dqo1
                            +------+ yd2
I didn't mean for my example to literally apply to the Voynich manuscript, but this one looks nice.
