Could Voynich symbols encode "directions" on some alphabet table?
entropyOrInformation > 06-03-2025, 02:19 AM
Hello! This is my first time here. I'm very interested in probability, statistics, and information theory, and the Voynich manuscript intrigues me because of the strange statistical properties of its text. Since the text has so far not matched any known cipher, I started thinking about encoding methods that someone at the time might have come up with and that could produce strange statistics. My idea is this: what if the authors used a chart of characters in their native language and, from some starting point, wrote "directions" to the desired character using Voynichese symbols? I believe this could explain some of the statistics, including the high frequency of "daiin", "daiiin", and other extremely common repeated words and word parts, as well as the differences in statistics between scribes.
To illustrate my point, I'll use a simple example. DISCLAIMER: I am not claiming this is the exact method of encoding used in the Voynich manuscript, only that a similar type of encoding could have been employed. Let's construct a 5x5 table of English letters, leaving out "Z" for now so that the remaining 25 letters fit neatly.
A B C D E
F G H I J
K L M N O
P Q R S T
U V W X Y
Also for simplicity, let's use English letters for our final encoding. Each one will encode a "nearest neighbor" direction, i.e. a single step to one of the eight cells surrounding the current one. Let's use:
A = up
B = up and right
C = right
D = down and right
E = down
F = down and left
G = left
H = up and left
We now encode as follows. Each letter of the message becomes a word in the encoded text. The letters of an encoded word give "directions" to the message letter, always starting from the middle of the chart, i.e. "M". Note that the encoding is not unique; this is crucial.
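To make the decoding rule concrete, here is a minimal Python sketch of it. The names GRID, STEPS, START, decode_word, and decode_message are illustrative choices, not part of the proposal, and the sketch assumes rows are counted from the top, so "up" means a smaller row index.

```python
# Toy 5x5 chart from the example above (no "Z").
GRID = ["ABCDE", "FGHIJ", "KLMNO", "PQRST", "UVWXY"]

# Direction letters as (row change, column change); row 0 is the top row.
STEPS = {
    "A": (-1, 0), "B": (-1, 1), "C": (0, 1), "D": (1, 1),
    "E": (1, 0), "F": (1, -1), "G": (0, -1), "H": (-1, -1),
}

START = (2, 2)  # the cell holding "M"


def decode_word(word):
    """Follow the direction letters from "M" and return the letter reached."""
    row, col = START
    for symbol in word:
        d_row, d_col = STEPS[symbol]
        row, col = row + d_row, col + d_col
        # Assumption: a path that leaves the chart is treated as an error.
        if not (0 <= row < 5 and 0 <= col < 5):
            raise ValueError(f"{word!r} walks off the chart")
    return GRID[row][col]


def decode_message(text):
    """Decode space-separated direction words into plain letters."""
    return "".join(decode_word(word) for word in text.split())


print(decode_word("A"))         # H
print(decode_word("BB"))        # E
print(decode_message("C CC"))   # NO
```

Decoding only needs the chart, the starting cell, and the meaning of the eight symbols, which is why it stays easy even though many different words lead to the same letter.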
Let's encode the phrase "Hello this is encoded text". For now, let's use the rule that we write a shortest path from "M" to our desired character. One example of this encoding is:
A BB G G CC CD A B D B D BB C AA CC AB BB AB CD BB DE CD
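With eight possible moves, a shortest path to a cell takes max(|row change|, |column change|) steps. The sketch below is one way to generate such a path automatically, taking diagonal steps first and straight steps afterwards. That ordering, and the names shortest_word and encode, are assumptions made for illustration. Because shortest paths are not unique, a few words differ from the hand-written line above (for example "DC" instead of "CD" for T) while still leading to the same letters.

```python
GRID = ["ABCDE", "FGHIJ", "KLMNO", "PQRST", "UVWXY"]
STEPS = {
    "A": (-1, 0), "B": (-1, 1), "C": (0, 1), "D": (1, 1),
    "E": (1, 0), "F": (1, -1), "G": (0, -1), "H": (-1, -1),
}
SYMBOL_FOR = {step: symbol for symbol, step in STEPS.items()}
POSITION = {ch: (r, c) for r, line in enumerate(GRID) for c, ch in enumerate(line)}
START = POSITION["M"]


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def shortest_word(letter):
    """Return one shortest direction word from "M" to the given letter.

    Raises KeyError for letters missing from the chart (such as "Z").
    "M" itself yields an empty word under this rule.
    """
    row, col = POSITION[letter]
    d_row, d_col = row - START[0], col - START[1]
    word = ""
    while d_row or d_col:
        # Move diagonally while both offsets are non-zero, then straight.
        step = (sign(d_row), sign(d_col))
        word += SYMBOL_FOR[step]
        d_row -= step[0]
        d_col -= step[1]
    return word


def encode(message):
    """Encode every alphabetic character as a shortest-path word."""
    return " ".join(shortest_word(ch) for ch in message.upper() if ch.isalpha())


print(encode("Hello this is encoded text"))
# A BB G G CC DC A B D B D BB C AA CC BA BB BA DC BB DE DC
```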
However, this is far from the only way to encode this message. Let's say instead that we desire to create words we can pronounce in our encoded message. We could instead write:
A ACAC EGA AGE ECAC CEC A CA EC CA EC ACAC EB ACH ECAC AB ACAC AB CEC ACAC DE CEC
This encodes the same message but looks very different. Another feature of this type of encoding is that "loops" can be added to an encoded word without changing what it decodes to. For example, the encoded letters "ACEG" (up, right, down, left) form a loop in the diagram, so they can be inserted at any point in any encoded word, provided the path stays on the grid. Perhaps "daiin" and "daiiin" are loops of some sort? Furthermore, a word can be arbitrarily long in the encoded message, since we can keep a path going as long as we like, so this method could produce fairly arbitrary word-length statistics. Additionally, different scribes could prefer different "paths" or ways of encoding, leading to different statistics and different encoded word choices. Despite the many ways of encoding, decoding would be very simple given access to the letter grid and knowledge of what each encoded symbol means.
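The loop idea can be checked with a short Python sketch. The helper names walk, is_loop, and insert_loop are hypothetical, and the sketch assumes that a path leaving the chart is simply invalid (it returns None). Under that assumption, a loop keeps the decoded letter unchanged only where the detour stays on the chart, which matters for letters on the edge such as T.

```python
GRID = ["ABCDE", "FGHIJ", "KLMNO", "PQRST", "UVWXY"]
STEPS = {
    "A": (-1, 0), "B": (-1, 1), "C": (0, 1), "D": (1, 1),
    "E": (1, 0), "F": (1, -1), "G": (0, -1), "H": (-1, -1),
}
START = (2, 2)  # the cell holding "M"


def walk(word):
    """Return the letter reached from "M", or None if the path leaves the chart."""
    row, col = START
    for symbol in word:
        d_row, d_col = STEPS[symbol]
        row, col = row + d_row, col + d_col
        if not (0 <= row < 5 and 0 <= col < 5):
            return None
    return GRID[row][col]


def is_loop(word):
    """A loop is any word whose steps add up to no net movement."""
    total_row = sum(STEPS[s][0] for s in word)
    total_col = sum(STEPS[s][1] for s in word)
    return (total_row, total_col) == (0, 0)


def insert_loop(word, loop, position):
    """Insert a loop into a word before the symbol at the given position."""
    return word[:position] + loop + word[position:]


print(is_loop("ACEG"))  # True

for position in range(3):
    padded = insert_loop("CD", "ACEG", position)
    print(padded, walk(padded))
# ACEGCD T
# CACEGD T
# CDACEG None   (from "T" the loop steps off the right edge)

print(walk("CD" + "GECA"))  # T   (the reversed loop stays on the chart)
```

A larger chart, or a start cell far from every edge, would leave more room for loops anywhere in a word.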
If Voynichese is an encoding similar to this, I would assume the characters themselves stand for more complex paths than the single steps presented here, as there are many more than just eight Voynichese characters. Perhaps the encoding grid also includes more than one instance of each letter, giving a larger grid and thus more possible paths.
Thanks for reading! Let me know what you think!