09-02-2026, 03:17 PM
I thought this might be genuinely worth debating. I decided to post it here instead of in the talk area, even though I'm mainly talking about our working assumptions when discussing these issues rather than proposing different, well-thought-out solutions. Either way...
It seems that when people talk about a possible substitution, the counter-arguments tend to be:
-The entropy of the VMS does not match that of known natural languages, so a substitution cipher alone is not the solution
-If it was a simple substitution cipher, it would have been decoded long ago
I think it's worth checking these assertions from first principles and identifying which assumptions they rest on.
Entropy
Did our choice of transliteration alphabet influence the calculated entropy?
In order to calculate the entropy of the VMS we must use some transliteration of the text. The entropy is then calculated using those transliterations.
Is it possible that every alphabet so far has made influential choices which are inappropriate?
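To make the question concrete, here is a minimal sketch of how character entropy is typically computed from a transliteration. The function names and the sample string are my own placeholders, not taken from any existing tool, and the sample is a toy string rather than a real transcription. Note that the result depends entirely on which symbols the transliteration treats as distinct.

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a frequency table."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def char_entropies(text):
    """Return (h1, h2): single-character entropy and the conditional
    entropy of a character given the character before it."""
    h1 = entropy(Counter(text))
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])  # first member of every pair
    # H(next | previous) = H(previous, next) - H(previous)
    h2 = entropy(pairs) - entropy(firsts)
    return h1, h2

# Toy EVA-like string, for illustration only (hypothetical sample).
sample = "qokeedy qokain chedy daiin shedy qokaiin"
h1, h2 = char_entropies(sample)
print(f"h1 = {h1:.3f} bits, h2 = {h2:.3f} bits")
```

Running the same function over two different transliterations of the same page would show directly how much of the "unusual" figure comes from the alphabet rather than from the text.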
EVA's repeated strokes e/c/i/n
Let's take this set of characters: eees aiin
[attachment=14018]
Let's assume for a moment that this IS plaintext language in a Latin script, made of the Latin letters "a, e, c, i, m, n, u". If the scribe uses a flourish at the end of a word or for contractions (very common in the manuscripts I've seen), it's possible that the sample transcribes to:
cccc aiii / ecec aiii / eccc ain / eeec am / ccee aui / eeee aiu / ecce ani
as well as many, many other configurations. We assume that it only ever says "eees aiin".
When we calculate the entropy, we see multiple i's and e's in the transliteration. After a set of i's, there is almost always an "n". We look at the text and assume that each case of aiiin is the same. Wouldn't this lead to unnatural or unusual entropy results?
If we were to do the calculations again, but for every "aiiin" we instead substituted an equally plausible reading (aiim, anin, amii, aum), how would that affect the results?
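That experiment can be sketched roughly as follows. Everything here is an assumption for illustration: the helper `vary_reading` is hypothetical, the sample is a toy string, and picking readings uniformly at random is certainly not how a real scribe would behave. It only shows which direction the numbers move when a fixed transliteration is replaced by several possible readings.

```python
import random
import re
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a frequency table."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def conditional_entropy(text):
    """Entropy of a character given the character before it."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return entropy(pairs) - entropy(firsts)

def vary_reading(text, target, readings, seed=0):
    """Replace each occurrence of `target` with a randomly chosen reading.
    Uniform random choice is an assumption, not a claim about the scribe."""
    rng = random.Random(seed)
    return re.sub(re.escape(target), lambda m: rng.choice(readings), text)

# Toy sample, repeated to get enough occurrences (hypothetical data).
sample = "daiiin okaiiin chedy qokaiiin shedy daiiin " * 50
readings = ["aiiin", "aiim", "anin", "amii", "aum"]
varied = vary_reading(sample, "aiiin", readings)

print("fixed reading :", round(conditional_entropy(sample), 3))
print("varied reading:", round(conditional_entropy(varied), 3))
```

A fairer version would choose readings from a dictionary of the assumed plaintext language instead of at random, but that requires committing to a language first.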
Assumption that different symbols are different letters
I implied this above, but it's important too. When transliterating the VMS, we assume that the symbols are separate. What if, depending on the position within a word or line or some other rule, the same letter is written two different ways? We look at q often here, noting that it only appears at the start of words.
What if t is equivalent to ql? Or k is equivalent to m? Or perhaps s is simply an e (ē denoting a contraction)?
I'm not arguing that this is the case, at least not now, but if it were the case, how would the entropy be affected?
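One way to test this is to merge the supposedly equivalent glyphs before computing entropy and compare the figures. The sketch below uses the three equivalences from the questions above purely as placeholders; the `merge_glyphs` helper and the sample string are hypothetical. For single-glyph merges the single-character entropy cannot go up, since merging symbols only discards distinctions, while the conditional entropy may move in either direction.

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a frequency table."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def char_entropies(text):
    """Return (h1, h2) for a transliterated string."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return entropy(Counter(text)), entropy(pairs) - entropy(firsts)

def merge_glyphs(text, equivalences):
    """Rewrite each source sequence as its assumed equivalent.
    Longer sources go first so 'ql' is handled before any single glyph.
    Replacements are applied one after another, so chained mappings
    (a -> b, b -> c) would compound; the example avoids them."""
    for src in sorted(equivalences, key=len, reverse=True):
        text = text.replace(src, equivalences[src])
    return text

# Placeholder equivalences taken from the questions above, not a proposal.
equivalences = {"ql": "t", "k": "m", "s": "e"}

# Toy sample string (hypothetical data).
sample = "qlokedy tedy okal omal sol eol qlol tol " * 20
merged = merge_glyphs(sample, equivalences)

print("before:", [round(h, 3) for h in char_entropies(sample)])
print("after :", [round(h, 3) for h in char_entropies(merged)])
```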
Applying current thinking to existing manuscripts
If we treated capital letters in ordinary manuscripts the same way (assuming they are separate "glyphs" with separate meanings), how different would our analysis of those texts be? We might look at a capital A and say: "Yes, this rare glyph only occurs at the beginning of words, never in the middle or at the end. Very unusual for natural language."
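The capital-letter comparison is easy to try on any ordinary text. The sketch below counts, for each glyph, what share of its occurrences are word-initial, treating upper and lower case as unrelated symbols. The function name and the short Latin sample are placeholders chosen for illustration.

```python
from collections import Counter

def initial_share(text):
    """For each glyph, the fraction of its occurrences that are word-initial.
    Upper and lower case are deliberately treated as different glyphs."""
    total, initial = Counter(), Counter()
    for word in text.split():
        total.update(word)
        initial[word[0]] += 1
    return {glyph: initial[glyph] / total[glyph] for glyph in total}

# Short Latin sample, for illustration only.
sample = "Arma virumque cano Troiae qui primus ab oris Italiam fato profugus"
shares = initial_share(sample)

# Glyphs that never appear anywhere except at the start of a word.
only_initial = sorted(g for g, share in shares.items() if share == 1.0)
print(only_initial)
```

Any glyph that lands in that list would look just as positionally constrained as q does in the VMS, even though in the capital-letter case we know it is only a variant form.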
The same could be said for the various minims I discussed above. If we had no context for Latin, wouldn't we assume that "minim" is really "iiiiiiiin" or "mmmi" or something? It's already difficult to transcribe those words with full context given!
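To get a feel for how bad the minim problem is, here is a small sketch that collapses Latin letters into bare strokes and counts how many letter sequences could produce a given run. It assumes a simplified model where i is one stroke, n and u are two, and m is three; the names `LETTER_STROKES`, `to_strokes` and `count_readings` are my own.

```python
from functools import lru_cache

# Simplified minim model (an assumption): strokes needed per letter.
LETTER_STROKES = {"i": 1, "n": 2, "u": 2, "m": 3}

def to_strokes(text):
    """Rewrite every minim letter as a run of 'i' strokes; keep other letters."""
    return "".join(
        "i" * LETTER_STROKES[ch] if ch in LETTER_STROKES else ch
        for ch in text
    )

@lru_cache(maxsize=None)
def count_readings(strokes):
    """Number of distinct letter sequences that fill exactly `strokes` minims."""
    if strokes == 0:
        return 1
    return sum(
        count_readings(strokes - n)
        for n in LETTER_STROKES.values()
        if n <= strokes
    )

word = "minim"
run = to_strokes(word)
print(run, "has", len(run), "strokes and", count_readings(len(run)), "possible readings")
```

Even a run of three strokes already has six readings under this model (iii, in, iu, ni, ui, m), so a transliteration that always picks one of them is making a real choice.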
Anyway. I'm posting this primarily to discuss which of our current assumptions are valid, and which may not be. If we've made an incorrect analysis somewhere, and that has become the bedrock of further discussion, it leads to people being politely dismissed straight away. We should make sure that our foundation is not made of sand.
