How's this for a method of creating a text with oddly low entropy?
- Start with a plaintext in a language written in an alphabet.
- Take each word and sort its letters into some fixed order, possibly alphabetical, possibly some other order.
- If there are now clusters of duplicates of the same letter, maybe remove duplicates, or maybe slightly change the order to move them apart.
- Remove the spaces between the words, and insert new spaces wherever you like.
- Replace the letters with symbols, using a simple substitution cipher. Small shifts in letter position are allowed for aesthetic reasons, such as moving "c" before a gallows to create a benched gallows.
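The steps above can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the sort order defaults to plain alphabetical, duplicates are always dropped rather than moved apart, and the names `encode_word`, `encode_text` and the `cipher` mapping are made up for the example. The free re-spacing and the aesthetic shifts are left out, since those are done by hand.

```python
import string

# Assumed sort order for the example; any fixed permutation of the alphabet works.
ORDER = string.ascii_lowercase


def encode_word(word, order=ORDER, dedupe=True):
    """Sort the letters of one word into the fixed order, optionally dropping repeats."""
    rank = {ch: i for i, ch in enumerate(order)}
    letters = sorted((c for c in word.lower() if c in rank), key=rank.__getitem__)
    if dedupe:
        kept = []
        for c in letters:
            # After sorting, duplicates are adjacent, so comparing with the
            # previous kept letter is enough to remove them.
            if not kept or kept[-1] != c:
                kept.append(c)
        letters = kept
    return "".join(letters)


def encode_text(text, order=ORDER, cipher=None, dedupe=True):
    """Encode every word, remove the spaces, then apply a simple substitution."""
    stream = "".join(encode_word(w, order, dedupe) for w in text.split())
    if cipher:
        # cipher is a hypothetical dict from plaintext letter to symbol.
        stream = stream.translate(str.maketrans(cipher))
    return stream


if __name__ == "__main__":
    print(encode_text("the letter", cipher={"e": "o", "t": "k"}))
```

New spaces would then be inserted into the resulting stream wherever you like, which is exactly the part that makes the original word breaks hard to recover.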
Based on a not-terribly-reproducible analysis of the first 10k EVA-letters of the manuscript, I came up with "qptkfscheoldaginmry" as a guess at the order of the letters. I then manually read through the first 10k EVA-letters and inserted spaces where the letters "jumped back in the order", that is, at the supposed breaks between words in the plaintext. This was surprisingly subjective. For example, my rules allow "qotor" to be a single word containing two of whichever letter maps to EVA-o, with one of them moved before the "t" to avoid the double "o". But it also seems that "qo" is a common two-letter word. So should "qotor" be split into "qo" and "tor"?
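For comparison, here is a strict, mechanical version of the "jumped back in the order" rule as a Python sketch. It is not what I actually did by hand: it has none of the subjective exceptions, so it always breaks a word at every backward jump, and the function name `split_on_jumps` is made up for the example. It treats each EVA character separately, the same way the guessed order string does.

```python
# Guessed letter order from the post.
ORDER = "qptkfscheoldaginmry"


def split_on_jumps(stream, order=ORDER):
    """Split a stream of EVA letters wherever a letter ranks earlier than the previous one."""
    rank = {ch: i for i, ch in enumerate(order)}
    words = []
    current = ""
    last = -1
    for ch in stream:
        if ch not in rank:
            continue  # ignore spaces and any character outside the guessed order
        r = rank[ch]
        # A strictly lower rank is a jump back; a repeated letter is not.
        if current and r < last:
            words.append(current)
            current = ""
        current += ch
        last = r
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    print(split_on_jumps("qotor"))
```

Under this strict rule "qotor" always comes out as "qo" and "tor", so the single-word reading is only available if you allow the exception for moving a duplicate letter.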
The most common apparent two-letter words are (anagrams of): qo ty ky ch ol dy sy sh or da so.
The most common three-letter words are (anagrams of): cho chy sho shy dar tor kol tol kor tey she ody car cha.
The most common four-letter words are (anagrams of): chor tchy chol kchy shol shor dain chey shey char keey pchy.
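Lists like these can be tallied with a small helper once the stream has been split into words. The sketch below is an assumption about how one might do it, not a record of my actual process; `common_by_length` is a made-up name, and the sample words in the usage part are invented input rather than manuscript counts.

```python
from collections import Counter, defaultdict


def common_by_length(words, top=5):
    """Return the most frequent words for each word length, most common first."""
    by_len = defaultdict(Counter)
    for w in words:
        by_len[len(w)][w] += 1
    return {
        n: [w for w, _ in counts.most_common(top)]
        for n, counts in sorted(by_len.items())
    }


if __name__ == "__main__":
    # Invented sample input, only to show the shape of the result.
    sample = ["qo", "qo", "ty", "chor", "chor", "chor", "chol"]
    print(common_by_length(sample))
```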
I notice some patterns among the common (anagrams of) words: cho sho chy shy, chor chol shor shol, tor kor tol kol. It's certainly possible that a real language could have words like this, especially if the patterns only look as strict as they do because of the sorted-anagram process. It's still striking, though.
There are some words that just contain several EVA-e: ee, eee, eeee. Roman numerals, with "iiii" instead of "iv" for some reason?
I attempted to match the common words here with (anagrams of) common words in Latin, Italian (modern; I couldn't find historical word frequencies), and a very small corpus of historical French. Some matches seemed promising for a while, but in the end none of my attempts worked out.