(12-07-2020, 09:15 PM)Mark Knowles Wrote: If I was creating a hard to crack cipher I would start with a preponderance of filler text, the more filler the harder to crack. However it would be interesting to see how others might go about producing a difficult to crack "simple" cipher.
This is a fun thought experiment, Mark.
Just so we're clear, your use of the word "cipher" here implies something written to convey information reliably to a person other than the writer, correct?
The only reason I ask is because "ciphers" and "codes" share a semi-porous border with "memory aids". I'm a Freemason, and have encountered this sort of "encoding" in learning the rituals for the Craft. The books of ritual and ceremony that Masonic lodges use are written in what appears to outsiders to be a code, but is more of a vague memory aid to Brethren who have heard the rituals spoken many times. Both of these perceptions are intentional. Freemasonry is at its heart a Hermetic mystery school: the impact of the Craft is in the order and pace at which it is revealed to initiates. So having an uninitiated person unable to read the ritual, but an initiate able to read these books lying around the Lodge, is most proper.
When the writer of a code has only his future self as the intended audience, he can encode it in any idiosyncratically abbreviated form, so long as it contains enough information to cue the right associations in his mind, reliably every time he needs the information. I've been accused many times of taking liberties with the term "cipher". I get the sense that calling an idiosyncratic memory aid for oneself, which is impenetrable to outside readers, a "cipher", is putting a very short skirt on this word. Like a cipher it hides information. But it isn't a cipher, because it's not a systematic form of communication between two humans.
I think I'd go the opposite way from you, and have the information extremely pared down. It'd be as lossy a compression as the odds of misunderstanding would bear. Keeping my audience / intended recipient(s) in mind, I'd make up for this loss of informational density by relying on shared cultural capital. Compare rebuses and Cockney rhyming slang. These encodings involve a loss of information density, and they rely on shared cultural capital and a bit of human ingenuity and imagination to fill in the gaps. Here goes:
- Simplify the phonology and syllable structure of English, the way a lot of English-based pidgins do. Write your cleartext in Simple English, and apply the simplified phonology and syllable structure to it. Nowadays I'd probably use a text-to-speech app to read this out loud, to make sure I could still comprehend it and that there were no problematic ambiguities. Adjust the wording as needed if such problems come up.
- Calculate the most frequent ~200 syllables in a large sample of Simple English Wikipedia text. Apply your phonology and syllable structure simplification algorithm to this rank list. Collapse the list by eliminating duplicates created by your algorithm. Crop this list keeping only the top 100.
- Make a table with 20 consonants on one axis and 5 vowels on the other.
- Fill in the table, with each syllable assigned a consonant-vowel bigram that normally sounds nothing like the syllable it encodes.
- Replace each syllable in your text with the corresponding bigram from your table. If your cleartext contains a syllable for which you have no ciphertext bigram, either choose the closest sounding one, or insert a bigram that you've chosen to designate as a wildcard. Or reword your cleartext, trying to convey the message using only the most common 100 syllables of your lossily-compressed English phonology. (I'm really curious to see how hard this will be.)
- Compose a 100-line poem or song. Make each line's first word one that begins with a consonant from your table, the second word beginning with the vowel sharing the same square, and the final word ending with the Simple English syllable encoded by that consonant-vowel pair. I'd make it a coherent but memorably ridiculous poem, with regular meter and rhyme, about a topic having nothing to do with my cleartext. I'd hand-write the poem in a greeting card ostensibly as an artistic gift to my message recipient. Even better if the topic of the poem touched on some sort of joyous occasion the recipient recently celebrated, and used a great deal of clichéd set phrases. Give the "celebratory poem" to the recipient in advance of the encrypted message.
- Give the recipient the directions for extracting the cipher key from this poem under separate cover.
- Once you're fairly sure the recipient has the poem and is clear about how to use it, send the encrypted message.
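For anyone who wants to play with the table steps above, here is a rough Python sketch of them. It is only a toy under loud assumptions: syllables are plain spelled strings, `simplify` is a made-up placeholder that merges a few letters rather than real sounds, the 20 consonant letters and the wildcard bigram `xu` are arbitrary picks of mine, and the bigrams are shuffled at random with no check that they "sound nothing like" the syllable. Reserving a wildcard also leaves 99 free cells rather than 100.

```python
import random
from collections import Counter
from itertools import product

CONSONANTS = "bcdfghjklmnpqrstvwxz"  # 20 letters; an arbitrary, hypothetical choice
VOWELS = "aeiou"
WILDCARD = "xu"                      # hypothetical bigram meaning "no table entry"

# Toy stand-in for the phonology simplification: merges a few voiced and
# voiceless spellings. A real version would work on sounds, not letters.
MERGE = str.maketrans("bdgvz", "ptkfs")


def simplify(syllable):
    return syllable.lower().translate(MERGE)


def top_syllables(syllables, sample=200, keep=100):
    """Rank raw syllables, simplify the top `sample`, drop duplicates, crop to `keep`."""
    ranked = [s for s, _ in Counter(syllables).most_common(sample)]
    collapsed = list(dict.fromkeys(simplify(s) for s in ranked))
    return collapsed[:keep]


def build_table(ranked, seed=0):
    """Give each syllable its own consonant-vowel bigram, shuffled at random."""
    bigrams = [c + v for c, v in product(CONSONANTS, VOWELS) if c + v != WILDCARD]
    if len(ranked) > len(bigrams):
        raise ValueError("more syllables than free bigrams")
    random.Random(seed).shuffle(bigrams)
    return dict(zip(ranked, bigrams))


def encode(syllables, table):
    """Replace each syllable with its bigram, or the wildcard if it has none."""
    return "".join(table.get(simplify(s), WILDCARD) for s in syllables)


def decode(ciphertext, table):
    """Split into two-letter chunks and look each one up; '?' marks a wildcard."""
    reverse = {bigram: syl for syl, bigram in table.items()}
    pairs = [ciphertext[i:i + 2] for i in range(0, len(ciphertext), 2)]
    return [reverse.get(p, "?") for p in pairs]


corpus = ["the", "in", "the", "to", "do", "in", "the"]  # made-up sample
table = build_table(top_syllables(corpus))
message = encode(["in", "do", "zap"], table)
print(message, decode(message, table))
```

Note that "do" comes back as "to" and "zap" comes back as "?". That is the lossiness doing its work: the recipient has to repair both from context.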
I predict this method would produce a ciphertext with a very rigid syllable structure (CV). I could see it having some reduplication and quasi-reduplication, due to the lossy compression of the cleartext's phonology. I think average word length would be shortened, since every syllable is represented as two characters, no more, no less. I think the Shannon entropy values for the ciphertext would be significantly lower than those of the cleartext.
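One way to check the entropy prediction once there is a sample ciphertext is to compare single-character Shannon entropy for the two texts. The sketch below assumes that plain first-order character entropy is the measure of interest, which is my guess; conditional (second-order) entropy would need a different calculation. The function name is my own.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """First-order character entropy of `text`, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Two equally likely characters carry exactly one bit each.
print(shannon_entropy("abab"))  # 1.0
```

Running it on the cleartext and on the ciphertext, each with spaces handled the same way, would give two numbers to compare.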
As I've mentioned, this type of encryption relies on several layers of murkiness used together, and relies on human cognition to penetrate each layer of murkiness. My aim was for these interpretive steps to be fairly easy for an informed human to execute, but fairly hard for a computer, or an uninformed human. I'll leave it up to Marco and Nablator to render judgement on how close to that aim I get, though. The biggest downside is the potential for the message to lose too much information in the encryption process to be reliably recovered by the recipient. I imagine it would work OK for some cleartext messages and subject matters, and not so well for others.
I'll create a separate thread and do a demonstration of my encryption method some time next week, if anyone's interested (Kids are away at camp, woo-hoo!).