The Voynich Ninja

Full Version: How easy is it to create a cipher which is very hard to break?
Pages: 1 2 3 4 5
(15-07-2020, 11:02 PM)Mark Knowles Wrote:
...
Obviously if I make a perfect copy that is of little interest, but it should be perfectly OK for me to copy bits and paste them together if and when I wish.

Again it seems unnecessary to prevent someone from looking at Voynichese as and when they produce their Voynichese like text.

These restrictions would certainly make things harder, but it seems to me to be completely unnecessary. The author of the Voynich could copy portions if he/she wished. The author of the Voynich could look at the rest of what he/she had already written. Why impose restrictions that did not apply to the author?...


Do it however you wish, Mark.

Just keep in mind that the bigger the bits, the less chance there is of demonstrating your assertion that filler text with Voynichese-like properties should be "relatively easy to produce".


Big bits do not make a distinction between meaningful or meaningless (filler) text, since we don't know yet whether VMS vords are meaningful or meaningless.
(15-07-2020, 11:02 PM)Mark Knowles Wrote:
...
I think it is important to distinguish between manually produced text and automatically produced text such as Rugg and others have done.


If I create a system or method or rule-set for generating Voynichese-like text and sit down and write it out with pen and paper, do you consider that to be "manually produced" text or "automatically produced" text?

If I take the same system or method or rule-set and program it into the computer and have the computer print it out, do you consider that to be manually or automatically produced?

Do you consider Hyde and Rugg's grille method to be automatic or manual?

Do you consider Timm and Schinner's method to be automatic or manual?


I am not sure which distinction you are making when you say manually or automatically produced text.
(16-07-2020, 02:02 AM)-JKP- Wrote: If I create a system or method or rule-set for generating Voynichese-like text and sit down and write it out with pen and paper, do you consider that to be "manually produced" text or "automatically produced" text?


Yes, it's very fair to ask for a clarification. I would call that "automatically produced text", as you are following a strict algorithm or rule set like a computer. Whether the algorithm is implemented by a computer or a human is, from my perspective, immaterial.

Quote:
If I take the same system or method or rule-set and program it into the computer and have the computer print it out, do you consider that to be manually or automatically produced?


I would say that is "automatically produced".

Quote:
Do you consider Hyde and Rugg's grille method to be automatic or manual?


Quote:
Do you consider Timm and Schinner's method to be automatic or manual?


I wouldn't like to say without having studied their work in more detail.

Quote:
I am not sure which distinction you are making when you say manually or automatically produced text.

I would be inclined to use the term "automatic" when it is the result of a clearly designed rule set and can be implemented by an algorithm and potentially implemented by a computer. I use the term "manual" when talking about text produced by a person as a result of a mixture of conscious, semi-conscious and unconscious processes. This way of producing text is very hard to convert to an algorithm. I am a believer in strong AI, so of course in principle I believe this can be converted to an algorithm as can any neural processes, but in practice this would seem very difficult to near impossible. I am not inclined to the view that the author of the Voynich used tools, like a cardan grille, or a formal algorithm or rule set to generate the text in the Voynich, but rather a more ad hoc approach.
Mark Knowles Wrote:...
I would be inclined to use the term "automatic" when it is the result of a clearly designed rule set and can be implemented by an algorithm and potentially implemented by a computer. I use the term "manual" when talking about text produced by a person as a result of a mixture of conscious, semi-conscious and unconscious processes. This way of producing text is very hard to convert to an algorithm.


If you feel the VMS may have been more ad hoc, rather than (for example) rule-based, you should take a look at some of the examples that were submitted to Marco when he asked volunteers to create some pages of meaningless text. Meaningless text would qualify as "filler" text.

They are quite varied and there are none that resemble VMS text:

[link]
[attachment=4565]To the PDF file ( Mosaic )
Basic cryptology:
Signs, symbols and images are only meant to cause confusion. Thus, the VM signs can be assumed to be intended to simulate a foreign spelling and language.

In the Mosaic example it is quite easy to transform the signs back into a known string of characters.
To keep it simple, I will also say that it is a 1-to-1 encryption.
The original text was surely written by hand.
(12-07-2020, 09:15 PM)Mark Knowles Wrote: If I was creating a hard to crack cipher I would start with a preponderance of filler text, the more filler the harder to crack. However it would be interesting to see how others might go about producing a difficult to crack "simple" cipher.

This is a fun thought experiment, Mark. 

Just so we're clear, your use of the word cipher here implies something written to convey information reliably to a separate person from the writer, correct? The only reason I ask is because "ciphers" and "codes" share a semi-porous border with "memory aids". I'm a Freemason, and have encountered this sort of "encoding" in learning the rituals for the Craft. The books of ritual and ceremony that Masonic lodges use are written in what appears to be a code to outsiders, but is more of a vague memory aid to Brethren who have heard the rituals spoken many times. Both of these perceptions are intentional: Freemasonry is at its heart a Hermetic mystery school, and the impact of the Craft is in the order and pace at which it is revealed to initiates. So having an uninitiated person unable to read the ritual, but an initiate able to read these books lying around the Lodge, is most proper.

When the writer of a code has only his future self as the intended audience, he can encode it in any idiosyncratically abbreviated form, so long as it contains enough information to cue the right associations in his mind, reliably every time he needs the information. I've been accused many times of taking liberties with the term "cipher". I get the sense that calling an idiosyncratic memory aid for oneself, which is impenetrable to outside readers, a "cipher", is putting a very short skirt on this word. Like a cipher it hides information. But it isn't a cipher, because it's not a systematic form of communication between two humans.

I think I'd go the opposite way from you, and have the information extremely pared down. It'd be as lossy a compression as the odds of misunderstanding would bear. Keeping my audience / intended recipient(s) in mind, I'd make up for this loss of informational density by relying on shared cultural capital. Compare rebuses and Cockney rhyming slang. These encodings involve a loss of information density, and they rely on shared cultural capital and a bit of human ingenuity and imagination to fill in the gaps. Here goes:

  1. Simplify the phonology and syllable structure of English, the way a lot of English-based pidgins do. Write your cleartext in Simple English, and apply the simplified phonology and syllable structure to it. Nowadays I'd probably use a text-to-voice app to read this out loud, to make sure I could still comprehend it and that there were no problematic ambiguities. Adjust the wording as needed if such problems come up.
  2. Calculate the most frequent ~200 syllables in a large sample of Simple English Wikipedia text. Apply your phonology and syllable structure simplification algorithm to this rank list. Collapse the list by eliminating duplicates created by your algorithm. Crop this list keeping only the top 100.
  3. Make a table with 20 consonants on one axis and 5 vowels on the other.
  4. Fill in the table, with each syllable assigned a consonant-vowel bigram that normally sounds nothing like the syllable it encodes.
  5. Replace each syllable in your text with the corresponding bigram from your table. If your cleartext contains a syllable for which you have no ciphertext bigram, either choose the closest sounding one, or insert a bigram that you've chosen to designate as a wildcard. Or reword your cleartext, trying to convey the message using only the most common 100 syllables of your lossily-compressed English phonology. (I'm really curious to see how hard this will be.)
  6. Compose a 100-line poem or song. Make each line's first word one that begins with a consonant from your table, the second word beginning with the vowel sharing the same square, and the final word ending with the Simple English syllable encoded by that consonant-vowel pair. I'd make it a coherent but memorably ridiculous poem, with regular meter and rhyme, about a topic having nothing to do with my cleartext. I'd hand-write the poem in a greeting card ostensibly as an artistic gift to my message recipient. Even better if the topic of the poem touched on some sort of joyous occasion the recipient recently celebrated, and used a great deal of clichéd set phrases. Give the "celebratory poem" to the recipient in advance of the encrypted message.
  7. Give the recipient the directions for extracting the cipher key from this poem under separate cover.
  8. Once you're fairly sure the recipient has the poem and is clear about how to use it, send the encrypted message.
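Steps 3 to 5 above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the poster's actual implementation: the consonant and vowel inventories, the toy syllable list, and the names `build_table` and `encode` are all made up for the example. The cleartext is assumed to be already split into syllables by hand, and reserving the last cell of the table as a wildcard is one possible reading of step 5.

```python
from itertools import product

# Hypothetical inventories: 20 consonants x 5 vowels = 100 bigrams (step 3).
CONSONANTS = "bcdfghjklmnpqrstvwxz"
VOWELS = "aeiou"


def build_table(syllables):
    """Assign each syllable a consonant-vowel bigram (step 4).

    For simplicity this pairs syllables with bigrams in order. The post
    asks for bigrams that sound nothing like the syllable they encode,
    which would need a manual check that is not attempted here.
    """
    bigrams = [c + v for c, v in product(CONSONANTS, VOWELS)]
    if len(syllables) > len(bigrams) - 1:
        raise ValueError("too many syllables for the table")
    # Assumption: keep the last bigram ("zu") as the wildcard.
    return dict(zip(syllables, bigrams)), bigrams[-1]


def encode(syllabified_words, table, wildcard):
    """Replace each syllable with its bigram, or the wildcard (step 5)."""
    return " ".join(
        "".join(table.get(syl, wildcard) for syl in word)
        for word in syllabified_words
    )


# Toy cleartext, already split into syllables by hand.
table, wildcard = build_table(["the", "cat", "sat", "on", "mat"])
message = [["the"], ["cat"], ["sat"], ["on"], ["the"], ["mat"], ["a", "gain"]]
ciphertext = encode(message, table, wildcard)
print(ciphertext)  # ba be bi bo ba bu zuzu
```

Every encoded word comes out as a run of consonant-vowel pairs, and syllables missing from the table collapse onto the same wildcard, which is where information gets lost.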
I predict this method would produce a ciphertext with a very rigid syllable structure (CV). I could see it having some reduplication and quasi-reduplication, due to the lossy compression of the cleartext's phonology. I think average word length would be shortened, since every syllable is represented as two characters, no more no less. I think the Shannon entropy values for the ciphertext would be significantly lower than those of the cleartext.
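The entropy prediction could be checked with something like the sketch below. It computes first-order (single-character) Shannon entropy, which is an assumption: the post does not say which entropy measure is meant, and conditional entropy over character pairs may be the more relevant one. The function name `shannon_entropy` and the sample strings are placeholders.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """First-order Shannon entropy of `text`, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Placeholder strings: substitute a real cleartext and its ciphertext.
cleartext = "the cat sat on the mat"
ciphertext = "ba be bi bo ba bu"

print(shannon_entropy(cleartext))
print(shannon_entropy(ciphertext))
```

Short toy strings like these say very little, so a real comparison would need samples of comparable length, at least a few thousand characters each.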

As I've mentioned, this type of encryption relies on several layers of murkiness used together, and relies on human cognition to penetrate each layer of murkiness. My aim was for these interpretive steps to be fairly easy for an informed human to execute, but fairly hard for a computer, or an uninformed human. (I'll leave it up to Marco and Nablator to render judgement on how close to that aim I get, though.) The biggest downside is the potential for the message to lose too much information in the encryption process to be reliably recovered by the recipient. I imagine it would work OK for some cleartext messages and subject matters, and not so well for others.

I'll create a separate thread and do a demonstration of my encryption method some time next week, if anyone's interested (Kids are away at camp, woo-hoo!).
On with the Mosaic case.
Now, assume that known encryption systems are used. In this example it is a simple Caesar encryption, so the only thing left to understand is the language. One would think that makes it simple.

For those who do not know the Caesar cipher technique: the letters of the alphabet are simply shifted, e.g. A=C, B=D, C=E...
Simple, and known long before the time of the VM.
Back to the mosaic case
The language used in this riddle is German. Encoded with Caesar, shift 3.
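For readers who want to try it, here is a minimal Caesar shift sketch. The function name `caesar` and the sample phrase are made up for illustration; the actual Mosaic text is not reproduced here. Note that under the usual convention a shift of 3 maps A to D, while a mapping of A to C corresponds to a shift of 2, so it may be worth trying both when decoding.

```python
import string


def caesar(text, shift):
    """Shift each ASCII letter by `shift` places, wrapping around the alphabet.

    Anything else (spaces, digits, umlauts, eszett) is left unchanged.
    """
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = shift % 26
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)


# Made-up sample phrase, not the Mosaic text.
encoded = caesar("Guten Morgen", 3)
decoded = caesar(encoded, -3)
print(encoded)  # Jxwhq Prujhq
print(decoded)  # Guten Morgen
```

Decoding is just the same function with the negated shift.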
The original text can be found here: [link]

Someone who knows German will quickly discover that not many of the words are in the dictionary. There is no trace of grammar. German is not always German.
It doesn't take much for it to become hard to understand.

This little attempt is only meant to show what I expect to find behind the VM glyphs.
I'm not sure, but I believe that this difference between the way of writing 500 years ago and today also applies to other languages: Italian, Spanish, French, etc.

Here I also add the link to Torsten's article, where the details are well described: [link]

Thanks to Rene for the link to the letters.
Interesting that the different German dialects reveal the region of origin together with the sender.
Makes it much easier

Try this.

[Image: SLWbN.jpg]

This is a monogram of the entire Roman alphabet and Arabic numerals. Watch [link] to see each character outlined in turn.

A code could be made out of some sort of directions for where to locate each character in this diagram, each label obviously unique to the character it encodes. I'd want the labels to be brief and quite easy to use for anyone familiar with the above diagram and looking at it in front of them, but not so detailed as to allow someone who'd never seen the diagram, and had no idea what it was, to easily catch on. Maybe the beginning and ending coördinates for tracing the letter in the diagram, along with the number of line intersections traversed in tracing the letter between these two points.
If you encoded a paragraph with the beginning and ending coordinates of a shape inside a monogram, the frequency distribution of each coordinate pair, and its location within words, would match the distribution and location of the letters it encodes in whatever language it is.


In other words, a b c and 101-102, 54-32, 21-10  will be the same in how they will be distributed within a paragraph regardless of how you write them (letters, numbers, coordinates, unique symbols), and that gives it away. It's still a substitution cipher.
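A small sketch of that point: relabelling symbols one-to-one leaves the frequency profile untouched. The coordinate labels are the ones from the example above; the sample string and the name `frequency_profile` are made up for illustration.

```python
from collections import Counter


def frequency_profile(tokens):
    """Return token counts sorted from most to least frequent, labels dropped."""
    return sorted(Counter(tokens).values(), reverse=True)


# Coordinate-style labels standing in for letters, one label per letter.
labels = {"a": "101-102", "b": "54-32", "c": "21-10"}

plain = list("abacabcaab")            # made-up sample text
coded = [labels[ch] for ch in plain]  # same text, written as coordinate pairs

print(frequency_profile(plain))  # [5, 3, 2]
print(frequency_profile(coded))  # [5, 3, 2]
```

The two profiles are identical, so frequency analysis works on the coordinate version exactly as it would on the letters.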