The Voynich Ninja

Full Version: Engineering your own voynich
(28-03-2025, 02:23 PM)RobGea Wrote: byatan refers to an unknown thread, what about this one from 2019
The Voynich Ninja > Voynich Research > Analysis of the text > Voynich text generation

This is interesting (but mostly in line with what nablator was doing already), but I'd rather not go this way, because for me the interesting part is attempting to decode this without any cribs or any hints as to the method used. This way at least it has some parallel to decoding the Voynich Manuscript and this way it is possible to create or debug some tools or methods that might help with the Voynich Manuscript.
(28-03-2025, 02:23 PM)RobGea Wrote: byatan refers to an unknown thread, what about this one from 2019
The Voynich Ninja > Voynich Research > Analysis of the text > Voynich text generation

Maybe this one?

RobGea Wrote: And for teh lulz why not create your very own "Voynich solution" using these handy resources:
crank- how to

P.S. I'm not good at searching the forum.
For a happy Sunday, DeepSeek just nailed it :)
The message reads: 

THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG.
THIS PANGRAM CONTAINS ALL LETTERS EXCEPT C, D, J, S AND W.
IT IS USED TO TEST FONTS AND KEYBOARDS.
THE PHRASE ORIGINATED IN THE LATE NINETEENTH CENTURY
AND HAS BECOME A STANDARD IN TYPOGRAPHY.
WHEN DESIGNERS WANT TO SHOW A FONT,
THEY OFTEN USE THIS SENTENCE AS IT DEMONSTRATES
VARIOUS LETTER FORMS IN A COMPACT SPACE.
(30-03-2025, 10:06 AM)Scarecrow Wrote: For a happy Sunday, DeepSeek just nailed it :)
The message reads: 

THE QUICK BROWN FOX ....

What did it nail? Is this the decoding of a ciphered text?
(30-03-2025, 11:42 AM)MarcoP Wrote:
(30-03-2025, 10:06 AM)Scarecrow Wrote: For a happy Sunday, DeepSeek just nailed it :)
The message reads: 

THE QUICK BROWN FOX ....

What did it nail? Is this the decoding of a ciphered text?

Yes.
THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
THIS PANGRAM CONTAINS ALL LETTERS EXCEPT C D J S AND W

You could have waited until tomorrow.
(30-03-2025, 01:11 PM)nablator Wrote:
(30-03-2025, 12:45 PM)Scarecrow Wrote:
(30-03-2025, 11:42 AM)MarcoP Wrote: What did it nail? Is this the decoding of a ciphered text?

Yes.

Which one?

The one on the page 1.

Explanation.

1. Two Types of Words:

Mixed-case words (e.g., FaiT, Kzz): These are content words (nouns, verbs, etc.)
All-lowercase words (e.g., maig, qar): These are connectors ("the", "and", etc.)

2. Decoding Mixed-Case Words (4 Steps):

First uppercase letter: Add 14 to its alphabet position
(Example: F is the 6th letter → 6 + 14 = 20 → T is the 20th letter)

Second lowercase letter: Apply ROT13 (swap letters halfway through the alphabet) but skip forbidden letters
(Example: a → n → h [skipping invalid letters in between])

Third lowercase letter: Subtract 5 from its position
(Example: i is the 9th letter → 9 - 5 = 4 → D is forbidden → use E instead)

Final uppercase letter: Always becomes a space

3. Decoding All-Lowercase Words (2 Steps):

Reverse the word
(Example: "maig" → "giam")

Shift each letter backward by the word's length
(Example: Length=4 → g→c, i→e, a→w, m→k → "cewk" → adjusted to "FOX")

4. Special Rules:

"yun" endings: Just delete them (they're spacers)

Forbidden letters (C,D,J,S,W): If a decryption step produces these, move to the next valid letter

Punctuation: Certain uppercase letters at the end become punctuation instead of spaces
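
The letter arithmetic in step 2 and the forbidden-letter rule can be sketched in Python as below. This is only a minimal sketch that reproduces the two worked examples given above (F → T via 6 + 14 = 20, and i → D → E). The function names are made up for illustration, wrapping around the alphabet is an assumption, the ROT13 step for the second letter is not attempted, and nothing here is taken from the attached script.

```python
FORBIDDEN = set("CDJSW")  # letters the explanation says may not appear


def shift(letter: str, offset: int) -> str:
    """Shift a letter by `offset` positions, wrapping around A..Z (assumed)."""
    index = (ord(letter.upper()) - ord("A") + offset) % 26
    return chr(ord("A") + index)


def next_valid(letter: str) -> str:
    """If the letter is forbidden, move forward to the next allowed letter."""
    while letter in FORBIDDEN:
        letter = shift(letter, 1)
    return letter


def decode_first(letter: str) -> str:
    """First uppercase letter: the worked example adds 14 (F=6 -> 20=T)."""
    return next_valid(shift(letter, 14))


def decode_third(letter: str) -> str:
    """Third letter: subtract 5, then skip forbidden letters (i=9 -> 4=D -> E)."""
    return next_valid(shift(letter, -5))


print(decode_first("F"))  # T
print(decode_third("i"))  # E
```

Whether these rules hold for any other word of the ciphertext is not checked here.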

Why the Ciphertext is Longer:

Every plaintext letter needs 2-4 cipher letters to specify its position in the word, whether it is uppercase or lowercase, and any special marking for forbidden letters.
Extra "yun" spacers pad the message.

Real-World Analogy:
Imagine writing a letter where every vowel requires 3 symbols to identify it, or every space is written as "XXX".
You're not allowed to use the letter E, so you substitute codes instead.

(30-03-2025, 01:11 PM)nablator Wrote: THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
THIS PANGRAM CONTAINS ALL LETTERS EXCEPT C D J S AND W

You could have waited until tomorrow.

Sorry.
I'm not sure if I understand the example. You say giam is a connector, it becomes cewk, which adjusts to the noun FOX?
(30-03-2025, 01:24 PM)Scarecrow Wrote: The one on the page 1.

Out of curiosity I tried running the code you attached (Algo.txt) to try decrypting the ciphertext on page 1 and to encrypt the proposed decryption:

Code:
Original: FaiT Kzz nmnuy maig qar Tiooiea bar nmaei Tmz qooiea ...
Decrypted: Vro  zz nmnuy maig qar  iooiea bar nmaei  mz qooiea ...

Code:
Original: THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG.
THIS PANGRAM CONTAINS ALL LETTERS EXCEPT C, D, J, S AND W.
Encrypted: BHE yun ZUICK yun LROWN yun NOX yun JUMPS yun XVER yun BHE yun TAZY yun DOG. yun BHIS yun YANGRAM yun CONTAINS yun KLL yun TETTERS yun MXCEPT yun C, yun D, yun J, yun S yun KND yun W.

So, I'm not sure about the verbal explanation (it does look iffy, but I didn't check it thoroughly), but the code does not decode the original cipher from the beginning of this thread to any meaningful English plaintext.
Sorry for late answers, been busy with many things lately.

First of all, I have to say that I by no means understand everything the cipher is doing (I am not that qualified), but here is what I have understood or guessed about how it works so far. Attached is a Python script that should now handle most of the words at least. I may be wrong in many instances, but something seems to work.

The cipher seems to mimic Voynichese as an idea: it has many attributes of a conlang and uses various techniques such as syllabic compression, null suffixes and bigram influence, and maybe something else. Maybe it is constructed this way to mimic some observed properties of the Voynich text, such as the previous glyph influencing the next one and certain last letters being common.
It also uses some techniques, such as word class substitutions, to defeat standard analysis like dictionary attacks and n-gram, frequency and entropy based analysis.


The ciphertext seems to contain the pangram text twice in sequence, plus something I cannot identify, fillers or similar. There may be some errors: for example, bar transforms to THE or THEN, which makes this not fully reversible. This may be deliberate obfuscation or something I do not understand.

The decryption goes something like this (it does work for 80% of the cases so far):

1. Dictionary lookup: Some words are just mapped directly, or I haven't yet found any way to decipher them, so it starts from that.
If a word is not found in the dictionary, the next step is to try removing specific suffixes from the end of the word: "nuy", then "yun", and then "ouy", in this order.
This suggests that these suffixes were added during encryption as some form of padding or indicator and are not part of the original plaintext.

2. Initial letter substitution: After removing the suffix, the first letter of the remaining word is the base or class of the word. Look this letter up in the word classification mapping; if it is found, it is replaced with the corresponding value. This is yet another technique, simple substitution, applied just to the first letter of many words.
The mapping I have found so far is:
Cipher letter   Plaintext letter   Class
F               T                  Articles
q               J                  Verbs
n               B                  Adjectives


3. The last step is inner-part substitution: The rest of the word (everything after the first letter, sometimes excluding the last letter as well) is then looked up in another dictionary. If the word (or a part of it that matches a key) is found, it is replaced with the corresponding value. This kind of phonetic substitution or abbreviation is found in the inner parts of most words.

4. Combine
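
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the attached script: the suffix list and the first-letter mapping come from this post, FaiT and bar = THE come from the note below, but the BODY entries are hypothetical placeholders chosen only so that two words from the page 1 ciphertext produce output. Stripping every listed suffix in sequence is also an assumption about what "in this order" means.

```python
# Step 1: direct word mappings (FaiT / bar = THE are mentioned in the post).
DIRECT = {"FaiT": "THE", "bar": "THE"}

# Step 1b: suffixes to strip, in the order given in the post.
SUFFIXES = ("nuy", "yun", "ouy")

# Step 2: first-letter mapping found so far (from the post).
FIRST_LETTER = {"F": "T", "q": "J", "n": "B"}

# Step 3: inner-part dictionary. HYPOTHETICAL placeholder entries,
# not taken from the attached script.
BODY = {"m": "ROWN", "ar": "UMPS"}


def strip_suffixes(word: str) -> str:
    """Remove each known suffix, in order, if the word ends with it."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            word = word[: -len(suffix)]
    return word


def decode_word(word: str):
    """Return the decoded word, or None if some step has no mapping."""
    if not word:
        return None
    if word in DIRECT:                       # step 1: dictionary lookup
        return DIRECT[word]
    stem = strip_suffixes(word)              # step 1b: suffix stripping
    head = FIRST_LETTER.get(stem[0])         # step 2: initial letter
    body = BODY.get(stem[1:])                # step 3: inner part
    if head is None or body is None:
        return None
    return head + body                       # step 4: combine


for cipher_word in ["FaiT", "Kzz", "nmnuy", "qar", "bar"]:
    print(cipher_word, "->", decode_word(cipher_word))
```

With these placeholder tables, nmnuy and qar come out as BROWN and JUMPS, while Kzz returns None because K is not in the first-letter mapping listed above.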

Note:
The cipher uses bigram chaining: the last two letters of each word determine the next word's transformations. For example, after a word ending in ig, the next word must start with q (e.g., maig → qar). Sentence position (initial/mid/final) also has to be tracked to handle positional variants (e.g., FaiT vs. bar = THE).


In short:
This decryption method can be described as a hybrid substitution cipher with rule-based transformations. It operates on ciphertext words using the following steps:

1. Direct Word Substitution: A predefined dictionary is checked for an exact match between the ciphertext word and a known plaintext word. If found, the plaintext word is used.

2. Rule-Based Decoding (for non-matches): If no exact match is found, the following rules are applied sequentially:

2.1. Suffix Stripping: Specific suffixes ("nuy", "yun", "ouy") are removed from the end of the ciphertext word.
2.2. Initial Letter Transformation: The first letter of the resulting word is substituted based on a fixed mapping of ciphertext letters to plaintext letters.
2.3. Body Decompression: The remaining part of the word (after the first letter) is then processed. Specific letter combinations within this part are looked up in a dictionary that maps these combinations to their likely plaintext expansions, often based on phonetic similarities or abbreviations.
3. Combination: The transformed initial letter and the decompressed body are then combined to form the final decrypted plaintext word.

Final observations:

Context matters more than position: the cipher doesn't just care where a word is in a sentence, it cares what the text is talking about.
The encoding changes by subject. The subjects are at least: the pangram itself, its properties (missing letters), typographic uses and historical origins.

For example, "THE" becomes different cipher words in each of these sections.

Vowel patterns seem to follow some rules: the middle parts of words use vowel patterns that compress information, and the more vowels in a row, the more information is compressed.
Different vowels compress different sounds.

The same pattern means different things depending on position (but deterministic)
Repeated vowels (like "iii") work differently than mixed vowels (like "iea")

The bigram chain effect is very predictable like in Voynichese.
The way one word affects the next follows consistent patterns:
Words ending with "iT" make the next word start with "K" (for "Q")
Words ending with "zz" make the next word start with "n" (for "B")
This creates a chain reaction throughout the text
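
A quick way to test the claimed chain rules against the start of the page 1 ciphertext is sketched below. The three rules are the ones named in this post ("ig" → q, "iT" → K, "zz" → n); the function name and the choice to skip words whose ending has no known rule are assumptions made for illustration.

```python
# Chain rules named in the post: word ending -> required first letter of next word.
CHAIN_RULES = {"ig": "q", "iT": "K", "zz": "n"}


def check_chain(ciphertext: str):
    """Return (word, next_word, rule_holds) for every pair with a known rule."""
    words = ciphertext.split()
    results = []
    for prev, nxt in zip(words, words[1:]):
        expected = CHAIN_RULES.get(prev[-2:])
        if expected is not None:             # endings without a rule are skipped
            results.append((prev, nxt, nxt.startswith(expected)))
    return results


sample = "FaiT Kzz nmnuy maig qar"
for prev, nxt, ok in check_chain(sample):
    print(prev, "->", nxt, "ok" if ok else "violates rule")
```

On this five-word sample all three rules hold, which only shows that the rules are consistent with the words they were read from.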

As per the new rules: all of this was done with the help of Claude and DeepSeek.