The Voynich Ninja

Full Version: The VMS as a possible chain encryption (Mod 23).
I asked in another thread if a made-up chain cipher is easy to decrypt. Now I have written a Python script that decrypts a given Voynich word using the same method. The result is then mapped according to frequency analysis (EVA > Latin). Admittedly, this is a rather rudimentary approach; I am primarily interested in the decoding with mod 23. The method is simple enough to be considered and also much more effective than a simple substitution. It could be implemented practically with a letter disk (like an Alberti disk).

Here is the Python script and a list of possibly unabbreviated words (Stolfi):

chesokchoteody [f68r1, outer ring, near the bottom]
oepchksheey [f93r, top line, but looks like half of a Neal key]
qoekeeykeody [f105r, which I’d note is possibly the original first page of Q20A]
soefchocphy [f102r2, right edge, but right on the fold, very hard to read]
ykcheolchcthy [f68v3, first word of second line]
shdykairalam [f106v, last word of a line]
shetcheodchs [f43v, first word of a line]

Code:
# Reduced alphabet: No J, U, W
alphabet = list("ABCDEFGHIKLMNOPQRSTVXYZ")  # 23 letters

def char_to_pos(c):
    # Letter -> 1-based position in the 23-letter alphabet
    return alphabet.index(c.upper()) + 1

def pos_to_char(p):
    # Position (any integer) -> letter, wrapping around mod 23
    p = (p - 1) % len(alphabet) + 1
    return alphabet[p - 1]

def chain_decrypt_verbose(ciphertext):
    # Chain rule: p_i = (c_i - c_{i-1}) mod 23 (0 -> 23), with c_0 = 0
    ciphertext = ciphertext.upper()
    decrypted = ""
    table = []

    prev_cipher_pos = 0
    for i, c in enumerate(ciphertext, start=1):
        if c not in alphabet:
            decrypted += c
            table.append([i, c])  # only two columns for non-letters
            continue

        cipher_pos = char_to_pos(c)
        plain_pos = (cipher_pos - prev_cipher_pos) % len(alphabet)
        if plain_pos == 0:
            plain_pos = len(alphabet)
        plain_char = pos_to_char(plain_pos)

        decrypted += plain_char

        table.append([
            i,
            c,
            cipher_pos,
            plain_char,
            plain_pos,
            prev_cipher_pos,
            f"({cipher_pos} - {prev_cipher_pos}) mod {len(alphabet)} = {plain_pos}"
        ])

        prev_cipher_pos = cipher_pos

    return decrypted, table

def print_table(table):
    print("\nDecryption Table:")
    print("-" * 90)
    print(f"{'i':>3} | {'Cipher':^9} | {'cᵢ':^4} | {'Plain':^9} | {'pᵢ':^4} | {'cᵢ₋₁':^6} | {'Computation':<30}")
    print("-" * 90)
    for row in table:
        if len(row) == 7:
            i, cchar, cpos, pchar, ppos, cprev, calc = row
            print(f"{i:>3} | {cchar:^9} | {cpos:^4} | {pchar:^9} | {ppos:^4} | {cprev:^6} | {calc:<30}")
        elif len(row) == 2:
            i, cchar = row
            print(f"{i:>3} | {cchar:^9} | {'-':^4} | {'-':^9} | {'-':^4} | {'-':^6} | {'(not a letter)':<30}")
    print("-" * 90)

def apply_fixed_substitution(text, from_list, to_list):
    # Fixed 1:1 character substitution (frequency-based remapping)
    mapping = dict(zip(from_list, to_list))
    substituted = ''.join(mapping.get(c, c) for c in text)
    return substituted, mapping

if __name__ == "__main__":
    text = input("Enter ciphertext to decrypt (only letters A–Z, excluding J, U, W): ")
    decrypted, table = chain_decrypt_verbose(text)
    print(f"\nDecrypted text: {decrypted}")
    print_table(table)

    # Substitution: Voynich → Latin
    voynich_order = list("OEHYACDIKLRSTNQPMFGXBVZ")
    latin_order  = list("IEAUTSRNOMCLPDBQGVFHXYZ")
    substituted, mapping = apply_fixed_substitution(decrypted, voynich_order, latin_order)

    print("\nSubstituted (Voynich → Latin):")
    print(substituted)

    print("\nSubstitution Map:")
    for voy, lat in mapping.items():
        print(f"  {voy} → {lat}")
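
As a quick check, the two steps can also be called directly, bypassing the input prompt; the expected values in the comments were worked out by hand from the same rules and should match the "LDYIFEYNDUEO" example in the side note below:

Code:
# Direct call, without the input prompt (run in the same session):
decrypted, _ = chain_decrypt_verbose("shetcheodchs")
print(decrypted)    # expected: SNVOGEVINYEK
substituted, _ = apply_fixed_substitution(
    decrypted,
    list("OEHYACDIKLRSTNQPMFGXBVZ"),
    list("IEAUTSRNOMCLPDBQGVFHXYZ"),
)
print(substituted)  # expected: LDYIFEYNDUEO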

A side note:
"shetcheodchs" was converted to "LDYIFEYNDUEO" using the method described. ChatGPT hallucinates "Fide leo, deus unde" from it by rearranging, omitting and adding letters, and renders it as "Trust the lion - God is its origin". This is remarkable because the plants on the linked folio (sun and moon) could very well be connected with the lion in an alchemical interpretation. So you could easily fall for ChatGPT if you believe what you want to believe.
(21-07-2025, 11:30 AM)bi3mw Wrote: I asked in another thread if a made-up chain cipher is easy to decrypt. Now I have written a Python script that decrypts a given Voynich word using the same method. [...]

I think this encoding will increase the entropy.

As an experiment, could you try encoding some very simple, maybe even repetitive, English (or Latin) text using this method? I suppose it will become something like qtopihfdsjnvmglcxwtaksdbnkfhdyf - pure randomness and few obvious patterns.
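
For reference, here is a minimal sketch of what the corresponding encryption could look like. chain_encrypt is not part of the original script; it assumes the chain runs on the ciphertext letters, carries across word boundaries (non-letters pass through), and starts from c₀ = 0, simply inverting the decryption rule above:

Code:
alphabet = list("ABCDEFGHIKLMNOPQRSTVXYZ")  # 23 letters, no J, U, W

def chain_encrypt(plaintext):
    # Inverse of chain_decrypt_verbose: c_i = ((p_i + c_{i-1} - 1) mod 23) + 1
    prev = 0
    out = []
    for ch in plaintext.upper():
        if ch not in alphabet:
            out.append(ch)  # pass through; does not reset the chain
            continue
        p = alphabet.index(ch) + 1
        c = (p + prev - 1) % len(alphabet) + 1
        out.append(alphabet[c - 1])
        prev = c
    return "".join(out)
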
(21-07-2025, 11:52 AM)oshfdk Wrote: As an experiment, could you try encoding some very simple, maybe even repetitive, English (or Latin) text using this method? I suppose it will become something like qtopihfdsjnvmglcxwtaksdbnkfhdyf - pure randomness and few obvious patterns.

I am not sure if I have understood this correctly, but here is a Latin text excerpt and below it the corresponding encryption:

Capitulum primum De regulis sumptis ex parte elementorum nostro corpori occurrentium ab extra
supra Non debet esse terra cuius habitatio eligitur lutosa nec scenosa nec fetens nec limosa nec super quam virtus metalli sit superans nec sulphurea nec aluminosa nec in qua sint vel fuerint mortuorum cadauera nec malis herbis aut arboribus fecundata sed bonis puta vineis salicibus herbore pini similibus amicabilibus nostris corporibus Sunt enim que dam plante que multum inimicantur nobis sicut experientia docet de arbore nucum et de aliis multis arboribus vt persico populo arbore sambuco et ficu Terre igitur fecundate his arboribus non sunt bone habitationis supra Et similiter intelligatur de herbis Sunt enim quedam venenose herbe quibus terra fecundata non est conueniens habitationi
vt est elleborus esula solatrum mortale et similia Et per oppositum fecundata bonis herbis supra et odoriferis est conueniens habitationi nostre Amplius non debet esse terra cauernosa inter supra montes inclusa nec nimis montuosa Et terra melioris habitationis debet esse discooperta ab oriente a septentrione Nam illi duo venti sunt magis purificantes aerem illius habitationis Vnde et in mansionibus bonum est fenestras esse septentrionales et orientales non meridionales nec occidentales Et hoc verum est regulariter sed interdum contingit per accidens ventum septentrionalem esse deteriorem meridionali Possible enim est quod ventus transeat per aliquam partem septentrionis niuibus occupatam vel adustam vel minerosam vel lutosam vel corruptam aliquo modorum corruptionis supradictorum et sic ventus septentrionalis erit impurior et minis eligibilis Et similiter intelligitur de parte orientis Bonum est igitur interdum per accidens fenestras habitabilis terre seu spiracula esse a parte meridiei vel occidentis quamuis supra regulariter contrarium sit eligibile Vicinitas quoque aquarum dulcium currentium multarum et mundarum que in hyeme cito infrigidantur et in estate cito calefiunt Et similiter pratorum viridium non vmbratorum multis arboribus est res conueniens habitationi et adiutorium prebens

RBXMNMGFR NRHTSF TA EKYXQGK NMZTVLO TX RBFGM RMREKTVGLKX GRVXBM FQVQCGX HBTSYCHRSIHT DV BDEIR
VTPTD NZI YDVBC HLOT VBFKS MLBAD BKCRSCDSE KETHYZYC XVXHLT EKD GAFPBEN YDX EKLQBE OTN HYKVZH RYQ TSOTZ POYK QGLMLO BGHQLFV ZOP SRNSYGQT EKD GFAVSRXCL VBT DYXIZITYG QXP FP GFO RHRS AFA HGMQGQR EPTVTFKIV OYMVTAEN YDX IRMCF DINFVZ HGH QVNZDSLKN VBTSDRBCL OTI BMXMP LKLT BQBGXA DMGXPFYXA YDHALPV QGQG KAMCXMEDG PCRLTMCXMEDG QCFGLBE YINITZOGFI MLVX CMCO FEK ZHT PKSDEK BAF RQLMLY NYNAPIRCDCG QCTKN QGAZA FHDINDISTKS HSMRS HN XBSEIO ZYQPC HI YD MGXMP CBVXMP ZDVGLBSRV CD ZEIMCVG CNIHCN XBSEIO RBNFEYI OP YNGF GMQVB QETVTZ GMFEODMNS QGK SYPBFVNMP ALV ZYHI BMXC AIBQRBCRDNDG KIEIR YZ CRETOEFLP FPQXQLBOYZYC QX TAEXMP SRCD ISIV MLQFOB HNYDNZCH FLPHN EDSLKN OTZDM TASRCQABK TFP VZA SEONSDSZIM KSLBCLMCNYN
TV BEF LFAFYINMP VZYRB EPKSTZYK XHMNXQX CD GXIZSIR YZ TAE PLGRVLMLY FLEDNCLMV NZIZC AFKCRV ZYSYG MN ZNZDSBGLBE KNO HSDCHRHNYB ZHAPQABQCMC MYBCGM VHDYNMP ALV KPHNO TYBG HNRXF ZHGMQBMPZ OZAFK NMHMV HSDEKN DNGBADM XCV FVHYB NZIKITYG MN OTZDM ZEZOAETY VEXMNXYNZIZC QXOTV BEHN CRVOALGMQRB KC NRHNYZE N QXRSZIKOEPAF PZL BVPF TSE LQBCR VTEF RBOEH DCGXETNXGHNQ AFKPC RMGXVZ XFYNOYZOAKAD KTIO TV LV HQBETFPFYXA RDNMZ EHI QXGMPQVEH NQTA DIEFLVXBQCMVPVZ EF QVLQBCLFLO ZKT GMQGVLXGPKPS DIC NGAPEKTVEZEH NO MYQ YDHGS ZCD HNBATDHYZEI MRG XGHNRGFR LXGHYHVLM HNR BTNDRYHL RYHIHT YDZAFPQVLXGPKPC HLOT IOPVAPBFLY KPTKZOAKSND ZKNQGZSZ EOEQ XAB RQCQ YDNONQ RXFPSZHI EKO YRHZYGS OYCDIV ZEABGQRXMYHYB LBAPHGK VOHGCLMVH OTO YMLOPZL RYR ETEKOADMZ FLF AZALOYK QXQ KVAEDZAIV EZOFEP CNCNRQD XHMQPLMCNYNQ TSOSCQGABMQPC HI MCV CHRSRV ZEABGQRXMYHQLBE KOEF VHDCGXHM RS FVFVZ EZOCRKATKN ST YNAPKABGL BLMRMGXKABAE SZ TDHIO AETAKLBE XHRQD IMN DQGHGL BLMRXLKX RYC LEYNCHRV DISZCDHQT RBSIKSLBVLO PVAEK NSR VQGLTNMGP VZCH Q MVABG SZDSHYDS AFA LEYNCHRSIM DCLYXMP SRNRB FLZYRBFVXCG ALVXBKOEDP SIK PKANDVLFL RHBQBQRBE VTFXVB KBAINMZ NMGAPOB TSYCHRSIHT GFABKONA FG SRCQAEDP GFL BL IFLYD XMNZ OZGLBOESCMNMQ XY NY DGHQRY QGHS MVPVDSRCD IK NDPFAPQXB XBKLXBAM SINDRHGS DOZ FRKOYZKONA MLFGXA INFQVLDCF LOP TAD XHRQXGXCMP NXOEFOPFQBQ XY GVLKLXBQPC YCHAFPS

There are definitely word repetitions (as I said, the mapping is rudimentary). It is striking that, for a ciphertext word to repeat, the last letter of the preceding word must also be identical.

[attachment=11052]

The entropy actually increases a bit.
Text entropy calculation (online tool), in bits per character:
Before encryption: 4.05089
After encryption: 4.47790
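
For reference, the same single-character entropy can be computed without the online tool; a minimal sketch (letters only, case-folded, which matters for the question below):

Code:
from collections import Counter
from math import log2

def char_entropy(text):
    # Single-character Shannon entropy in bits per character,
    # computed over letters only, after case-folding
    letters = [c for c in text.upper() if c.isalpha()]
    n = len(letters)
    return -sum(k / n * log2(k / n) for k in Counter(letters).values())
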
Was the original entropy computation case-sensitive? The text should be converted to uppercase or lowercase before the computation.

As for the repetitions, they are expected even in a random text, just much less frequently than in a natural plaintext. Here, as you said, whenever the preceding ciphertext letter is the same, the same plaintext word encodes to the same ciphertext.
(21-07-2025, 03:56 PM)oshfdk Wrote: Was the original entropy computation case-sensitive? The text should be converted to uppercase or lowercase before the computation.

After converting all letters to capital letters in the original text, the entropy value is 3.9962.

I meant that a ciphertext word can only repeat if the last letter of the previous word is also identical. This property should therefore, in theory, also be found in the VMS.
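
This can be illustrated with a small self-contained sketch; encrypt_word and the toy word "AMOR" are hypothetical, but the rule is the same chain assumed above. The cipher word depends only on the plaintext word and the chain state (the previous cipher letter) in front of it:

Code:
alphabet = list("ABCDEFGHIKLMNOPQRSTVXYZ")  # 23 letters

def encrypt_word(word, prev):
    # Encrypt one word starting from a given chain state
    # (1-based position of the previous cipher letter; 0 at text start)
    out = []
    for ch in word:
        p = alphabet.index(ch) + 1
        c = (p + prev - 1) % len(alphabet) + 1
        out.append(alphabet[c - 1])
        prev = c
    return "".join(out), prev

# Equal states give equal cipher words; different states do not:
for state in (0, 5, 5, 10):
    print(state, encrypt_word("AMOR", state)[0])
# -> 0 ANDX / 5 FSIC / 5 FSIC / 10 LZOH
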
(21-07-2025, 04:19 PM)bi3mw Wrote: After converting all letters to capital letters in the original text, the entropy value is 3.9962.

Also the increase from 3.9962 to 4.4779 is not minor; it is almost as large as it can get. The maximum possible character entropy for an alphabet of 23 characters is log2(23) ≈ 4.5236 bits. That value would mean the same probability of any character coming after any other character (edit: it's single-character probabilities; the wording suggested conditional probabilities); it cannot get more random than that. 4.4779 is just ~0.05 bits from the absolute limit. To see the difference it is better to talk about the redundancy (if I remember correctly what it is called), which for a 23-character alphabet is 4.5236 minus the entropy. The redundancy of the Latin plaintext is 4.5236 - 3.9962 = 0.5274 bits per character. After the encryption it is 4.5236 - 4.4779 = 0.0457, a decrease of more than an order of magnitude.
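
The same arithmetic, for reference:

Code:
from math import log2

H_max = log2(23)                 # ~4.5236 bits: uniform 23-letter limit
print(round(H_max - 3.9962, 4))  # plaintext redundancy: 0.5274
print(round(H_max - 4.4779, 4))  # ciphertext redundancy: 0.0457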

(21-07-2025, 04:19 PM)bi3mw Wrote: I meant that a ciphertext word can only repeat if the last letter of the previous word is also identical.

I don't think it is only when the last letter is identical: two different words can also produce the same ciphertext starting from two different ciphertext letters, if their letter values are the same up to some constant shift. But this situation will probably be much rarer than a simply repeated plaintext word.

(21-07-2025, 04:19 PM)bi3mw Wrote: This property should therefore, in theory, also be found in the VMS.

I'm not sure how to evaluate this encoding in the context of the Voynich Manuscript. For starters, the character frequency statistics of the manuscript are all wrong for this method: it produces a text in which all characters appear with roughly the same probability, with no very frequent or very rare characters. So, for this encoding to work (edit: for Voynichese), the basic element would have to be some combination of glyphs rather than a single glyph.
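
This near-uniformity is easy to verify on the encrypted Latin sample above; a quick count, assuming the ciphertext has been read into a string (the variable name ciphertext is hypothetical):

Code:
from collections import Counter

counts = Counter(c for c in ciphertext if c.isalpha())
print(counts.most_common(3))      # most frequent letters
print(counts.most_common()[-3:])  # least frequent; counts should be comparable
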
(21-07-2025, 04:19 PM)bi3mw Wrote: I meant that a ciphertext word can only repeat if the last letter of the previous word is also identical. This property should therefore, in theory, also be found in the VMS.

In the VMS, this is much less common than a random last letter in the word before the hit (for words occurring more than once).
[attachment=11057]

It should also be noted that by far the most common last letter in the previous word is a “y”.
(21-07-2025, 04:51 PM)oshfdk Wrote: Also the increase from 3.9962 to 4.4779 is not minor; it is almost as large as it can get. The maximum possible character entropy for an alphabet of 23 characters is log2(23) ≈ 4.5236 bits. That value would mean the same probability of any character coming after any other character; it cannot get more random than that. 4.4779 is just ~0.05 bits from the absolute limit.

I presume that these values from bi3mw represent the single character entropy, not the conditional character entropy. In any case, it is correct that this change is significant.
Although handy, I think we should really make an effort not to hold character entropy as the sole go-to metric for the VMS. With verbose encoding, conditional character entropy can be driven (almost arbitrarily?) down to VMS levels, and n-gram frequency plots can be skewed. And that is not even accounting for null characters, which can skew and mask the stats even more.

Here are some rough stats with bi3mw's ciphertext and a verbose (reversible) remapping of characters to keyed numerals, like "A" → "V1", "B" → "V2", etc. I am disregarding all word breaks and line breaks and remapping n to i for demonstration purposes.
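
A sketch of what such a remapping could look like; the actual keyed scheme behind the attached stats is not given, so verbose_remap and its fixed table are only illustrative:

Code:
def verbose_remap(text):
    # Reversible verbose remapping: "A" -> "V1", "B" -> "V2", ... over the
    # 23-letter alphabet; a real scheme would key/permute the numbering
    table = {c: f"V{i + 1}" for i, c in enumerate("ABCDEFGHIKLMNOPQRSTVXYZ")}
    return "".join(table.get(c, "") for c in text.upper())

print(verbose_remap("ABI"))  # V1V2V9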

[attachment=11059]
(21-07-2025, 11:27 PM)ReneZ Wrote: I presume that these values from bi3mw represent the single character entropy, not the conditional character entropy.

Yes, I think my explanation was confusing; it is just about the probabilities of any character appearing regardless of the context. I'll update the post.