14-08-2025, 04:47 PM
Addendum to the earlier post:
You can also create a heat map in which the labels are displayed in Voynichese. This requires that the font (usually “eva1.ttf”) be installed on your PC. The path to the font must be adjusted in the following line of the Python script (line 59):
custom_font_path = "/home/me/.local/share/fonts/eva1.ttf"
The call is made as already described, where 25 is the adjustable number of top prefixes and suffixes (N) to plot:
python heatmap_prefix_suffix.py parsed.txt 25
Code:
#!/usr/bin/env python3
import sys
from collections import Counter
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import font_manager as fm


def extract_prefix_suffix(word):
    parts = word.split('/')
    if len(parts) < 3:
        return [], [], None
    stem_index = len(parts) // 2
    prefix = parts[:stem_index]
    suffix = parts[stem_index + 1:]
    return prefix, suffix, parts[stem_index]


def main():
    if len(sys.argv) < 2:
        print("Usage: python heatmap_prefix_suffix.py output_segmented.txt [N]")
        sys.exit(1)

    filename = sys.argv[1]
    top_n = 20  # default number of top prefixes and suffixes
    if len(sys.argv) >= 3:
        try:
            top_n = int(sys.argv[2])
        except ValueError:
            print("Parameter N must be an integer.")
            sys.exit(1)

    prefix_suffix_counts = Counter()
    with open(filename, 'r', encoding='utf-8') as f:
        for line in f:
            words = line.strip().split()
            for w in words:
                prefix, suffix, stem = extract_prefix_suffix(w)
                for pre in prefix:
                    for suf in suffix:
                        prefix_suffix_counts[(pre, suf)] += 1

    if not prefix_suffix_counts:
        print("No prefix-suffix combinations found.")
        sys.exit(1)

    data = [{"prefix": pre, "suffix": suf, "count": count}
            for (pre, suf), count in prefix_suffix_counts.items()]
    df = pd.DataFrame(data)

    # Select top N prefixes and suffixes by total count
    top_prefixes = df.groupby('prefix')['count'].sum().nlargest(top_n).index
    top_suffixes = df.groupby('suffix')['count'].sum().nlargest(top_n).index
    pivot = df.pivot(index='prefix', columns='suffix', values='count').fillna(0)
    pivot_top = pivot.loc[top_prefixes, top_suffixes]

    # Load the local EVA font
    custom_font_path = "/home/me/.local/share/fonts/eva1.ttf"
    custom_font = fm.FontProperties(fname=custom_font_path, size=12)

    # Plot
    plt.figure(figsize=(18, 12))
    ax = sns.heatmap(
        pivot_top,
        annot=True,
        fmt=".0f",
        cmap="YlGnBu",
        cbar_kws={"shrink": 0.5}
    )
    plt.title(f"Top {top_n} Prefix-Suffix Combinations")
    plt.xlabel("Suffix")
    plt.ylabel("Prefix")

    # Axis tick labels rendered with the local font
    ax.set_xticklabels(ax.get_xticklabels(), fontproperties=custom_font, rotation=45, ha='right')
    ax.set_yticklabels(ax.get_yticklabels(), fontproperties=custom_font, rotation=0)
    plt.subplots_adjust(bottom=0.2)
    plt.show()


if __name__ == "__main__":
    main()
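For orientation: the script expects each word in the parsed file to be a slash-separated segmentation with at least three parts; the middle part is taken as the stem, everything before it as prefixes and everything after it as suffixes. A minimal check with a made-up word (the actual segmentation in the attached files may differ):
Code:
# "qo/k/edy" is a made-up example word, not necessarily how the attached files segment it.
word = "qo/k/edy"
parts = word.split('/')
stem_index = len(parts) // 2
print(parts[:stem_index], parts[stem_index], parts[stem_index + 1:])
# -> ['qo'] k ['edy']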
[attachment=11259]
[attachment=11260]
Here are the text files for parsing:
[attachment=11265]
[attachment=11266]
05-09-2025, 10:08 PM
About the scripts posted by Magnesium in the other thread:
I tried naibbe.py and decrypt_naibbe.py on a Latin text (1000 short lines) with default settings: no problem.

Despite this setting in naibbe.py:
UNAMBIGUOUS=True #True means bigram token generation avoids accidentally creating unigram word types. Strongly recommend True
I had an ambiguity at decryption every few lines, for example: ep(ma|it)aphium
20250724 Naibbe Cipher Paper Wrote:2.7 Minimizing ambiguities during encryption and decryption
We can also preemptively eliminate ambiguity at the encryption stage. If we happen to encrypt a plaintext bigram as a word type reserved for unigram use, we can simply re-encrypt the bigram (by redrawing cards) so that it no longer matches a unigram word type.
naibbe.py Wrote:Total ambiguity retries due to prefix+suffix collisions with unigrams: 798
No issue, but I'm just wondering: if these retries do not eliminate ambiguities completely, should there be more retries?
After 4 attempts I got a non-ambiguous ciphertext for "epitaphium", so it is possible: chedy dchdy qokaiin yte shy qokedy sheedy qokar.
05-09-2025, 10:52 PM
(05-09-2025, 10:08 PM)nablator Wrote: I tried naibbe.py and decrypt_naibbe.py on a Latin text (1000 short lines) with default settings: no problem.
Despite this setting in naibbe.py:
UNAMBIGUOUS=True #True means bigram token generation avoids accidentally creating unigram word types. Strongly recommend True
I had an ambiguity at decryption every few lines, for example: ep(ma|it)aphium
20250724 Naibbe Cipher Paper Wrote:2.7 Minimizing ambiguities during encryption and decryption
We can also preemptively eliminate ambiguity at the encryption stage. If we happen to encrypt a plaintext bigram as a word type reserved for unigram use, we can simply re-encrypt the bigram (by redrawing cards) so that it no longer matches a unigram word type.
naibbe.py Wrote:Total ambiguity retries due to prefix+suffix collisions with unigrams: 798
No issue, but I'm just wondering: if these retries do not eliminate ambiguities completely, should there be more retries?
After 4 attempts I got a non-ambiguous ciphertext for "epitaphium", so it is possible: chedy dchdy qokaiin yte shy qokedy sheedy qokar.
In principle yes, there should be a loop that checks bigram-bigram ambiguity. I haven’t gotten there yet; I am working on it. As an interim solution I included the ambiguous decryption formatting under the UNAMBIGUOUS condition so that bigram-bigram ambiguity is at least visible.
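Roughly, such a loop would look like the sketch below. This is not the actual naibbe.py code: encrypt_bigram(), decode_candidates(), and unigram_word_types are placeholders for the real card draws, token parsing, and reserved word-type list, and MAX_RETRIES is an arbitrary cap.
Code:
# Sketch only: the helper functions and the unigram word-type set are placeholders,
# not the actual naibbe.py implementation.
MAX_RETRIES = 100  # arbitrary cap on redraws


def encrypt_bigram_unambiguously(bigram, encrypt_bigram, decode_candidates, unigram_word_types):
    for _ in range(MAX_RETRIES):
        token = encrypt_bigram(bigram)        # redraw cards for this bigram
        if token in unigram_word_types:
            continue                          # collides with a unigram word type: redraw
        if len(decode_candidates(token)) > 1:
            continue                          # bigram-bigram ambiguity: redraw
        return token
    raise RuntimeError(f"no unambiguous word found for {bigram!r} after {MAX_RETRIES} redraws")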
06-09-2025, 01:33 AM
(05-09-2025, 10:08 PM)nablator Wrote: About the scripts posted by Magnesium in the other thread:
No issue, but I'm just wondering: if these retries do not eliminate ambiguities completely, should there be more retries?
After 4 attempts I got a non-ambiguous ciphertext for "epitaphium", so it is possible: chedy dchdy qokaiin yte shy qokedy sheedy qokar.
I have added "naibbe_v2.py" to the Dropbox folder, along with an associated Jupyter notebook. Its version of the unambiguous loop should take care of bigram-bigram ambiguity without too much drag on performance.