Token-level rules applied to EVA transcripts – reproducible code example - Printable Version
+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Theories & Solutions (https://www.voynich.ninja/forum-58.html)
+---- Forum: ChatGPTrash (https://www.voynich.ninja/forum-59.html)
+---- Thread: Token-level rules applied to EVA transcripts – reproducible code example (/thread-4946.html)
Token-level rules applied to EVA transcripts – reproducible code example - fran9262 - 24-09-2025

Hi all,

I’m still learning how best to present this work here. I know the forum has seen plenty of “AI slop,” so I want to make clear up front: this is not an AI translation.

What I’m sharing below is a small demo showing why naïve code breaks completely on Voynich EVA text, and how a very simple rule-based parser (prefix/suffix/infix checks) produces consistent partial results across EVA lines. It’s not perfect: many tokens still come out as “[?]”. But that’s part of the point: it’s mechanical and testable, not free-form invention.

My goal is to invite feedback on whether this kind of structured, token-level approach looks like a credible path forward, and if so, how to make it stronger.

1) Naïve approach (fails)

    # Naïve dictionary: expects exact token matches → fails on real EVA strings
    rules = {
        "chedy": "herb",
        "qokchdy": "root extract",
        "ody": "base matter",
        "she": "fire/calcination",
        "dol": "water cycle",
        "oram": "joint/limb",
    }

    eva_line = "ychedy shetshdy qotar okedy qokal saiin ol karar odeeed"
    decoded = [rules.get(tok, "[?]") for tok in eva_line.split()]
    print(" ".join(decoded))

Expected output

    [?] [?] [?] [?] [?] [?] [?] [?] [?]

Why it breaks: EVA tokens are variable (prefixes, suffixes, infixes). Exact-match lookup doesn’t work.
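One way to make this failure concrete is to count the exact matches and to list the dictionary keys that occur inside a token without being equal to it. The sketch below is illustrative only: it reuses the rules dictionary and eva_line from the snippet above, while the names tokens, hits, and near_misses are introduced here and are not part of the original demo.

```python
# Illustrative sketch (not part of the original post): quantify how the
# exact-match dictionary fails on the sample line.
rules = {
    "chedy": "herb",
    "qokchdy": "root extract",
    "ody": "base matter",
    "she": "fire/calcination",
    "dol": "water cycle",
    "oram": "joint/limb",
}
eva_line = "ychedy shetshdy qotar okedy qokal saiin ol karar odeeed"

tokens = eva_line.split()

# Exact matches: tokens that are themselves dictionary keys.
hits = [tok for tok in tokens if tok in rules]
print(f"exact matches: {len(hits)}/{len(tokens)}")  # exact matches: 0/9

# Near misses: keys that occur inside a token as a proper substring.
near_misses = {}
for tok in tokens:
    inside = [key for key in rules if key in tok and key != tok]
    if inside:
        near_misses[tok] = inside
print(near_misses)  # {'ychedy': ['chedy'], 'shetshdy': ['she']}
```

Two of the nine tokens contain a key as a substring, which is the gap the prefix/suffix/infix rules in the next section try to close.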
2) Rule-based parsing (prefix/suffix/infix)

    # Minimal, reproducible rule-based decoder using prefix/suffix/infix tests
    def decode_token(t):
        # suffix rules
        if t.endswith("ody"):
            return "base matter"
        if t.endswith("ram"):
            return "joint/limb"
        if t.endswith("dy") and t.startswith("qokc"):
            return "root extract"  # qokchdy / qokchedy variants
        # prefix rules
        if t.startswith("che"):
            return "herb/plant"
        if t.startswith("she"):
            return "fire/calcination"
        if t.startswith("oked"):
            return "preparation/infusion"
        if t.startswith("qok"):
            return "boil/infuse (qok- class)"
        if t.startswith("kar"):
            return "vessel/container"
        # infix rule
        if "dol" in t or "qodal" in t:
            return "water/cycle/liquid"
        # bridging/repetition token often seen
        if t == "saiin":
            return "again/repeat"
        return "[?]"

    eva_line = "ychedy shetshdy qotar okedy qokal saiin ol karar odeeed"
    decoded = [decode_token(tok) for tok in eva_line.split()]
    print(" ".join(decoded))

Expected output (example)

    [?] fire/calcination [?] preparation/infusion boil/infuse (qok- class) again/repeat [?] vessel/container [?]

Point: The same text that the naïve code couldn’t read now yields mechanical, rule-driven partial readings: no “AI translation,” just explicit token logic.

3) Cross-folio consistency check (multiple EVA lines)

    # Two additional EVA lines (from f85r1 examples used above)
    eva_lines = [
        "kchedar yteol okchdy qokedy otor odor or chedy otechdy dal cphedy",
        "oees aiin olkeeody ors cheey qokchdy qotol okar otar otchy dkam",
    ]

    for i, line in enumerate(eva_lines, 1):
        decoded = [decode_token(tok) for tok in line.split()]
        print(f"Line {i}:", line)
        print("Decoded :", " | ".join(decoded), "\n")

Expected output (example)

    Line 1: kchedar yteol okchdy qokedy otor odor or chedy otechdy dal cphedy
    Decoded : [?] | [?] | [?] | boil/infuse (qok- class) | [?] | [?] | [?] | herb/plant | [?] | [?] | [?]

    Line 2: oees aiin olkeeody ors cheey qokchdy qotol okar otar otchy dkam
    Decoded : [?] | [?] | base matter | [?] | herb/plant | root extract | [?] | [?] | [?] | [?] | [?]

Points this demonstrates:
• Consistency: tokens like chedy → herb/plant, qokchdy → root extract, …ody → base matter are read the same way across lines.
• Reproducibility: anyone can run this and see the same partial outputs.
• Non-hallucinatory: when no rule matches, the code says “[?]” instead of inventing prose.

I know this is only a partial framework; there are still many unsolved tokens. That’s intentional, since I don’t want to overfit or make guesses where the rules don’t yet apply.

If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it. I’m aiming for something reproducible and mechanical, not “mystical translation.”

Best Regards,
Francis

RE: Token-level rules applied to EVA transcripts – reproducible code example - oshfdk - 24-09-2025

(24-09-2025, 07:15 PM) fran9262 Wrote: If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it.

Hi! Below is an image containing various labels that appear next to images in the MS. Using your rules, could you explain the meaning of the labels and how they relate to the images?

RE: Token-level rules applied to EVA transcripts – reproducible code example - fran9262 - 24-09-2025

Top image, top left side,
... qokedy chedy qokchdy okedy ...

Let me break it down carefully:
qokedy → qok- (boil/infuse) + -edy (preparation). Expansion: boiled preparation
chedy → che- (herb/plant). Expansion: herb/plant element
qokchdy → qok- (boil/infuse) + -chdy (root extract). Expansion: infused root extract
okedy → ok- (infuse) + -edy (prepared). Expansion: prepared infusion

Plain English Rendering: “Boiled preparation – plant/herb – infused root extract – prepared infusion.”

Fit with Imagery:
RE: Token-level rules applied to EVA transcripts – reproducible code example - fran9262 - 24-09-2025

(24-09-2025, 07:31 PM) oshfdk Wrote: (24-09-2025, 07:15 PM) fran9262 Wrote: If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it.

Second image down to the right,

Voynich Labels (EVA transcription): olal qotedy olchedy qokeody

Rule-Based Decoding:
olal → ol- (vessel/container) + -al (cycle/liquid marker) → liquid vessel / containing channel.
qotedy → qo- (process) + t/ot (infuse/prepare) + -edy (preparation) → prepared infusion.
olchedy → ol- (vessel/container) + che- (herb/plant) + -dy (unit) → vessel of plant material.
qokeody → qok- (boil/infuse) + -dy (substance) → infused extract / boiled substance.

Plain English rendering of the sequence: “Vessel/containment – prepared infusion – herbal vessel – infused extract.”

Why it fits the imagery: Each woman is holding a star (marker of process/stage). The labels read like a short recipe cycle: starting with a vessel, moving to a prepared infusion, then a vessel with herbs, and ending in an infused extract.

This isn’t a full translation; it’s a step-by-step application of substitution rules, leaving gaps where the rules don’t reach yet. But it shows how the labels and imagery reinforce each other in a structured way.

(24-09-2025, 07:31 PM) oshfdk Wrote: (24-09-2025, 07:15 PM) fran9262 Wrote: If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it.
Third image, second row right side,

Voynich Labels (EVA transcription, outer ring, clockwise): olchedy olkeeody olchedy olkeeody olkeeody

Rule-Based Decoding (outer ring):
olchedy → ol- (vessel/container) + che- (herb/plant) + -dy (unit) → vessel of plant material.
olkeeody → ol- (vessel/container) + kee- (cycle/liquid variant) + -ody (base matter) → vessel of liquid matter / cycle container.

So the outer ring repeats variations of “herbal vessel” and “liquid vessel.”

Inner Band (longer text, partial transcription): dair olkeeody kaiin dain otol …
olkeeody again = vessel/container of liquid/base matter.
dain (common Voynichese word, appears often as marker) = cycle/again.
otol contains ol = vessel/container, possibly a variant on “contained liquid.”
Expansion (illustrative): “Cycle of vessels/liquid containers repeated again …”

Plain English Rendering (overall): “Vessel with herbs – vessel with liquid – vessel with herbs – vessel with liquid … cycle of vessels repeated.”

Why it fits the imagery: The circular layout itself visually emphasizes repetition and cycling of containers. The text echoes that: alternating “herbal vessel” and “liquid vessel” around the ring, with the inner band reinforcing the idea of cycles/again. It’s a diagrammatic recipe of alternating vessels rather than a narrative sentence.

RE: Token-level rules applied to EVA transcripts – reproducible code example - fran9262 - 24-09-2025

(24-09-2025, 07:31 PM) oshfdk Wrote: (24-09-2025, 07:15 PM) fran9262 Wrote: If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it.

4th picture, entire 3rd row.

Voynich Labels (EVA transcription, left → right): olkarar olchedy olkeeody olkeeody olchedy olkeeody olkarar olkeeody olkarar olkarar olkeeody

Rule-Based Decoding:
olkarar → ol- (vessel/container) + kar- (vessel/tub) → large vessel / tub.
olchedy → ol- (vessel/container) + che- (herb/plant) + -dy (unit) → vessel of plant material.
olkeeody → ol- (vessel/container) + kee- (liquid/cycle variant) + -ody (base matter) → liquid vessel / container of matter.

Plain English Rendering (sequence): “Tub – herbal vessel – liquid vessel – liquid vessel – herbal vessel – liquid vessel – tub – liquid vessel – tub – tub – liquid vessel.”

Why it fits the imagery: The women are all drawn immersed in a large communal green bath, segmented under canopy arches. The repeating alternation of “herbal vessel” and “liquid vessel” makes sense in this context: each section of the bath is labeled as containing either plant matter or liquid base. olkarar (tub) appears consistently where figures are shown in larger enclosed areas. The repetition reinforces that this is a diagrammatic schema of alternating vessels, not random words.

(24-09-2025, 07:31 PM) oshfdk Wrote: (24-09-2025, 07:15 PM) fran9262 Wrote: If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it.

5th picture, entire 4th row,

Voynich Labels (EVA transcription, left → right):
othedy (left figure with spout)
olkar (center seated figure)
daiin otol (written near the rainbow arc)
olchedy (rightmost)

Rule-Based Decoding:
othedy → prefix ot- (infuse/prepare) + he- (flow/spout marker) + -dy (substance) → flowing infusion / spouted liquid.
olkar → ol- (vessel/container) + kar- (tub/container) → vessel or tub.
daiin otol → daiin (again/cycle) + otol (infused vessel, liquid container) → cycle of liquid vessel.
olchedy → ol- (vessel/container) + che- (herb/plant) + -dy (unit) → herbal vessel.

Plain English Rendering (sequence): “Spouted infusion – vessel/tub – cycle of liquid vessel – herbal vessel.”

Why it fits the imagery:
Left figure is literally manipulating a spout → ties to flowing infusion.
Center seated figure on a mound/tub → matches vessel/tub.
Rainbow arc → labeled cycle of liquid vessel → visually matches cyclical flow.
Right figure → marked as herbal vessel → consistent with other herbal bath motifs.

RE: Token-level rules applied to EVA transcripts – reproducible code example - fran9262 - 24-09-2025

(24-09-2025, 07:31 PM) oshfdk Wrote: (24-09-2025, 07:15 PM) fran9262 Wrote: If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it.

6th image, 5th row left side,

Voynich Labels (EVA transcription, left → right, along the arch and central text):
olchedy (left woman in red tub)
olkeeody (on the arch)
qokedy qokchdy (within the central text line)
daiin otol (repeated in the arch and central band)
olkarar (right woman’s vessel)

Rule-Based Decoding:
olchedy → ol- (vessel/container) + che- (herb/plant) + -dy (unit) → herbal vessel.
olkeeody → ol- (vessel/container) + kee- (liquid/cycle variant) + -ody (base matter) → liquid vessel / container of matter.
qokedy → qok- (boil/infuse) + -edy (prepared) → boiled preparation.
qokchdy → qok- (boil/infuse) + -chdy (root extract) → infused root extract.
daiin otol → daiin (again/cycle) + otol (infused vessel/liquid container) → cycle of liquid vessels.
olkarar → ol- (vessel/container) + kar- (tub/container) → tub/container.
Plain English Rendering (sequence): “Herbal vessel – liquid vessel – boiled preparation – infused root extract – cycle of liquid vessels – tub/container.”

Why it fits the imagery: The left woman is immersed in a red tub, labeled “herbal vessel,” which matches the plant-based bath context. The arch of water is labeled repeatedly with “liquid vessel” and “cycle of vessels,” visually reinforcing the flowing cycle. The central text adds detail: boiling and infusion steps, consistent with the process imagery. The right woman’s tub is marked “tub/container,” grounding the final stage.

RE: Token-level rules applied to EVA transcripts – reproducible code example - oshfdk - 24-09-2025

(24-09-2025, 07:48 PM) fran9262 Wrote: Top image, top left side, ... qokedy chedy qokchdy okedy ...

Thank you for your time, I've seen enough.

RE: Token-level rules applied to EVA transcripts – reproducible code example - fran9262 - 24-09-2025

(24-09-2025, 07:31 PM) oshfdk Wrote: (24-09-2025, 07:15 PM) fran9262 Wrote: If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it.

7th picture, 5th row right side ("last picture here"),

Voynich Labels (EVA transcription, top and along the arch, left → right):
“Herbal vessel – liquid vessel – boiled preparation – infused root extract – cycle of liquid vessels – tub/container.” Why it fits the imagery:
RE: Token-level rules applied to EVA transcripts – reproducible code example - fran9262 - 24-09-2025

(24-09-2025, 08:02 PM) fran9262 Wrote: (24-09-2025, 07:31 PM) oshfdk Wrote: (24-09-2025, 07:15 PM) fran9262 Wrote: If you see flaws in the rules, or if you think better tests would expose the weaknesses (or strengths) of this approach, I’d really like to hear it.

Thank you for taking the time to look at this. I appreciate the opportunity to have the method tested against real examples.

For anyone else following along, I’d welcome thoughts specifically on the code framework (prefix/suffix parsing, consistency across labels, substitution steps). My aim is to make this reproducible, not just a list of claims, so critique on where the rules break or don’t hold up is exactly what I’m looking for.

RE: Token-level rules applied to EVA transcripts – reproducible code example - Mauro - 24-09-2025

If I understood correctly, your proposal is that the VMS is a kind of [link]. Which it might well be!

But the problem with your approach (and with innumerable others) to the 'translation' is that it's way too easy to assign some arbitrary meaning to some tokens and then build a whole theory upon that. In the last few months we have had here at least a dozen different solutions, ranging from Slavic languages to Celtic languages to musical notation to alchemical procedures. All of them were internally 'coherent', but did any of them hit the mark? No: they were all speculations without evidence (albeit coherent speculations).
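The concern about arbitrary meanings can be checked mechanically. The sketch below is an illustration under stated assumptions, not something anyone in the thread ran: it re-expresses the conditions of decode_token from the first post as (predicate, gloss) pairs, shuffles the glosses among the unchanged predicates, and compares how many tokens receive a reading on the three EVA lines quoted in the first post. The names RULES, decode, matched, and shuffled, as well as the seed value, are hypothetical.

```python
import random

# Conditions copied from decode_token in the first post, in the same order.
RULES = [
    (lambda t: t.endswith("ody"), "base matter"),
    (lambda t: t.endswith("ram"), "joint/limb"),
    (lambda t: t.endswith("dy") and t.startswith("qokc"), "root extract"),
    (lambda t: t.startswith("che"), "herb/plant"),
    (lambda t: t.startswith("she"), "fire/calcination"),
    (lambda t: t.startswith("oked"), "preparation/infusion"),
    (lambda t: t.startswith("qok"), "boil/infuse (qok- class)"),
    (lambda t: t.startswith("kar"), "vessel/container"),
    (lambda t: "dol" in t or "qodal" in t, "water/cycle/liquid"),
    (lambda t: t == "saiin", "again/repeat"),
]

def decode(token, rules):
    """Return the gloss of the first rule whose predicate matches."""
    for predicate, gloss in rules:
        if predicate(token):
            return gloss
    return "[?]"

def matched(lines, rules):
    """Count tokens that receive any reading other than '[?]'."""
    return sum(decode(t, rules) != "[?]" for line in lines for t in line.split())

lines = [
    "ychedy shetshdy qotar okedy qokal saiin ol karar odeeed",
    "kchedar yteol okchdy qokedy otor odor or chedy otechdy dal cphedy",
    "oees aiin olkeeody ors cheey qokchdy qotol okar otar otchy dkam",
]
total = sum(len(line.split()) for line in lines)

# Control: keep every predicate, reassign the glosses at random.
random.seed(0)  # arbitrary seed, only to make the shuffle repeatable
glosses = [gloss for _, gloss in RULES]
random.shuffle(glosses)
shuffled = [(predicate, gloss) for (predicate, _), gloss in zip(RULES, glosses)]

print(f"original glosses: {matched(lines, RULES)}/{total}")    # 10/31
print(f"shuffled glosses: {matched(lines, shuffled)}/{total}")  # 10/31
```

Because the predicates are unchanged, the number of decoded tokens and the cross-line consistency are identical for the original and the shuffled glosses, so neither measure can by itself support any particular assignment of meanings.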