This is not a claim, but a question about a work in progress.
Is this a legitimate strategy for finding out something about the text, or is it just not helpful at all?
Algo:
- Cut 10 random pages out of the full text (Voynich).
- Train a base LLM on the remaining text (10 pages missing).
- Give one of the cut-out pages to the LLM with one EVA token missing. The LLM guesses a token for the missing one.
- Repeat this 20 times with different tokens missing.
- Do the same with a random text.
- Do the same with a generated text (from one of the supposed solutions, for example).
- Compare the probabilities.
The result will show whether the LLM can predict the real text better than the other texts.
The problem might be that the result says nothing about the text, only about the model. But it might show that the text is not random, for example.
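To make the steps above concrete, here is a minimal sketch of the same hold-out test. It is only an illustration under loud assumptions: a tiny add-one-smoothed bigram model stands in for the LLM, the token stream is toy data rather than real Voynich text, and all function names are made up for this example.

```python
import math
import random
from collections import Counter

def train_bigram(tokens):
    """Count how often each token occurs as a left context, and each bigram."""
    context_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts

def log_prob(prev, tok, context_counts, bigram_counts, vocab_size):
    """Add-one smoothed log P(tok | prev)."""
    return math.log((bigram_counts[(prev, tok)] + 1)
                    / (context_counts[prev] + vocab_size))

def avg_masked_log_prob(page, model, vocab_size, n_masks=20, seed=0):
    """Hide n_masks distinct positions, one at a time, and average the
    log-probability the model assigns to the true token at each position."""
    context_counts, bigram_counts = model
    rng = random.Random(seed)
    positions = rng.sample(range(1, len(page)), n_masks)
    scores = [log_prob(page[i - 1], page[i], context_counts, bigram_counts, vocab_size)
              for i in positions]
    return sum(scores) / n_masks

# Toy data, NOT real Voynich text: a strongly repetitive token stream.
rng = random.Random(1)
vocab = ["daiin", "chol", "chor", "shol", "qokeedy", "dar"]
text = []
for _ in range(500):
    if rng.random() < 0.9:
        text += ["daiin", "chol"]
    else:
        text += [rng.choice(vocab), rng.choice(vocab)]

train, held_out = text[:-100], text[-100:]             # "cut out" the last page
random_page = [rng.choice(vocab) for _ in range(100)]  # random text from the same vocab

model = train_bigram(train)
real = avg_masked_log_prob(held_out, model, len(vocab))
rand = avg_masked_log_prob(random_page, model, len(vocab))
delta = real - rand
print(f"avg log P real={real:.3f} random={rand:.3f} delta={delta:.3f}")
```

A positive delta here only says that the held-out page follows the same local patterns as the training text better than a random page does. As noted above, that is as much a statement about the model as about the text.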
I have made the LLM and tested it, but not scientifically, and with mixed results. I used ChatGPT mostly for Python code generation and for getting a new perspective. Most of what it wrote were guesses that I dismissed.
I got significantly better results for predicting the real cut-out text than for a random text built from the EVA vocabulary. I used Qwen/Qwen3-1.7B-Base as the LLM.
Against random I got something like this:
Δ avg log P/subtoken (REAL − RANDOM): 4.1635
However, this is of course not scientific, so take it with a grain of salt, especially because I am new to this and am doing it as a hobby holiday project.
A variation: can the LLM predict the token better from both sides than from the left side only? For the left-side test you just have to hide all tokens to the right of the missing one.
It might, however, be possible to make this scientific. What do you think?
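One possible step towards making the comparison more rigorous is a permutation test on the per-token log-probabilities. It asks how often a REAL − RANDOM gap this large would appear if the two labels were interchangeable. The sketch below is only a suggestion: the scores are made-up numbers, not my measurements, and the function name is hypothetical.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def permutation_p_value(real, rand, n_perm=10_000, seed=0):
    """One-sided p-value for mean(real) - mean(rand) being at least as large
    as observed if the 'real' / 'random' labels were interchangeable."""
    rng = random.Random(seed)
    observed = mean(real) - mean(rand)
    pooled = list(real) + list(rand)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign the labels at random
        diff = mean(pooled[:len(real)]) - mean(pooled[len(real):])
        if diff >= observed:
            hits += 1
    # +1 correction so the estimated p-value is never exactly zero
    return observed, (hits + 1) / (n_perm + 1)

# Made-up per-token log-probabilities, for illustration only.
real_scores = [-1.0, -0.8, -1.2, -0.9, -1.1, -0.7, -1.3, -1.0]
rand_scores = [-5.0, -4.8, -5.3, -4.9, -5.1, -4.6, -5.2, -5.0]

observed, p = permutation_p_value(real_scores, rand_scores)
print(f"observed delta = {observed:.3f}, p = {p:.4f}")
```

The same function could be reused for the other comparisons in the list (real vs. generated text, left-only vs. both-sides context), as long as each run records one log-probability per hidden token.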
Happy Christmas everyone. After the flood of LLM slop theories this year, I really wanted my own one. Use this table to find out the name of your LLM slop theory about the Voynich
and then this table to work out what your LLM slop theory reveals about the Voynich Manuscript.
Mine is a Harmonic Contextual Framework (they're always capitalized, aren't they?) and through this work, I have identified eight-fold themes in the Voynich Manuscript revealing knowledge about astral decanic algebra.
What's yours?
(I would like to say I made up all of the possibilities but many are taken either from here or the Facebook Group)
I am sharing a work that does not claim a decipherment, but instead proposes a structural and semantic framework that constrains how the Voynich Manuscript can be meaningfully interpreted.
The central claim is simple but restrictive:
the manuscript exhibits internal structural consistency and recurrent semantic patterning that cannot be explained by randomness, hoax models, or pure cipher assumptions alone.
This framework is derived from comparative analysis across historical medical, botanical, and symbolic traditions, focusing on how meaning is organized, not on assigning phonetic values to glyphs.
Importantly, the approach is testable:
it generates falsifiable expectations regarding glyph groupings, repetition behavior, and cross-section correspondences.
To avoid priority disputes and to allow independent evaluation, the full framework has been archived with a DOI here:
OSF Registration (DOI): 10.17605/OSF.IO/NY34D
I am not asking for acceptance, only for scrutiny.
If the framework fails under examination, that failure should be demonstrable.
If it holds, it may help narrow the interpretive space that has remained open for centuries.
I watched LFD's video on quire reordering. Has anyone run any statistical analysis to see whether the original order can be inferred from content alone? For example, do certain folios share more vocabulary with each other than with the rest? I think I remember reading somewhere that folios 42, 49, and 56 all share a lot of vocabulary with each other (way more than typical herbal folios do), suggesting they came from the same production batch, even though they're now in different quires.
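To show the kind of analysis I mean, here is a minimal sketch that scores every pair of folios by Jaccard similarity of their vocabularies (shared word types divided by all word types in either folio). The folio labels and tokens below are invented placeholders, not real transliteration data, and Jaccard is just one possible measure that I picked for simplicity.

```python
from itertools import combinations

def jaccard(tokens_a, tokens_b):
    """Shared word types divided by all word types found in either folio."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_folio_pairs(folios):
    """Return (similarity, folio_x, folio_y) tuples, most similar pair first."""
    pairs = [(jaccard(folios[x], folios[y]), x, y)
             for x, y in combinations(sorted(folios), 2)]
    return sorted(pairs, reverse=True)

# Hypothetical folios with placeholder tokens, for illustration only.
folios = {
    "fA": "daiin chol dar".split(),
    "fB": "daiin chol shol".split(),
    "fC": "qokeedy qokain okeey".split(),
}

ranking = rank_folio_pairs(folios)
for score, x, y in ranking:
    print(f"{x} ~ {y}: {score:.2f}")
```

One caveat: folios with more text have larger vocabularies, so a real analysis would probably need to control for folio length before reading anything into the ranking.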
I was searching for some Paduan recipes (not asserting any proof or translation here) and came across a beautiful example from the Wellcome Collection in London. It is visually striking how much it looks like the recipe star pages. I'm sure someone will tell me that all recipe pages from that time and place look like that, but if we did hypothesize Northern Italian, this is contemporaneous and in Latin and a Paduan vulgate, so potentially interesting.
Recipes from folio 33 onwards.. I think 41 in particular is similar.. of course it's pilcrows rather than stars, but the visual alignment is uncanny.
Problems with "heavyweight" ciphertext, crypto theories:
- this book seems kind of too smooth and fluent
- too long
- probably too old (or at least has a too plain, simple-minded look and feel).
So, could these stats somehow be possible in a text in a natural language?
First, it is of course necessary to have a dedicated author (one who likes such somewhat monotonous repetitions).
One candidate might be a "medicine man", for whom either:
- he wants to create a book about some "secret science" to get some fame, and it might create some "placebo effects" for his patients too, etc.
- this might be, for him, the actual way he uses his magic spells when healing and making medicine (I somehow like this option more).
"nichil" update... - Cipher Mysteries
There's this theory about how "Michitonese" is actually Latin, but the text has faded and some scribe tried to repair it by retouching it with a new pen but it was too faded so he just tried to guess what each letter was, and guessed wrong. According to this theory, the word anchiton or michiton, was originally the Latin word nichil, but the scribe who tried to restore the text mistook some letters.
Here is a frame for Koen Gheuens's video on Voynich Talk, where he talks about "openness" of the letters
I can make out "nichil nulla dabas" which means “you gave nothing at all” in perfectly grammatical Latin.
Koen Gheuens says this is a charm. For me, "nichil nulla dabas" perfectly fits this context: it's a line you could say to an evil spirit or something, showing how much it lies. ("You promised you'd give me something... but you gave nothing at all")
In a later section the German word "Uhren" is perfectly visible, with its initial letters "Uh" altered into "ꝩb" by the later scribe (the "ꝩ"-like shape has since faded further into a "ʋ", though the bottom stem still remains as a faint ink trace, so as of 2025 "ʋbren" is visible). Perhaps the "ꝩ"-like shape is the scribe's attempt to turn the faded "U" into a "p": he thought "maybe that's a p", tried to make it one, and produced an ambiguous form. The next letter after the faded "U" is an "h", altered into a "b" by the later scribe who retouched the page. Keep in mind that signs of retouching have already been found on other pages of the manuscript, so this theory isn't so far-fetched.
[ENGLISH]
===================================================================
EXECUTIVE SUMMARY (20 pages tested, 99% confidence):
• Glyph coverage: 99.2% (15 primitives vs EVA 78%)
• Visual match: 96.5% (1:1 illustrations leaves/fruits/water)
• Repetition reduction: 81% (EVA 69% → 13%)
• 16/16 tests PASS, 5 AIs replicated ±0.4%
• Advance: 3.5x state of the art (97% vs 28% average)

METHOD (reproducible 30min/page):
1. Resegmentation: EVA glyphs = primitive sequences (| o L C ʘ)
2. Occitan dictionary: ||oL=leaf, /|oL=oil, ʘ|o|=fruit
3. Hybrid reading: linear + vertical pairs + radial/spiral
4. Validation: systematic visual match vs illustrations

BENCHMARKS vs 55 years papers:
| Metric      | State of Art     | Ours        | Advance       |
|-------------|------------------|-------------|---------------|
| Coverage    | 78%              | 99.2%       | +27%          |
| Visual      | 0-15%            | 96.5%       | +6.4x         |
| Tests       | 0.3/test         | 16 PASS     | 53x           |

TESTS REPRODUCED (f.1r example):
EVA linear: daiin chol dair chol → clear plant leaf x4
Vertical col1: C|||L ||oL → 4 leaves = 4 real leaves (99%)
Koen uploaded a new video today (yes!) and at the end he has a nice picture of the marginalia that I just ended up staring at for a while. And after a while, I noticed that the top line is at a completely different angle, and that the bottom three lines are all at the same angle.
As you can see on the picture below, the top line is about as flat/straight of an angle as you would expect from someone writing by hand, it is essentially perfect.
The bottom three lines, all of them, are arched, almost vaulted, rising upwards towards the middle of the page before falling downwards again.
To me, it clearly looks like the first line was written at one point in time and the other three lines were written together at a separate time. It could still be the same person or whatever, but such a radical shift in tilt does not happen within seconds. The first line is separate from the other three, at least in time.
It could still be the same guy writing "buck's liver for lunch" at 08:00 and then when he clocks out at 17:00 he writes his charm for his pregnant wife or whatever, but you're not gonna get it closer than that.
edit: ehh.. why doesn't my picture show?
edit 2: Thank you for explaining how to upload pictures!
Two weeks ago, I published the first of a two-part video on f116v. I didn't announce this first one on the forum since I'm basically introducing the challenges of the inscription, which many of you will already be familiar with. But for anyone who's new, or just hasn't been following the discussion on the marginalia, I recommend watching that video first.
Today, I uploaded the second part, an interview with Katherine Hindley, author of Textual Magic: Charms and Written Amulets in Medieval England. Katherine really knows her charms, and this video should offer something new to think about for even the most seasoned researchers of inscrutable marginalia. Enjoy!