The Voynich Ninja - Strong evidence of a structured four-phase system in the Voynich Manuscript

(24-04-2025, 12:04 PM)oshfdk Wrote:
(24-04-2025, 11:54 AM)Urtx13 Wrote: The table you shared beautifully illustrates this: the 1405 seed yields coherent predictions; 2025 doesn’t.

If I understand it correctly, you tried various splits and various seeds, and the 4-way split with seed 1405 showed the best results? I'm not in academia, so I'm not sure if this is methodologically sound. I mean, given only a limited number of folios, there is bound to be a combination where there will be a pattern.

Sorry, but that reasoning confuses possibility with statistical significance.

Yes, any dataset can produce some pattern under some split — but the key question is: how often do those patterns arise under random conditions?

We tested that. And the answer is: not often at all. When we randomize the data (by permuting sequences, reordering folios, or using other seeds), the signal disappears. That’s the whole point of running permutation tests — to estimate the likelihood of false positives due to overfitting or structural coincidence. Do you know what a permutation test, an ablation test, and the others mean?

So yes, in theory, you could find some pattern in any dataset. But if only one configuration produces a meaningful result, and none of the others do, that result isn’t just a fluke — it’s statistically validated.

That’s what statistical testing is for.
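For readers unfamiliar with the term, here is a minimal sketch of a permutation test. It is not the pipeline discussed in this thread: the data are synthetic, and the statistic (a difference in group means) and the names `mean_diff` and `permutation_p_value` are assumptions chosen only to show the mechanics.

```python
import random

def mean_diff(values, labels):
    """Test statistic: mean of group 1 minus mean of group 0."""
    g1 = [v for v, l in zip(values, labels) if l == 1]
    g0 = [v for v, l in zip(values, labels) if l == 0]
    return sum(g1) / len(g1) - sum(g0) / len(g0)

def permutation_p_value(values, labels, n_perm=5000, seed=0):
    """One-sided p-value: share of label shuffles whose statistic
    is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean_diff(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_diff(values, shuffled) >= observed:
            hits += 1
    # +1 correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)

# Synthetic data, NOT taken from the manuscript
rng = random.Random(42)
labels = [0] * 20 + [1] * 20
noise = [rng.gauss(0.0, 1.0) for _ in labels]            # no real effect
shifted = [x + 2.0 * l for x, l in zip(noise, labels)]   # built-in effect

p_noise = permutation_p_value(noise, labels)
p_shifted = permutation_p_value(shifted, labels)
print(p_noise, p_shifted)
```

Note that a p-value computed this way only covers the one statistic and configuration that were fixed before looking at the data; it says nothing about how many other configurations were tried first.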

(24-04-2025, 12:08 PM)nablator Wrote:
(24-04-2025, 11:54 AM)Urtx13 Wrote: Because any clustering will always find some internal order; that’s just how unsupervised learning works. But the real test is whether it consistently aligns with an external structure (in our case, the lunar phases we hypothesize). That only happens at seed 1405.

I understand a little better, thanks. 1405 is special.

I was confused because you wrote earlier:
"The seed 1405 is arbitrary and used solely to ensure full reproducibility. Any other fixed number would work — this one was chosen to match the year 1405, which aligns with the possible calendar framework used in the analysis"

Ah, I see — no worries!

You can use any fixed number; the point of setting a seed is to lock the randomness for reproducibility. What matters is that the semantic structure emerges only when the randomness is locked at 1405. When we lock it at other values, or fully shuffle the data, the pattern disappears.

There is a huge number of possible permutations of folios (factorial of the number of folios). Finding a seed that would result in a specific pattern by chance (or even in billions of tests) is extraordinary. Are these "patterns that don’t happen under any other configuration" really unique and better for 1405 than any other seed? This is unclear to me.

(I wrote this before reading your last post, so okay, "it’s statistically validated.")

The conceptual problem I have is that the number generator seed has nothing to do with anything physical, like a date. It's not as if the lunar phases were generated with accurate astronomical calculations for a range of dates; they are pseudo-random, and the random generation sequence was made deterministic by the value of the seed. So 1405 being special feels meaningless even if it works better than any other seed.

I should wait for the article on arXiv, I hope it will explain everything that I don't understand.

(24-04-2025, 12:34 PM)nablator Wrote: There is a huge number of possible permutations of folios (factorial of the number of folios). Finding a seed that would result in a specific pattern by chance (or even billions of tests) is extraordinary.

If I understand it correctly, it's not looking for a specific pattern, it's looking for any suitable cyclic pattern, the number of which is only marginally smaller than the total number of folio permutations.

Also, given the (pseudo-)random way the seed affects the results, the fact that the magic seed in question is 1405 (and not, say, 3452435431), also speaks somewhat about the actual number of permutations tested.

Maybe there is more in the article, but so far personally to me this doesn't look very promising.

Anyway, thanks to Urtx13 for sharing!
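The concern about the number of configurations tried can be made concrete with a small calculation. This sketch assumes the trials are independent and that each one has the same false-positive rate `alpha`; both are simplifying assumptions, and the function name is hypothetical.

```python
def chance_of_false_positive(alpha, n_trials):
    """Probability that at least one of n_trials independent tests on
    pure noise passes a significance threshold of alpha."""
    return 1.0 - (1.0 - alpha) ** n_trials

# How quickly a 5% threshold erodes as more seeds or splits are tried
for n in (1, 14, 100):
    print(n, round(chance_of_false_positive(0.05, n), 3))
```

Under these assumptions, by 14 trials the chance of at least one spurious "hit" already exceeds one half, which is why the number of seeds and splits tried matters as much as the result for the best one.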

(24-04-2025, 12:34 PM)nablator Wrote: The conceptual problem I have is that the number generator seed has nothing to do with anything physical, like a date. It's not as if the lunar phases were generated with accurate astronomical calculations for a range of dates; they are pseudo-random, and the random generation sequence was made deterministic by the value of the seed. So 1405 being special feels meaningless even if it works better than any other seed.

Do you know what a seed is?

A seed is just a number that lets you start the pseudo-random process from the same place every time. Without it, you’d get different results every run, like someone running 1 km, but always starting from a different city.

When you set a seed, it’s like saying:
“Let’s always start the 1 km run from Barcelona.”
Now you can compare runs fairly. You have a fixed reference point. It is just that.
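A minimal sketch of that "fixed starting point" using Python's standard `random` module. The folio labels and the helper `shuffled` are hypothetical placeholders, not part of the analysis under discussion.

```python
import random

# Hypothetical folio labels, for illustration only
folios = [f"folio_{n}" for n in range(1, 51)]

def shuffled(items, seed):
    """Return a shuffled copy of items using a generator fixed by seed."""
    rng = random.Random(seed)
    out = list(items)
    rng.shuffle(out)
    return out

run_a = shuffled(folios, seed=1405)
run_b = shuffled(folios, seed=1405)  # same seed: identical order
run_c = shuffled(folios, seed=2025)  # other seed: a different order

print(run_a == run_b)
print(run_a == run_c)
```

The seed fixes which pseudo-random sequence is drawn; it carries no meaning beyond that, so any value works equally well for reproducibility.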

(24-04-2025, 12:44 PM)oshfdk Wrote: If I understand it correctly, it's not looking for a specific pattern, it's looking for any suitable cyclic pattern, the number of which is only marginally smaller than the total number of folio permutations.

To clarify: the analysis doesn’t “search for a pattern” in a combinatorial sense. The pipeline is fixed: it uses EVA tokens, entropy per folio, and a fixed lunar-phase mapping (based on a symbolic seed). Then it tests whether a four-phase structure emerges consistently across folios and survives statistical validation (classification, autocorrelation, permutation test, etc.).
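The thread does not say how "entropy per folio" is computed. One common reading is the Shannon entropy of each folio's token frequency distribution, sketched below under that assumption. The folio names and token lists are made up for illustration and are not real transcription data.

```python
import math
from collections import Counter

def token_entropy(tokens):
    """Shannon entropy, in bits, of the token frequency distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical folios with invented token lists
folios = {
    "folio_A": ["daiin", "chedy", "daiin", "qokeedy", "chedy", "daiin"],
    "folio_B": ["daiin", "daiin", "daiin", "daiin"],
}

entropy_by_folio = {name: token_entropy(toks) for name, toks in folios.items()}
print(entropy_by_folio)
```

A folio that repeats a single token has zero entropy, while a more varied vocabulary gives a higher value.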

(24-04-2025, 01:00 PM)Urtx13 Wrote: We’re not saying 1405 is “magic.” We’re saying:

When the system is configured that way, a strong structure appears. When you break that configuration, it disappears.
That’s not about the seed; that’s about the fragility and specificity of the outcome.

That’s what makes it interesting.

I understand your perspective, but I think it takes some mathematical rigor and some understanding of the manuscript as a physical object to properly validate the specificity and significance of the outcome. I hope it's all in the paper, let's just wait for the publication. For now I can only make an estimation based on circumstantial evidence, and so far it doesn't look very convincing to me. Again, thanks for sharing!

(24-04-2025, 01:27 PM)oshfdk Wrote: I understand your perspective, but I think it takes some mathematical rigor and some understanding of the manuscript as a physical object to properly validate the specificity and significance of the outcome. I hope it's all in the paper, let's just wait for the publication. For now I can only make an estimation based on circumstantial evidence, and so far it doesn't look very convincing to me. Again, thanks for sharing!

Gosh... That is not subjective. That is data. And that is precisely why I ran the tests: to ensure that any claim is falsifiable and measurable, not intuitive. Data doesn't care whether it "feels right"; it either holds or it breaks. That is why the tests were designed the way they were... You are going to find the same on arXiv lol

(24-04-2025, 01:32 PM)Urtx13 Wrote: Gosh... That is not subjective. That is data. And that is precisely why I ran the tests: to ensure that any claim is falsifiable and measurable, not intuitive. Data doesn't care whether it "feels right"; it either holds or it breaks. That is why the tests were designed the way they were... You are going to find the same on arXiv lol

Mathematical rigor? The analysis includes entropy metrics, LDA modeling, supervised classification, permutation tests, ablation studies, cross-checking...

A random number generator (RNG) creates an arbitrary sequence of numbers.

When two people run an experiment that includes a random number generator, they will not get exactly the same results. When the process under study is well-behaved (not chaotic), the results will be similar but not the same.

If one wants to verify an experiment on a process, then one wants to make sure that two people get exactly the same results. This can be done if both people set the same 'seed' (starting point) for the RNG. (Note that there are also RNGs for which this will not work.)

So people here are invited to use the seed 1405 in order to get exactly the same result. Using other seeds can be interesting in order to check that it creates similar results.
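A small sketch of both points, using a stand-in computation (a Monte Carlo estimate of pi) because the actual pipeline is not available here. The function name `estimate_pi`, the seed list, and the sample size are arbitrary choices for illustration.

```python
import math
import random

def estimate_pi(seed, n_points=20000):
    """Monte Carlo estimate of pi from points in the unit square."""
    rng = random.Random(seed)
    inside = sum(
        1 for _ in range(n_points)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / n_points

# Same seed reproduces the result exactly; other seeds give nearby values
estimates = {seed: estimate_pi(seed) for seed in (1405, 2025, 7, 12345, 99)}
errors = {seed: abs(value - math.pi) for seed, value in estimates.items()}
spread = max(estimates.values()) - min(estimates.values())

print(estimates)
print(spread)
```

For a well-behaved process like this one, every seed lands close to the same answer. A result that appears for one seed and vanishes for the others would behave very differently in such a sweep.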

(24-04-2025, 01:00 PM)Urtx13 Wrote: Do you know what a seed is?

A seed is just a number that lets you start the pseudo-random process from the same place every time. Without it, you’d get different results every run, like someone running 1 km, but always starting from a different city.

Yes, absolutely clear. I'm a software engineer. :)

I've been doing this in my programs for decades, setting the pseudo-random generator seed to be able to get the same result in a later run of course.

My problem is conceptual/philosophical: let me try to explain it.

If any calculation based on a sequence of random numbers produces something that compares amazingly well with reality only for this sequence of random numbers, it is a property of this sequence of random numbers, not a property of reality.

It's like discovering the entire US Declaration of Independence coded in the decimals of pi at offset 14287468794577131. Yes, it's there somewhere, but what does the discovery mean? Absolutely nothing.
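The same point can be shown with a toy simulation. This is not a reconstruction of the analysis discussed in the thread: the target labels are arbitrary, the seed-generated labels are pure noise, and all names and sizes (`N_ITEMS`, `N_PHASES`, `agreement`) are assumptions for illustration.

```python
import random

N_ITEMS = 40   # hypothetical number of items to label
N_PHASES = 4   # four possible labels per item

# Stand-in for "reality": fixed labels unrelated to any seed tried below
target = random.Random(0).choices(range(N_PHASES), k=N_ITEMS)

def agreement(seed):
    """Share of items where seed-generated random labels match the target."""
    rng = random.Random(seed)
    guess = [rng.randrange(N_PHASES) for _ in range(N_ITEMS)]
    return sum(g == t for g, t in zip(guess, target)) / N_ITEMS

# Try many seeds and keep the one that happens to match best
scores = {seed: agreement(seed) for seed in range(1, 5001)}
best_seed = max(scores, key=scores.get)
mean_score = sum(scores.values()) / len(scores)

print(best_seed, scores[best_seed], mean_score)
```

On average the random labels agree with the target about a quarter of the time, as expected for four labels, yet the best seed out of thousands agrees far more often. That gap comes from selecting the seed after the fact, not from any structure in the target.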

(24-04-2025, 01:38 PM)Urtx13 Wrote: Mathematical rigor? The analysis includes entropy metrics, LDA modeling, supervised classification, permutation tests, ablation studies, cross-checking...

Actually, the assortment of tools is one of the things that makes the whole pipeline hard to evaluate for me. From my point of view, mathematical rigor implies using the minimum number of tools and transformations to clearly validate your hypothesis. The fact that there was a need to stack a number of models and analytical tools on top of each other only makes it harder to identify potential problems.

I wonder if the cyclical nature of the folios was detected in past research on topic modeling in the Voynich Manuscript?
