The Voynich Ninja

Full Version: Currier A/B split is not what we thought it was!
(04-05-2026, 07:12 PM)rikforto Wrote:
(04-05-2026, 03:43 PM)Labyrinthinesecurity Wrote: My bad ;) thanks for the correction

I am going to note that this is not an answer to the question, "Where did you get this (wrong) information from?", which I think some people might want the answer to before a lot of energy is sunk into evaluating this analysis

All my scripts use the accurate A/B distribution except the morphometrics one, which by mistake uses a very early, sketchy approximation. But wait for tomorrow's detailed results before investing energy :)
Is there any jargon that you decided not to use? 

(04-05-2026, 07:12 PM)rikforto Wrote: I am going to note that this is not an answer to the question, "Where did you get this (wrong) information from?", which I think some people might want the answer to before a lot of energy is sunk into evaluating this analysis

And the information was not only wrong, but presented as "traditionally", as if it was the consensus. Now that it has been pointed out it has been "fixed" (whatever that means), and the question of the source is being deliberately dodged in favour of waiting for the next reveal of results.

A familiar pattern, although one that's been less common here for the last week or two.
(04-05-2026, 08:05 PM)eggyk Wrote: Is there any jargon that you decided not to use?

(04-05-2026, 07:12 PM)rikforto Wrote: I am going to note that this is not an answer to the question, "Where did you get this (wrong) information from?", which I think some people might want the answer to before a lot of energy is sunk into evaluating this analysis

And the information was not only wrong, but presented as "traditionally", as if it was the consensus. Now that it has been pointed out it has been "fixed" (whatever that means), and the question of the source is being deliberately dodged in favour of waiting for the next reveal of results.

A familiar pattern, although one that's been less common here for the last week or two.

Didn't I answer the question of the source? What makes you think it's deliberately dodged?
(04-05-2026, 08:30 PM)Labyrinthinesecurity Wrote: Didn't I answer the question of the source? What makes you think it's deliberately dodged?

Stating that it was a "sketchy approximation" is not an answer to where that information came from.

You said the folios are traditionally considered to be Currier A. That's not really information that comes from a mistaken approximation, but rather from some other source.
Quote: Cohen's kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories.

Nice! I learned something today.

I don't really mind AI-generated text if the research is solid. But if the data or claims are hallucinated, which happens nearly always, it's a waste of time.

Whether you use AI as a tool or AI is using you as a passive copy-pasting tool makes all the difference.
(04-05-2026, 03:09 PM)Labyrinthinesecurity Wrote: Some of you may have seen my earlier work confirming the Currier A/B distinction quantitatively. That paper showed the distinction is real, recoverable without labels, and predictive.

Have you seen [link not shown]? How would they relate to your analysis?


All the best, --stolfi
(04-05-2026, 09:44 PM)Jorge_Stolfi Wrote:
(04-05-2026, 03:09 PM)Labyrinthinesecurity Wrote: Some of you may have seen my earlier work confirming the Currier A/B distinction quantitatively. That paper showed the distinction is real, recoverable without labels, and predictive.

Have you seen [link not shown]? How would they relate to your analysis?


All the best, --stolfi

Thank you, it's closely related! And even more so to Nick Pelling's tables :)
Isn't Hand 1 Currier A, and Currier B the others? There's also an ED bigram split which Dunsel highlighted, with no ED appearing in Hand 1.
So here is the proposed multilayer mechanism (more details to come in my GitHub repo):

Each word containing ch or sh followed immediately by o or e is mapped to a template by replacing the post-digraph vowel with a placeholder X.
For example, chody and chedy both map to chXdy; shol and shel both map to shXl.
Words containing multiple substitution sites have each site replaced independently.
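
A minimal Python sketch of this mapping, assuming words are plain lowercase EVA strings. The name `to_template` and the regular expression are illustrative and not taken from the repo. It replaces every site in one pass; if "independently" is meant as one template per site, the function would have to emit several templates per word instead.

```python
import re

# A substitution site: the digraph "ch" or "sh" followed immediately by "o" or "e".
SITE = re.compile(r"(ch|sh)([oe])")

def to_template(word):
    """Return (template, vowels).

    template: the word with each post-digraph vowel replaced by "X".
    vowels:   the vowels found at the substitution sites, in order.
    """
    vowels = [m.group(2) for m in SITE.finditer(word)]
    template = SITE.sub(r"\1X", word)
    return template, vowels

print(to_template("chody"))  # ('chXdy', ['o'])
print(to_template("shel"))   # ('shXl', ['e'])
```

Words without a substitution site come back unchanged with an empty vowel list, so they can be skipped when counting.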

Each word token contributes one count to its template in the appropriate folio-state column, and is scored as cho or che at each substitution site.
The cho rate for a template in a given state is the number of cho substitution events divided by the total substitution events for that template in that state.
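
The counting step could look roughly like this. It is a sketch under assumptions: tokens arrive as (word, state) pairs with the state already known and written as "O" or "E", and the function names and toy tokens are invented for illustration.

```python
import re
from collections import defaultdict

SITE = re.compile(r"(ch|sh)([oe])")

def count_events(tokens):
    """tokens: iterable of (word, state) pairs, state being "O" or "E".

    Returns {template: {"O": (cho_events, total_events), "E": (...)}}.
    Each substitution site in a token counts as one event.
    """
    counts = defaultdict(lambda: {"O": [0, 0], "E": [0, 0]})
    for word, state in tokens:
        vowels = [m.group(2) for m in SITE.finditer(word)]
        if not vowels:
            continue  # no ch/sh + o/e site: not classifiable
        template = SITE.sub(r"\1X", word)
        for vowel in vowels:
            counts[template][state][1] += 1      # total events
            if vowel == "o":
                counts[template][state][0] += 1  # cho events
    return {t: {s: tuple(c) for s, c in by_state.items()}
            for t, by_state in counts.items()}

def cho_rate(cho_events, total_events):
    """cho events divided by total events; NaN when there are no events."""
    return cho_events / total_events if total_events else float("nan")

# Toy tokens, invented for illustration only.
toy = [("chody", "O"), ("chedy", "E"), ("chedy", "E"), ("shol", "O"), ("daiin", "E")]
counts = count_events(toy)
```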

We retain only templates with at least 10 substitution events in each of the two folio states. Of the hundreds of distinct templates in the corpus, 31 meet this criterion.
These 31 templates account for 4,085 substitution events, comprising the large majority of all classifiable tokens.
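
A sketch of that filter, assuming per-template counts are stored as (cho events, total events) for each state. The function name is invented. The kchXy totals come from the table below, with cho counts inferred from the rounded rates; the second template is entirely hypothetical.

```python
MIN_EVENTS = 10  # threshold stated in the post

def retain_templates(counts, min_events=MIN_EVENTS):
    """Keep templates with at least min_events events in both states.

    counts: {template: {"O": (cho, total), "E": (cho, total)}}
    """
    return {
        template: by_state
        for template, by_state in counts.items()
        if all(by_state[state][1] >= min_events for state in ("O", "E"))
    }

example = {
    "kchXy": {"O": (2, 14), "E": (0, 14)},  # 14 events in each state: kept
    "rareX": {"O": (3, 25), "E": (1, 4)},   # hypothetical, only 4 E events: dropped
}
kept = retain_templates(example)
```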

Here are the templates:

+--------------+----------+-------+-----+-------+-----+-------+
| Class        | Template | Rate₁ | n₁  | Rate₀ | n₀  | |Δ|   |
+--------------+----------+-------+-----+-------+-----+-------+
| Fixed cho    | chXl     | 1.000 | 217 | 0.948 | 135 | 0.052 |
| Fixed cho    | shXl     | 1.000 | 102 | 0.953 |  64 | 0.047 |
| Fixed cho    | shXr     | 0.966 |  58 | 0.935 |  31 | 0.030 |
+--------------+----------+-------+-----+-------+-----+-------+
| Fixed che    | shXy     | 0.098 |  41 | 0.033 | 304 | 0.065 |
| Fixed che    | chXey    | 0.000 |  26 | 0.000 | 156 | 0.000 |
| Fixed che    | chXol    | 0.031 |  32 | 0.008 | 123 | 0.023 |
| Fixed che    | shXey    | 0.000 |  27 | 0.000 | 108 | 0.000 |
| Fixed che    | shXol    | 0.000 |  23 | 0.000 |  72 | 0.000 |
| Fixed che    | chXor    | 0.075 |  40 | 0.000 |  47 | 0.075 |
| Fixed che    | chXody   | 0.000 |  11 | 0.000 |  63 | 0.000 |
| Fixed che    | chXo     | 0.067 |  15 | 0.000 |  46 | 0.067 |
| Fixed che    | shXo     | 0.000 |  15 | 0.034 |  29 | 0.034 |
| Fixed che    | otchXy   | 0.000 |  12 | 0.032 |  31 | 0.032 |
| Fixed che    | shXor    | 0.000 |  13 | 0.000 |  28 | 0.000 |
| Fixed che    | okchXy   | 0.067 |  15 | 0.000 |  22 | 0.067 |
+--------------+----------+-------+-----+-------+-----+-------+
| Switchable   | chXdy    | 0.921 |  38 | 0.116 | 362 | 0.805 |
| Switchable   | shXdy    | 0.870 |  23 | 0.096 | 271 | 0.774 |
| Switchable   | shX      | 0.880 |  92 | 0.458 |  48 | 0.422 |
| Switchable   | chXky    | 0.703 |  37 | 0.210 |  62 | 0.493 |
| Switchable   | chXs     | 0.778 |  18 | 0.387 |  62 | 0.391 |
| Switchable   | chX      | 0.976 |  41 | 0.529 |  34 | 0.446 |
| Switchable   | chXty    | 0.812 |  32 | 0.483 |  29 | 0.330 |
| Switchable   | chXdaiin | 1.000 |  21 | 0.436 |  39 | 0.564 |
| Switchable   | chXcthy  | 0.917 |  12 | 0.167 |  36 | 0.750 |
| Switchable   | shXky    | 0.500 |  10 | 0.111 |  27 | 0.389 |
| Switchable   | shXdaiin | 0.917 |  12 | 0.500 |  16 | 0.417 |
+--------------+----------+-------+-----+-------+-----+-------+
| Intermediate | chXy     | 0.213 |  89 | 0.021 | 426 | 0.192 |
| Intermediate | chXr     | 0.962 | 158 | 0.898 |  49 | 0.064 |
| Intermediate | chXar    | 0.158 |  19 | 0.043 |  46 | 0.114 |
| Intermediate | chXal    | 0.182 |  11 | 0.000 |  31 | 0.182 |
| Intermediate | kchXy    | 0.143 |  14 | 0.000 |  14 | 0.143 |
+--------------+----------+-------+-----+-------+-----+-------+

(Rate₁ and Rate₀ give the cho rate in O-folios and E-folios respectively; n₁ and n₀ are the corresponding numbers of substitution events. |Δ| is the absolute rate difference between the two states.)

Templates are classified into four categories based on their cho rate in each folio state:

Fixed cho (F1): rate > 0.9 in both states. 3 templates, 607 events.
Fixed che (F0): rate < 0.1 in both states. 12 templates, 1,299 events.
Switchable (S): absolute difference between states ≥ 0.2. 11 templates, 1,322 events.
Intermediate (I): does not meet any of the above criteria. 5 templates, 857 events.
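
The four rules can be written as a short function. This is a sketch that applies the thresholds in the order listed; the function name is invented, and the inputs are the rounded rates from the table, so a template sitting exactly on a threshold could land differently with unrounded data.

```python
def classify(rate_o, rate_e):
    """Classify a template from its cho rate in O-folios and E-folios."""
    if rate_o > 0.9 and rate_e > 0.9:
        return "Fixed cho"
    if rate_o < 0.1 and rate_e < 0.1:
        return "Fixed che"
    if abs(rate_o - rate_e) >= 0.2:
        return "Switchable"
    return "Intermediate"

print(classify(1.000, 0.948))  # chXl  -> Fixed cho
print(classify(0.921, 0.116))  # chXdy -> Switchable
print(classify(0.962, 0.898))  # chXr  -> Intermediate
```

chXr shows why the Intermediate class exists: its E-folio rate of 0.898 falls just short of the 0.9 cutoff, and its difference of 0.064 is far below 0.2.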

How does it work?
Imagine each folio of the manuscript has a switch X that can be set to one of two positions: O or E.
Now suppose we look at a folio where the switch is set to E. What words will we find?

chol and chel? Mostly chol, just like in all folios. Their template chXl is fixed cho (Rate = 1.000 in O-folios, 0.948 in E-folios).
The template structure drives the vowel choice here: the switch has almost no effect.

shody and shedy? Mostly shedy. Their template shXdy is switchable (Rate = 0.870 in O-folios, 0.096 in E-folios).
In E-folios, shody is rare; in O-folios, shody dominates. This is where the switch has its strongest effect.

choey and cheey? Only cheey, never choey, regardless of the switch position.
Their template chXey is fixed che with a rate of exactly 0.000 in both states. This is an absolute rule that the switch cannot override.

chool and cheol? Almost only cheol, and almost never chool, again regardless of the switch.
Their template chXol is fixed che (Rate = 0.031 in O-folios, 0.008 in E-folios).
The system appears to avoid placing two o vowels next to each other immediately after the digraph.

The switch does not operate uniformly: it strongly modulates some word contexts (the switchable templates) while leaving others essentially untouched (the fixed templates). The template (the consonantal environment surrounding the substitution site) determines whether the switch has any power at all.

This aligns nicely with Jorge's intuition that the A/B split could be two encodings of the same underlying mechanism.


But the full picture is more nuanced. There are in fact two independent signals hiding inside the Currier A/B distinction:

A strong bimodal signal (the one we just explained), based on the cho/che alternation. This is a discrete boolean switch set once per folio.

A much weaker gradient signal carried primarily by the d/l pair (and possibly others, which I have not yet attempted to identify).
This is not discrete: 70.6% of folios fall in the middle range, and the bimodality coefficient (0.319) is well below the threshold for bimodality. It behaves like a continuous dial, not an on/off switch.
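
The post does not say which bimodality coefficient was used. Assuming it is Sarle's coefficient with the usual finite-sample correction, where values above 5/9 (about 0.555) are conventionally read as a hint of bimodality, it can be computed as in this sketch. The function name and sample data are invented for illustration.

```python
import math

BIMODALITY_THRESHOLD = 5 / 9  # conventional cutoff for Sarle's coefficient

def bimodality_coefficient(xs):
    """Sarle's bimodality coefficient with finite-sample correction."""
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 values")
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    # Bias-corrected sample skewness and excess kurtosis.
    skew = math.sqrt(n * (n - 1)) / (n - 2) * (m3 / m2 ** 1.5)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3) + 6)
    return (skew ** 2 + 1) / (kurt + 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Invented data: two tight clusters should score above the threshold.
two_clusters = [0.0] * 50 + [1.0] * 50
print(bimodality_coefficient(two_clusters) > BIMODALITY_THRESHOLD)  # True
```

Under that reading, the reported 0.319 sits well under 5/9, which matches the description of the d/l signal as a continuous dial.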

Currier's A/B classification conflates these two signals into a single binary label, which is why it works reasonably well but leaves 71% of inter-folio variance unexplained.

To me, the most intriguing finding is that these two signals are completely orthogonal. Within the Herbal section, the correlation between the cho/che ratio and the d/l ratio is essentially zero. Knowing the switch state of a folio tells you nothing about its d/l ratio, and vice versa. They are independent dimensions of the manuscript's structure, driven by separate mechanisms that happen to partially align in Currier's coarse classification.
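
For anyone who wants to check this on their own transliteration, the comparison amounts to a correlation over per-folio ratio pairs. The sketch below assumes a plain Pearson coefficient, which the post does not specify; the function name and numbers are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Invented per-folio values, for illustration only.
cho_che_ratio = [1.0, 2.0, 3.0, 4.0]
d_l_ratio = [1.0, -1.0, -1.0, 1.0]
r = pearson(cho_che_ratio, d_l_ratio)
```

A coefficient near zero only rules out a linear relationship, so a scatter plot of the two ratios is worth a look before treating them as fully independent.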
(05-05-2026, 10:07 AM)Labyrinthinesecurity Wrote: To me, the most intriguing finding is that these two signals are completely orthogonal. Within the Herbal section, the correlation between the cho/che ratio and the d/l ratio is essentially zero. Knowing the switch state of a folio tells you nothing about its d/l ratio, and vice versa. They are independent dimensions of the manuscript's structure, driven by separate mechanisms that happen to partially align in Currier's coarse classification.

I don't think I understand most of this, could you maybe explain the implications a bit more? If the "signals" do not map to A/B then are we just talking about some arbitrary split of folios into groups and then identifying some metric orthogonal to this division? What's the use of this? I think you mentioned some "structure" that this approach unveils, what is this structure?