Currier A/B split is not what we thought it was!

Labyrinthinesecurity > Yesterday, 03:09 PM
Hi all,

Some of you may have seen my earlier work confirming the Currier A/B distinction quantitatively. That paper showed the distinction is real, recoverable without labels, and predictive. But it also left a puzzle on the table that I could not explain at the time. I now have an explanation, and it leads somewhere unexpected.

Of the eleven character pairs I tested, one behaved paradoxically. The e/ch pair had essentially zero global correlation with the A/B split, yet it produced the strongest signal of all pairs at folio boundaries. And when included in clustering, it actively destroyed the A/B partition: removing it doubled the clustering accuracy.

How can a pair be simultaneously invisible globally, maximally informative locally, and destructive to classification? That combination is not possible under a simple two-language model. Something more structured is going on.

The answer turns out to be surprisingly clean. If you look at the vowel that follows the digraphs CH and SH across the manuscript, you find that folios split into two sharply separated groups. The gap between these two states is enormous, and a two-state binomial mixture model fits with a 2,549-point AIC improvement over a single state. Of 197 folios, 195 are assigned unambiguously.
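For anyone who wants to try the model comparison themselves, here is a minimal sketch of a one-state versus two-state binomial fit with an AIC comparison. It assumes each folio is summarised as a pair (k, n), where n counts CH/SH tokens followed by E or O and k counts those followed by E. The function names, the EM starting values, and the toy counts are illustrative assumptions, not the pipeline or the data behind the figures quoted above.

```python
import math

def log_binom_pmf(k, n, p):
    """Log binomial pmf; p is clipped so the logarithms stay finite."""
    p = min(max(p, 1e-9), 1 - 1e-9)
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

def fit_one_state(counts):
    """Single pooled binomial: returns (p, log-likelihood)."""
    p = sum(k for k, _ in counts) / sum(n for _, n in counts)
    return p, sum(log_binom_pmf(k, n, p) for k, n in counts)

def e_step(counts, w, p_lo, p_hi):
    """Per-folio posterior of the high-E state, plus the mixture log-likelihood."""
    resp, loglik = [], 0.0
    for k, n in counts:
        a = math.log(w) + log_binom_pmf(k, n, p_hi)
        b = math.log(1 - w) + log_binom_pmf(k, n, p_lo)
        m = max(a, b)
        total = math.exp(a - m) + math.exp(b - m)
        resp.append(math.exp(a - m) / total)
        loglik += m + math.log(total)
    return resp, loglik

def fit_two_states(counts, iters=200):
    """EM for a two-component binomial mixture (assumed starting values)."""
    w, p_lo, p_hi = 0.5, 0.25, 0.75
    for _ in range(iters):
        resp, _ = e_step(counts, w, p_lo, p_hi)
        w = min(max(sum(resp) / len(resp), 1e-9), 1 - 1e-9)
        hi_k = sum(r * k for r, (k, _) in zip(resp, counts))
        hi_n = sum(r * n for r, (_, n) in zip(resp, counts))
        lo_k = sum((1 - r) * k for r, (k, _) in zip(resp, counts))
        lo_n = sum((1 - r) * n for r, (_, n) in zip(resp, counts))
        p_hi = hi_k / max(hi_n, 1e-12)
        p_lo = lo_k / max(lo_n, 1e-12)
    resp, loglik = e_step(counts, w, p_lo, p_hi)
    return (w, p_lo, p_hi), resp, loglik

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

# Toy (k, n) counts for six hypothetical folios, NOT manuscript data.
toy_counts = [(2, 40), (3, 50), (1, 30), (35, 40), (44, 50), (28, 30)]

_, loglik_one = fit_one_state(toy_counts)
(w, p_lo, p_hi), resp, loglik_two = fit_two_states(toy_counts)

# One free parameter (p) versus three (w, p_lo, p_hi).
delta_aic = aic(loglik_one, 1) - aic(loglik_two, 3)
states = [1 if r > 0.5 else 0 for r in resp]
```

A positive `delta_aic` favours the two-state model, and a folio counts as unambiguously assigned when its entry in `resp` is close to 0 or 1.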

This is not the same thing as the Currier A/B split, although it correlates with it. It is sharper, it operates at the individual folio level rather than at section boundaries, and it persists within the Herbal section alone (where the A/B boundary is supposed to be clean).

I call it a boolean switch: a single binary parameter, set once per folio.
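As a rough illustration of how such a per-folio switch could be read off a transliteration, the sketch below counts E versus O directly after "ch" or "sh" and thresholds the ratio. It assumes lowercase EVA-style text; the regular expression, the 0.5 threshold, and the sample strings are assumptions made for the example, not the exact procedure used above.

```python
import re

# Assumed pattern: 'ch' or 'sh' immediately followed by 'e' or 'o'.
DIGRAPH_VOWEL = re.compile(r"[cs]h([eo])")

def vowel_counts(text):
    """Return (e_count, o_count) for vowels directly following ch/sh."""
    e_count = o_count = 0
    for match in DIGRAPH_VOWEL.finditer(text):
        if match.group(1) == "e":
            e_count += 1
        else:
            o_count += 1
    return e_count, o_count

def switch_state(text, threshold=0.5):
    """Return 'E' or 'O' for a folio, or None if there is nothing to count."""
    e_count, o_count = vowel_counts(text)
    total = e_count + o_count
    if total == 0:
        return None
    return "E" if e_count / total > threshold else "O"

# Made-up word strings for illustration only.
sample_o = "chol chor shol daiin chedy"
sample_e = "chedy shedy qokeedy chol"
```

Here `switch_state(sample_o)` gives "O" and `switch_state(sample_e)` gives "E". A hard threshold is the crudest possible rule; the mixture model's posterior is what decides whether a folio is assigned unambiguously.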

Here is where it gets interesting. If the switch were just replacing graphemes uniformly, every word containing those graphemes would respond the same way. They do not.

When you group words into templates, you find three classes:
1. Fixed O templates: these are locked to O in both switch states.
2. Fixed E templates: these are locked to E in both switch states.
3. Switchable templates: these respond strongly to the switch.
Template identity accounts for 93.5% of the variance. The folio switch accounts for only 7.9%.

So the system has two components: a template structure that determines which contexts are switchable, and a boolean parameter that modulates the switchable ones. The Currier A/B distinction is a blurred projection of this system, not the system itself.

The e/ch pair is paradoxical because it responds to the boolean switch, but only in switchable template contexts. In clustering, the e/ch ratio injects variance along a dimension that does not align with the primary A/B axis. Mystery solved.

Now for the part that surprised me the most

Everything above is derived purely from text statistics. I had no reason to expect it would connect to anything visual. But then I found Koen's morphometric study of the Herbal plant illustrations, which classifies plants as A-type or B-type based on twelve visual features: stem-root lines, flower morphology, daisy-type flowers, grass elements, root platforms, leaf venation, and so on. This classification was done entirely from the drawings, with no reference to text statistics.

I cross-validated my boolean switch against the morphometric classification on 101 Herbal folios (excluding quire 8). The results:

Boolean switch vs. morphometrics: 96.0% agreement, Cohen's kappa = 0.870, Fisher's exact p = 3.5 x 10^-15.
Currier vs. morphometrics: 78.2% agreement, Cohen's kappa = 0.106.

Read that kappa for Currier again: 0.106. Once you correct for base rates, Currier's section-level labels have almost no predictive power for plant morphology. The boolean switch, derived from a single text ratio, predicts the visual classification of the plant drawings with near-perfect accuracy.
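To make the base-rate point concrete, here is a small sketch that computes raw agreement, Cohen's kappa, and a two-sided Fisher's exact p-value from a 2x2 table. The two tables are invented to show how a skewed table can combine high raw agreement with a low kappa; they are not the contingency tables behind the numbers above, and the function names are only illustrative.

```python
from math import comb

def agreement_and_kappa(table):
    """table = [[a, b], [c, d]]; rows are one labelling, columns the other."""
    (a, b), (c, d) = table
    n = a + b + c + d
    p_observed = (a + d) / n
    # Agreement expected by chance from the row and column marginals.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return p_observed, (p_observed - p_expected) / (1 - p_expected)

def fisher_exact_two_sided(table):
    """Sum the hypergeometric probabilities no larger than the observed one."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical tables, chosen only to illustrate the effect of base rates.
balanced = [[45, 5], [5, 45]]   # both labels equally common
skewed = [[85, 5], [8, 2]]      # one label dominates

agree_bal, kappa_bal = agreement_and_kappa(balanced)
agree_skew, kappa_skew = agreement_and_kappa(skewed)
p_bal = fisher_exact_two_sided(balanced)
```

In the skewed table the two labellings agree on 87% of items, yet kappa is only about 0.17, because two labellings that mostly say "A" would agree about 84% of the time by chance alone.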

Every Currier discordance is resolved by the switch

Of the 27 folios where my switch disagrees with Currier's label, 18 have morphometric data. In all 18 cases, the plant illustration sides with my switch, not with Currier. The probability of that under the null is 3.8 x 10^-6.
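The quoted probability can be checked with a one-sided sign test, assuming the null hypothesis is that each discordant folio is equally likely to side with either label. That is an assumption about the null, but it does reproduce the figure. The helper name is only illustrative.

```python
from math import comb

def binomial_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# 18 out of 18 discordant folios siding with the switch.
p_value = binomial_tail(18, 18)
```

With 18 of 18 the tail has a single term, 0.5^18 = 1/262144, which is about 3.8 x 10^-6.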

These are not marginal cases. Folios like f31r, f34v, f39r, f43r, f46r, and [link] are all traditionally classified as Currier A because they fall in the f1-f57 range. But their text is E-dominant, and their plant illustrations show B-type features (daisies, grass, root platforms, unidirectional leaves). Conversely, f87r, f90r, f93v, and [link] are traditionally Currier B, but their text is O-dominant and their plants show A-type features (stem-root lines, A-type flowers and calyxes).

The switch is not just a better statistical classifier. It is detecting the same organizational principle that the illustrator(s) was(were) following.
RE: Currier A/B split is not what we thought it was!

nablator > Yesterday, 03:34 PM

(Yesterday, 03:09 PM)Labyrinthinesecurity Wrote: Folios like f31r, f34v, f39r, f43r, f46r, and [link] are all traditionally classified as Currier A because they fall in the f1-f57 range. But their text is E-dominant, and their plant illustrations show B-type features (daisies, grass, root platforms, unidirectional leaves). Conversely, f87r, f90r, f93v, and [link] are traditionally Currier B, but their text is O-dominant and their plants show A-type features (stem-root lines, A-type flowers and calyxes).

Hi,

Where did you find this (wrong) information?

Folios 31, 34, 39, 43, 46, 48 were always Currier B, and f. 87, 90, 93, 96 Currier A.

RE: Currier A/B split is not what we thought it was!

Labyrinthinesecurity > Yesterday, 03:43 PM

My bad, thanks for the correction.
RE: Currier A/B split is not what we thought it was!

oshfdk > Yesterday, 04:13 PM

(Yesterday, 03:43 PM)Labyrinthinesecurity Wrote: My bad, thanks for the correction.

Does this mean "the switch" just aligns with Currier A/B?
RE: Currier A/B split is not what we thought it was!

Grove > Yesterday, 04:16 PM

I might not be understanding what you’re describing here. Are you saying that ‘Che’ is more common in language A folios and ‘cho’ more dominant in B?
If so, a quick glance at [link] (language A) shows only 2 ‘Che’ and, I think, 8 ‘Cho’. I must be misunderstanding something.
RE: Currier A/B split is not what we thought it was!

Labyrinthinesecurity > Yesterday, 04:54 PM

(Yesterday, 04:13 PM)oshfdk Wrote:
(Yesterday, 03:43 PM)Labyrinthinesecurity Wrote: My bad, thanks for the correction.

Does this mean "the switch" just aligns with Currier A/B?

It reveals a deeper structure. I will share the results tomorrow.
RE: Currier A/B split is not what we thought it was!

nablator > Yesterday, 05:22 PM

(Yesterday, 04:13 PM)oshfdk Wrote: Does this mean "the switch" just aligns with Currier A/B?

If the binary "switch" is #cho > #che or simply #ho > #he, it doesn't match the Currier language very well, there are many exceptions.
RE: Currier A/B split is not what we thought it was!

Labyrinthinesecurity > Yesterday, 05:41 PM

(Yesterday, 05:22 PM)nablator Wrote:
(Yesterday, 04:13 PM)oshfdk Wrote: Does this mean "the switch" just aligns with Currier A/B?

If the binary "switch" is #cho > #che or simply #ho > #he, it doesn't match the Currier language very well, there are many exceptions.

The exceptions are driven by the mixture of fixed and switchable templates.
RE: Currier A/B split is not what we thought it was!

oshfdk > Yesterday, 06:08 PM

I don't understand what exactly templates and mixtures mean here, just wanted to say that given the tendency of the manuscript to mix prefixes and suffixes apparently randomly (I think there was a nice post by @dashstofsk with examples of this, can't find it now), it's expected for some combinations or characters to randomly align to some features with no underlying connection at all, especially if the system has a lot of special rules and exceptions. I'm not saying this is what happens here, but "just so" is not necessarily a bad explanation in many cases.
RE: Currier A/B split is not what we thought it was!

rikforto > Yesterday, 07:12 PM

(Yesterday, 03:43 PM)Labyrinthinesecurity Wrote: My bad, thanks for the correction.

I am going to note that this is not an answer to the question, "Where did you get this (wrong) information from?", which I think some people might want the answer to before a lot of energy is sunk into evaluating this analysis.