Currier A/B split is not what we thought it was!

RE: Currier A/B split is not what we thought it was! - Labyrinthinesecurity - 05-05-2026

(05-05-2026, 04:14 PM)Grove Wrote: Isn’t that cho mapped to A opposite of what you said here:

all traditionally classified as Currier A because they fall in the f1-f57 range. But their text is E-dominant, and their plant illustrations show B-type features (daisies, grass, root platforms, unidirectional leaves). Conversely, f87r, f90r, f93v, and [link hidden] are traditionally Currier B, but their text is O-dominant and their plants show A-type features (stem-root lines, A-type flowers and calyxes).

Ahh, I see, I misunderstood; I must have misread that. So che dominant is your B and che is A.

Che dominant means B. (Che isn't A.)

The mnemonic: A goes with cho (the default/dominant form), B goes with che (the alternative).

RE: Currier A/B split is not what we thought it was! - dashstofsk - 05-05-2026

(05-05-2026, 03:03 PM)Labyrinthinesecurity Wrote: 87v and 90v (v1+v2)

Compared to other Herbal A1 pages they seem to be a bit low in words that contain cho and high in words that contain s.

Generally, the text in the Herbal A1 pages in quires 15 and 17 seems to be different from the Herbal A1 pages in earlier quires. Perhaps they have more in common with Pharma A1 pages. These two quires might have been written at a different moment from the others, and this is consistent with the hypothesis that the manuscript was written in stages 'by section' and not in one go.

RE: Currier A/B split is not what we thought it was! - nablator - 05-05-2026

(05-05-2026, 03:59 PM)Labyrinthinesecurity Wrote: the cho counter sums both cho and sho. Likewise for the che counter: it sums both che and she.

f2r: 17 cho+sho, 6 che+she → r_cho = 0.739 (73.9% cho) → Language A
f2v: 21 cho+sho, 4 che+she → r_cho = 0.840 (84.0% cho) → Language A

In the RF1b-e.txt file (from your link) I have:
f2r: 18 cho+sho, 6 che+she
f2v: 24 cho+sho, 7 che+she

[BTW RenéZ might have updated a few lines without changing the version number, you don't have the latest.]

These counts are produced by the switch.py script.

    c = classify_word_cho(w)
    if c == 'cho':
        cho_c += 1
    elif c == 'che':
        che_c += 1

...

    def classify_word_cho(word):
        ...
        if has_cho and not has_che:
            return 'cho'
        elif has_che and not has_cho:
            return 'che'
        return None

For example, in f2v I see 7 che+she.

But cheopchor and chotchey are classified as None, so it should be 5 not 4 che+she. I must be missing something...

>>> classify_word_cho('cheor')
'che'
>>> classify_word_cho('sshey')
'che'
>>> classify_word_cho('chees')
'che'
>>> classify_word_cho('cheaiin')
'che'
>>> classify_word_cho('cheol')
'che'

Something is wrong with parse_ivtff. '<->' should be processed as a word separator and what's inside {} should not be skipped. 'chees' is not counted because 'chokeos<->chees' gets converted to a single word 'chokeoschees', which is classified as None.
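
The effect can be reproduced with a stand-in for the classifier. The body of classify_word_cho is elided in the post, so the version below is a hypothetical reconstruction: it assumes has_cho means "contains cho or sho" and has_che means "contains che or she", which agrees with the outputs shown above but may differ from the real switch.py.

```python
def classify_word_cho(word):
    # Hypothetical reconstruction: the original body is elided in the post.
    has_cho = 'cho' in word or 'sho' in word
    has_che = 'che' in word or 'she' in word
    if has_cho and not has_che:
        return 'cho'
    elif has_che and not has_cho:
        return 'che'
    return None

# Split at the illustration break, each half is classifiable.
print(classify_word_cho('chokeos'))       # cho
print(classify_word_cho('chees'))         # che
# Merged into one word, it contains both patterns and is dropped.
print(classify_word_cho('chokeoschees'))  # None
```

Under these assumptions the merge removes one count from each side (one cho and one che), not just from the che counter.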

Why not simply count occurrences of [cs]ho and [cs]he and avoid all the complications?
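
A minimal sketch of that simpler count, using Python's re module. The function name count_cho_che and the sample string are made up for illustration. Note that this counts occurrences rather than words, so a word containing both patterns contributes to both counters instead of being discarded.

```python
import re

def count_cho_che(text):
    # Count every occurrence of cho/sho and che/she in the raw text.
    n_cho = len(re.findall(r'[cs]ho', text))
    n_che = len(re.findall(r'[cs]he', text))
    return n_cho, n_che

# Illustrative input built from words mentioned in this thread.
print(count_cho_che('chokeos<->chees.cheopchor.chotchey.sshey'))  # (3, 4)
```

Because no word splitting is involved, the result does not depend on how '<->' is parsed.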

RE: Currier A/B split is not what we thought it was! - Labyrinthinesecurity - 05-05-2026

(05-05-2026, 07:08 PM)nablator Wrote: But cheopchor and chotchey are classified as None, so it should be 5 not 4 che+she. I must be missing something...

Run the script as is, using the transliteration file as input: any error?

RE: Currier A/B split is not what we thought it was! - nablator - 05-05-2026

(05-05-2026, 08:01 PM)Labyrinthinesecurity Wrote: run the script as is using the transliteration file as input: any error?

I did. No error, but you should fix parse_ivtff. <-> and curly brackets are not comments.

RE: Currier A/B split is not what we thought it was! - Labyrinthinesecurity - 05-05-2026

(05-05-2026, 08:01 PM)Labyrinthinesecurity Wrote: run the script as is using the transliteration file as input: any error?

I'm not familiar with the intricacies of the STA standard, will have to check the parsing and the contents of f2 when I have access to a computer.

RE: Currier A/B split is not what we thought it was! - nablator - 05-05-2026

(05-05-2026, 08:39 PM)Labyrinthinesecurity Wrote: I'm not familiar with the intricacies of the STA standard, will have to check the parsing and the contents of f2 when I have access to a computer.

It's IVTFF.
"<->" are illustrations that separate words, they should be processed like ".", not ignored if you want to extract words.
"{...}" for example "{cko}" are unusual connected glyphs (other than the most usual 'benched' combinations starting with c/s and ending with h): they should not be removed.
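
The two rules above can be sketched as a small word extractor. This is not the parse_ivtff from switch.py (which is not shown in the thread); extract_words is a hypothetical name, the sample string is made up, and the sketch assumes that any other <...> tag can simply be dropped and that only '.' separates words.

```python
import re

def extract_words(text):
    """Split the text part of one IVTFF line into words (sketch)."""
    # An illustration break separates words, so treat it like '.'.
    text = text.replace('<->', '.')
    # Drop any remaining <...> tags (assumed here to be markers or comments).
    text = re.sub(r'<[^>]*>', '', text)
    # Keep the glyphs inside {...}; remove only the braces.
    text = text.replace('{', '').replace('}', '')
    return [w for w in text.split('.') if w]

print(extract_words('chokeos<->chees.o{cko}dy'))
# ['chokeos', 'chees', 'ockody']
```

The '<->' replacement has to happen before the generic tag removal, otherwise the break is deleted and the two neighbouring words are merged again.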

RE: Currier A/B split is not what we thought it was! - Labyrinthinesecurity - 05-05-2026

nablator Wrote: It's IVTFF.
"<->" are illustrations that separate words, they should be processed like ".", not ignored if you want to extract words.
"{...}" for example "{cko}" are unusual connected glyphs (other than the most usual 'benched' combinations starting with c/s and ending with h): they should not be removed.

Eyeballing the IVTFF, the discrepancy you observe is due to chokeos<->chees being processed as a single word by the parser. It shouldn't be much of a problem (the cho count also decreases, not just the che count, and this combination is not that common), but I will upgrade the parser and rerun the tests in case there are unforeseen side effects. I appreciate your help, thanks!

RE: Currier A/B split is not what we thought it was! - Labyrinthinesecurity - 06-05-2026

I have fixed the <-> parsing problem and rerun the tests: in the Herbal section, which is the most critical for extracting the switch feature, the impact is not statistically significant.

Before:
Herbal folios: 125
Classifiable (>= 5 words): 124
sigma=1 (cho-dom): 92
sigma=0 (che-dom): 32
Mean r_cho (sigma=1): 0.752 +/- 0.134
Mean r_cho (sigma=0): 0.170 +/- 0.095

After:
Herbal folios: 125
Classifiable (>= 5 words): 124
sigma=1 (cho-dom): 92
sigma=0 (che-dom): 32
Mean r_cho (sigma=1): 0.752 +/- 0.134
Mean r_cho (sigma=0): 0.169 +/- 0.087

In the linked file, look at the last table, called "APPENDIX: HERBAL FOLIO DETAIL", for the per-folio computations in the Herbal section. Here are the first few lines (sig 1 means Currier A, sig 0 means Currier B):

==============================================================================
APPENDIX: HERBAL FOLIO DETAIL
==============================================================================

Folio  sig  n_cho  n_che  r_cho   Conf   e/ch    d/l  Words
-------------------------------------------------------------
f1v      1     27      6  0.818  1.000  0.306  0.478     85
f2r      1     18      6  0.750  1.000  0.372  0.559    100
f2v      1     22      5  0.815  1.000  0.321  0.556     56
f3r      1     30     17  0.638  1.000  0.443  0.364    111
f3v      1     22      6  0.786  1.000  0.340  0.419     83
f4r      1     16      3  0.842  1.000  0.258  0.500     63
f4v      1     27     11  0.711  1.000  0.439  0.677     78
f5r      1     14      7  0.667  1.000  0.545  0.750     53
f5v      1     17      4  0.810  1.000  0.269  0.556     43
...
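
For readers checking the table, the r_cho column is consistent with n_cho / (n_cho + n_che). A small sketch that recomputes it for the first rows follows; the 0.5 cut-off used to derive sig is an assumption, since the thread does not state the exact threshold.

```python
def r_cho(n_cho, n_che):
    # Share of cho-type counts among all classified counts.
    return n_cho / (n_cho + n_che)

# (folio, n_cho, n_che) copied from the first rows of the table.
rows = [('f1v', 27, 6), ('f2r', 18, 6), ('f2v', 22, 5), ('f3r', 30, 17)]
for folio, n_cho, n_che in rows:
    r = r_cho(n_cho, n_che)
    sig = 1 if r > 0.5 else 0  # assumed threshold, not stated in the thread
    print(f'{folio} sig={sig} r_cho={r:.3f}')
# f1v sig=1 r_cho=0.818
# f2r sig=1 r_cho=0.750
# f2v sig=1 r_cho=0.815
# f3r sig=1 r_cho=0.638
```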

Interestingly, the switchable templates now have more impact on the Currier classification: the switch now accounts for 9% of the variance (versus 8% before), and we have 4 more templates, 2 of which are new switchable templates.

chXckhy 0.909 0.237 11 59 0.672 S-switchable
shXcthy 0.900 0.182 10 22 0.718 S-switchable

RE: Currier A/B split is not what we thought it was! - Labyrinthinesecurity - 21-05-2026

Continuing my investigations:
1) r_cho predicts the Currier language with 96% accuracy across all folios with words: in the image below, on the left hand side, you see that the B folios (orange squares) are very well separated from the A folios (blue dots).
2) r_cho is not primitive: it is driven by r_oe, the ratio of o versus e. In the same image, on the right hand side, you see that the clouds are tilted in the same orientation when we use r_oe (unlike r_cho). But r_oe is less good at predicting the Currier language: the clouds "overlap" more.
3) r_chsh is a secondary variable which seems to distinguish a "subdialect" in Currier B: specifically in the recipe section, folios containing tailed stars (in red) are separated from folios containing notail stars (in green). The big green diamond is folio 111v, which is a mix of tail stars and notail ones. Of course the number of star folios is very small and imbalanced, so this must be taken with a grain of salt.

In my opinion, this supports the hypothesis that Voynich is generated using per-folio variables. There are at least two independent variables:
1) r_oe, the dominant variable driving the Currier distinction via r_cho
2) r_chsh, a secondary variable driving the star shapes in the recipe section.
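
A rough sketch of how such per-folio variables could be computed. The exact definitions are not given in the thread, so both formulas here are assumptions made by analogy with r_cho: r_oe = o / (o + e) over glyph counts, and r_chsh = ch / (ch + sh). The function names and the sample words are illustrative only.

```python
def ratio(a, b):
    # a / (a + b), or None when neither feature occurs.
    total = a + b
    return a / total if total else None

def folio_variables(words):
    # Assumed definitions, by analogy with r_cho (not confirmed in the thread).
    text = '.'.join(words)
    return {
        'r_oe': ratio(text.count('o'), text.count('e')),
        'r_chsh': ratio(text.count('ch'), text.count('sh')),
    }

print(folio_variables(['chokeos', 'chees', 'sshey']))
```

Computing both values for every folio gives the two coordinates that a scatter plot like the one described above would need.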
