The 'Chinese' Theory: For and Against

+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Theories & Solutions (https://www.voynich.ninja/forum-58.html)
+--- Thread: The 'Chinese' Theory: For and Against (/thread-4746.html)

RE: The 'Chinese' Theory: For and Against - tavie - 17-02-2026

(17-02-2026, 11:19 PM)Jorge_Stolfi Wrote: So, please, people, isn't it time to admit that the "therefore" above was wrong?

You know what I'm going to say, Jorge: first, you would have to establish that there was a "therefore" for "everyone".

You've made similar points earlier in this thread and elsewhere ([link]), so I'll emphasize to any newcomers that there is no forum dogma that the language has to be European. The forum gave the Turkish solution a lot of goodwill at the start because some people already thought Turkish was a plausible candidate language due to its grammar. And some people here don't even think there is a language. And for those here who do think it is a European language, I doubt that many are inflexibly wedded to that view.

Ultimately, we've yet to see any theory - whether European or 'Chinese' or neither - explain all the manuscript's strange statistics.

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 18-02-2026

(17-02-2026, 11:44 PM)tavie Wrote: first, you would have to establish that there was a "therefore" for "everyone".

OK, sorry for the undue generalization. I see that there are other "acceptable" theories, like gibberish or constructed language; and "European" has been taken in the broad sense, to include Hebrew, Arabic, and Turkish. And now and then there are newcomers who suggest other languages, usually ones that they happen to know something about.

And, IIUC, you believe that it is possible and fruitful to do "agnostic" research, without assuming anything about the language or origin. I personally don't think that such research can lead anywhere, but anyway it is an alternative.

But most people who reject the Chinese Theory flat out, and now refuse to accept SPS≈SBJ, seem to do so because of that argument: "it looks European, therefore it must be European". Like the post I was replying to.

All the best, --stolfi

(17-02-2026, 11:44 PM)tavie Wrote: Ultimately, we've yet to see any theory - whether European or 'Chinese' or neither - explain all the manuscript's strange statistics.

Well, name some of those "strange statistics", and I will see if I can explain them within the Chinese Theory.

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - kckluge - 18-02-2026

(17-02-2026, 05:43 PM)Jorge_Stolfi Wrote:
(17-02-2026, 01:27 PM)rikforto Wrote:
(17-02-2026, 12:17 AM)Jorge_Stolfi Wrote: The problem is that there is no such thing as "the SBJ sensu stricto".

This is indeed a problem! I don't see how we can definitively identify the text if we don't even know what the text is, and I don't see an explanation of how you are overcoming this problem.

Forget Chinese and the SPS for a moment, and suppose instead that someone finds an "Alchemist Herbal" in Italian (AHI) which has about the same number of plants as the Herbal-A section of the VMS (VHA). He wonders whether VHA may be a version of AHI. He notes the following things:
The most common word in AHI, A, occurs seven times in the longest entry of the AHI.

Excluding two of them, the other five occurrences of A are separated by 32, 13, 25, and 30 words.

The most common word in VHA, V, occurs five times in the longest entry of the VHA.

I'd probably find it a little problematic if I looked at their own transcription of VHA and couldn't for the life of me figure out how they managed to conclude that V was "(t)he most common word in VHA". I already pointed out this problem with your repeated assertion that 'daiin' is the most common word in the SPS in [link]:

Quote: Looking at Stolfi's transcription ([link]), 'daiin' does indeed occur 306 times as a 5-gram ignoring spaces. Taking spaces into account, that breaks down as follows:
[...]
Treating uncertain spaces as spaces:
129 as word by itself (9th most common word)
167 as word suffix
2 as word prefix
8 as "...d aiin..."
[...]
But there is another, bigger problem here -- why, in crowning it as "(t)he most common Voynichese word in the SPS," is 'daiin' treated as privileged? The actual most common word in Stolfi's transcription of the SPS is 'chedy' (175 occurrences) if uncertain spaces are ignored, and it's the second most common word if uncertain spaces are treated as spaces (205 occurrences). If you ignore spaces, 'chedy' occurs 545 times -- well over the 306 times for 'daiin.' So why isn't 'chedy' "(t)he most common Voynichese word in the SPS"? "Because the spacing of five of the 306 instances of 'daiin' fit fairly well with the spacing of five of the seven instances of 主 in one paragraph out of 365 in the SBJ" is not a good answer to that question.

To elaborate: here are the 10 most common words in your transcription of the SPS treating uncertain spaces as spaces:

Rank  Word     Count
 1    aiin     255
 2    chedy    205
 3    ar       204
 4    al       184
 5    qokeey   155
 6    ol       145
 7    y        133
 8    qokeedy  132
 9    daiin    129
10    l        126

So, as stated before, 'daiin' as a word is actually only the 9th-most common word. OK, well, maybe we didn't catch all the spaces. Let's suppose we also count cases where word <x> occurs as a suffix of some other word. As noted above, that adds another 167 instances of 'daiin.' The problem is, it also brings the total for 'chedy' up to 537 and for 'aiin' (excluding instances of 'daiin') up to a whopping 955. That's without bothering to run the numbers for words ranked 3-8. So if we include cases where <x> is a suffix in the count, how exactly is 'daiin' more common than 'aiin' or 'chedy'? Counting just those cases, each of them is already more common than all 306 occurrences of 'daiin' as a 5-gram ignoring spaces altogether.

If you addressed that issue in an earlier reply, I missed it.
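
The breakdown above (word by itself, suffix, prefix, or split across a space as in "...d aiin...") can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes a transcription convention in which '.' marks a certain word break and ',' an uncertain one, the function names are hypothetical, and the sample line is made up rather than taken from the manuscript.

```python
from collections import Counter

def tokenize(text, uncertain_as_space=True):
    """Split a transcription line into words.

    Assumed convention: '.' is a certain word break, ',' an uncertain one.
    """
    text = text.replace(",", "." if uncertain_as_space else "")
    return [w for w in text.split(".") if w]

def count_ngram(text, target):
    """Count target as a character string, ignoring all word breaks."""
    s = text.replace(".", "").replace(",", "")
    return sum(s.startswith(target, i) for i in range(len(s)))

def breakdown(text, target, uncertain_as_space=True):
    """Classify the space-blind occurrences of target by how breaks cut them.

    Simplification: assumes target occurs at most once inside any one word.
    """
    tally = Counter()
    for w in tokenize(text, uncertain_as_space):
        if w == target:
            tally["word"] += 1
        elif w.endswith(target):
            tally["suffix"] += 1
        elif w.startswith(target):
            tally["prefix"] += 1
        elif target in w:
            tally["infix"] += 1
    # Whatever remains straddles a word break, e.g. "...d aiin...".
    tally["split"] = count_ngram(text, target) - sum(tally.values())
    return tally

# Made-up sample line, for illustration only.
sample = "daiin.chedy,odaiin.daiiny.qod,aiin.chedy"
print(breakdown(sample, "daiin"))
print(Counter(tokenize(sample)).most_common(3))
```

Running the same functions with uncertain_as_space=False shows how the ranking and the word/suffix split shift when uncertain spaces are ignored, which is the sensitivity the post is pointing at.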

RE: The 'Chinese' Theory: For and Against - Yavernoxia - 18-02-2026

(18-02-2026, 08:21 AM)kckluge Wrote: I'd probably find it a little problematic if I looked at their own transcription of VHA and couldn't for the life of me figure out how they managed to conclude that V was "(t)he most common word in VHA". I already pointed out this problem with your repeated assertion that 'daiin' is the most common word in the SPS in [link]:

Good point. I’m sure that if one randomly chooses another medieval manuscript whose text is composed of many lists and short paragraphs, one can find statistical properties similar to those proposed by Stolfi, especially if one allows manipulation of spacing, the distribution of suffixes and prefixes, and similar features. For example, the Speculum historiale section of the Speculum Maius by Vincent de Beauvais may be a good candidate.

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 18-02-2026

(18-02-2026, 08:21 AM)kckluge Wrote: I already pointed out this problem with your repeated assertion that 'daiin' is the most common word in the SPS.

Sorry for wasting your time with that confusion. Indeed, you are right, daiin is not the most common word in the SPS.

But the argument for "主≈daiin" is not "both are the most common words in each file". That would be a bogus argument. In the translations of the Pentateuch into Chinese and Vietnamese that I have at hand (definitely the same book, in two monosyllabic languages), the most common words are 的 = "of" and người = "man, person, people", respectively.

And likewise the argument that f105v.32-38 is a version of the "rooster" recipe is not that they are both the longest parags in each file.

Those observations (right or wrong) are only what led me to check whether those parags and those words corresponded to each other.

The evidence that VMS f105v.32-38 is a version of the SBJ "rooster" recipe, and that daiin is the VMS equivalent of 主 (or 主治) is the close match of the distances between occurrences of those words in those two parags.

More precisely and generally, the claim is that, modulo occasional "typos", the word or suffix daiin in the Voynichese version of the SBJ (the SPS) generally occurs near the beginning of each recipe or sub-recipe to introduce the list of uses (indications, benefits, effects, etc.). In the Chinese version of the SBJ, that role is usually played by the characters 主治 = "main use(s)", but sometimes by 主 = "mainly" alone, or (in other recipes) by 治 = "use(s)", or sometimes by other constructions (as in the "Eggs" sub-recipe of Rooster).
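
The "distances between occurrences" comparison can be made concrete with a small sketch. It is not the procedure actually used for the claimed match: the function names are hypothetical, a "distance" is assumed to mean the difference between token positions, the Chinese tokens are a short fragment of the rooster entry (as quoted later in this thread) with punctuation removed, and the Voynichese tokens are made up purely for illustration.

```python
def occurrence_gaps(tokens, is_marker):
    """Distances (in tokens) between successive marker occurrences."""
    positions = [i for i, tok in enumerate(tokens) if is_marker(tok)]
    return [b - a for a, b in zip(positions, positions[1:])]

def gap_mismatch(gaps_a, gaps_b):
    """Per-gap absolute differences between two equally long gap lists."""
    if len(gaps_a) != len(gaps_b):
        raise ValueError("gap lists must have the same length")
    return [abs(a - b) for a, b in zip(gaps_a, gaps_b)]

# Fragment of the rooster entry, one character per token, punctuation removed.
chinese = list("頭主殺鬼肪主治耳聾腸主治遺溺")

# Made-up Voynichese-like tokens, NOT transcribed from the manuscript.
voynich = "qo daiin ar al ol odaiin y l ar ol chedy daiin".split()

gaps_zh = occurrence_gaps(chinese, lambda c: c == "主")
gaps_vm = occurrence_gaps(voynich, lambda w: w.endswith("daiin"))

print(gaps_zh, gaps_vm, gap_mismatch(gaps_zh, gaps_vm))
```

Treating daiin as "word or suffix" corresponds to the endswith test here; a stricter test (whole word only) would change the positions and therefore the gaps.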

In retrospect, I should have given only the evidence, without telling how I got to it. (And, blush, that is the very advice that I gave to @Rafal for his Rohonc paper, a few weeks ago. See, @Rafal, why you should not do that?)

So, it does not matter now whether daiin is really the most common word in the SPS. If I counted wrong, that was another lucky break for me (besides the correspondence between Chinese and Voynichese being nearly 1:1 character-for-token, the longest recipe surviving the loss of the central bifolio and the caca the Scribe made on f108v-f111r, and the omitted parts of the recipe being at the ends rather than in the middle, and that SPS parag still being the longest one in spite of those omissions.)

But how did I get that count wrong? Trying to reconstruct how I got to that claim, it seems that my reasoning was the other way around: first I guessed that f105v.32-38 was the "rooster" recipe (both the longest parags in the two files), then I saw that 主 occurred 7 times in that recipe, and so I looked for what were the most common strings in that paragraph that could correspond to it. And daiin occurs 5 times in that parag, whereas chedy occurs only 2. (I looked for strings rather than words because I did not trust spaces, not even by counting commas as spaces.)

It turns out that aiin occurs 18 times in that parag, and kaiin occurs 5 times too. And in fact the "eggs" sub-recipe, which has no 主 in the SBJ, seems to have an lkaiin at the place where the uses of "eggs" should start. Hmm...

But, well, again, apologies for wasting your time.

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 18-02-2026

(18-02-2026, 09:30 AM)Yavernoxia Wrote: I’m sure that if one randomly chooses another medieval manuscript whose text is composed of many lists and short paragraphs, one can find statistical properties similar to those proposed by Stolfi, especially if one allows manipulation of spacing, the distribution of suffixes and prefixes, and similar features. For example, the Speculum historiale section of the Speculum Maius by Vincent de Beauvais may be a good candidate.

Well, okay, let's check. Do you have a link to that book? (I suppose that there is no digital transcription of it, right?)

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - Yavernoxia - 18-02-2026

(18-02-2026, 01:38 PM)Jorge_Stolfi Wrote: Well, okay, let's check. Do you have a link to that book? (I suppose that there is no digital transcription of it, right?)

I'm not sure if a digital transcription exists, but you can find the [link]. It was just an example, though; there are countless medieval books with short entries about recipes, the zodiac, astrology, etc.

RE: The 'Chinese' Theory: For and Against - nablator - 18-02-2026

(18-02-2026, 01:44 PM)Yavernoxia Wrote: It was just an example though, there are countless medieval books with short entries about recipes, the zodiac, astrology, etc...

No short paragraph in there. If you have links for these countless books that have short entries about recipes or astrology, please post them. I know some with 12 entries for the zodiac and houses, 7 for planets. We need 300-400.

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 18-02-2026

(18-02-2026, 01:44 PM)Yavernoxia Wrote: I'm not sure if a digital transcription exists, but you can find the [link].

Thanks, but its entries are way too big:

[Attached image: SpeculumHistoria_pg100_1.jpg]

That "recipe" for pasta with mortadella and salami seems to be shorter than average, yet it has over 200 words. A candidate text should have ~35 words per "recipe". Unless you claim that the language/encoding of the VMS is not 1:1 token:token...

Quote:It was just an example though, there are countless medieval books with short entries about recipes, the zodiac, astrology, etc...

Indeed there must be thousands of old manuscripts out there that match the Starred Parags section as well as my SBJ file.

Namely, other copies of the Shennong Bencaojing. :D

The SBJ was a very popular book throughout East Asia, and was translated and transcribed all over the region, even into languages that were not monosyllabic or tonal, like Japanese and Korean. Think of Dioscorides or Hippocrates in Europe. While India had its own ancient medical tradition, I bet that the SBJ was not unknown there. So probably any library east of Tibet has some old version of the SBJ on its "rare books" shelf.

And of course there are millions of modern editions of it, anywhere there are fans of Chinese medicine.

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - rikforto - 18-02-2026

I had some time to actually dig into this this morning, and I'd like to offer both a further reason I doubt the match and also a point of entry for anyone (including Jorge) who wants to prove me wrong.

The obvious next place to look for cribs is the following and preceding paragraphs of the VMS. Unfortunately, the paragraph that Jorge analyzed is the last on a verso page, so identifying the "next" paragraph requires grappling with the construction of the VMS. We get very lucky in the preceding paragraph, however, and that is what I'd like to analyze.

Please note that I am using the traditional characters rather than the simplified ones that Jorge uses because I am unfamiliar with the PRC reforms; for the two paragraphs in question, it is exactly a one-to-one mapping, and a spot check elsewhere makes me think they are identical texts aside from the character set. Given how the PRC reforms work, this should have no bearing on any analysis we might like to do. This text can be found at [link], with tools for translation. I have colored the character Jorge claims to have identified in red, and a run of characters which appear near 主 in the "rooster" entry and then four in a row in the "donkey-hide gelatin" entry in blue. There are other matches, but these are especially easy to work with.

(donkey hide gelatin) 阿膠　味甘平。主治心腹內崩，勞極灑灑如瘧狀，腰腹痛，四肢酸疼，女子下血，安胎。久服輕身益氣。一名傳致膠。
(rooster) 丹雄雞　味甘微溫。主治女子崩中漏下，赤白沃，補虛，溫中，止血通神，殺毒，闢不祥。頭，主殺鬼。肪，主治耳聾。腸，主治遺溺。肶胵裹黃皮，主治洩利。屎白，主治消渴，傷寒寒熱。翮羽，主下血閉。雞子，除熱火瘡，癇痓，可作虎魄神物。雞白蠹，肥豬。生平澤。

I have run into several problems extending this match along this otherwise promising line. First, as has been covered elsewhere, "治" does not have an obvious match in the "rooster" paragraph, making it hard to say where exactly 女子 starts. My best attempt, allowing for the possibility that "味甘微溫" was not in the source, is as follows:
poar keeo ? daiin qoair ar aCPHey
丹雄雞主治女子

The problem is less acute with 下血:
daiin oCKHhy yShey
主下血

The implication is that the following correspond:
女子下血 =
ar aCPHey oCKHhy yShey

Something interesting does happen under this hypothesis. I've reproduced the [link] above the putative "rooster" one in the VMS with the naive guess for where they should align in green.
tdol tor oaldar aiir okokeedy karody qoeedy sho qopchedy daiin opairam
dchedy cheey qokor otaiin otair otair okeedy taiin aiin s aiin sy
ychtaiir aiichy dol aiin otaiin aiidy okchd otar daiin
Shifted back a few, you do get a y-y across a line break. It seems like a good entry point, but the following matches poorly in my estimation:
ar aCPHey oCKHhy yShey
s aiin sy ychtaiir
Anyone is welcome to play with this some more---I can imagine the case for rolling up a bit more at the end of the second line into single characters. aiins aiinsy is a decent candidate for a two-character read as well, and you could probably cut it up differently. Ultimately, none of those look like especially good matches.

Before I abandon the match entirely, there is an internal match in the hypothesized donkey-hide gelatin paragraph:
(donkey-hide gelatin) 阿膠　味甘平。主治心腹內崩，勞極灑灑如瘧狀，腰腹痛，四肢酸疼，女子下血，安胎。久服輕身益氣。一名傳致膠。
By good fortune, the second and last should match. But the line ends with daiin, our candidate for 主, or a homophone. In the [link] [link], they've avoided falling remotely together where attested.

All in all, I cannot extend the analysis on these lines.

Some stipulations and responses to anticipated objections:

This analysis presents compelling problems for holding the match affirmed, but is not enough to say the matter is closed. Just because I have failed to make the match---or see it---does not mean it's not there, of course.
Specifically, I am uncertain enough in how to line up the texts that other people may see something I've missed. If the problems here don't convince you that the texts don't seem to match, I'd be curious where you see a potential way to strengthen the match and how to make it more certain, so that we can continue checking these sorts of connections.
It's possible that the SPS is based on a different textual tradition than the version of the SBJ we are looking at and the "donkey" paragraph is missing in the VMS version. That is still a problem for holding that the text has been identified because it is explicitly based on the argument that the SPS is a hitherto unidentified and unanalyzed text. New data may yield a new analysis, but it is incumbent on proponents of the match to provide it.
It is possible that the SPS as we've received it has this paragraph from our version of the SBJ and it is not directly above. People are invited to find a more suitable candidate.
It is possible that translation, corruption, or some other process has made identification of the paragraph above the claimed "rooster" paragraph impossible. That argument remains unfalsifiable, and implicitly concedes we cannot recognize the match, even if the person making that argument doesn't see it that way.

With all that in mind, I would like to reiterate: a spurious correlation between those 7 instances of 主 in the rooster paragraph and the loose family of words daiin/dain/laiin in the SPS still adequately explains the difficulties extending the match. There is no need to appeal to undiscovered versions of the SBJ, translators, dictators, L2 scribes, retracers, or any other person or force frustrating our efforts. The null hypothesis, that the SPS does not match the SBJ, is perfectly capable of explaining why we are having such difficulty mapping Voynichese to the SBJ, even if it is not uniquely capable.
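
One way to put a rough number on "spurious correlation" is a Monte Carlo check of how often randomly placed markers reproduce a given gap pattern within some tolerance. The sketch below is an assumption-laden illustration, not an analysis of the actual texts: the function name, the uniform-random placement model, the paragraph length of 150 tokens, and the tolerance are all hypothetical, and the target gaps 32, 13, 25, 30 are simply the figures from the hypothetical AHI example quoted earlier in the thread.

```python
import random

def chance_match_rate(n_tokens, target_gaps, tol, trials=10000, seed=0):
    """Fraction of random placements whose gaps all fall within tol of target.

    Null model (an assumption): len(target_gaps) + 1 markers are placed
    uniformly at random, without repetition, among n_tokens positions.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    k = len(target_gaps) + 1
    hits = 0
    for _ in range(trials):
        pos = sorted(rng.sample(range(n_tokens), k))
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        if all(abs(g - t) <= tol for g, t in zip(gaps, target_gaps)):
            hits += 1
    return hits / trials

# Hypothetical inputs: a 150-token paragraph and a tolerance of 3 tokens.
rate = chance_match_rate(150, [32, 13, 25, 30], tol=3)
print(f"chance match rate under the null: {rate:.4f}")
```

This deliberately ignores the extra freedom of choosing which occurrences to compare (for example, five out of seven) and of adjusting word breaks; each such freedom would raise the chance rate, so a fair test would have to model them too.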