The 'Chinese' Theory: For and Against

+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Theories & Solutions (https://www.voynich.ninja/forum-58.html)
+--- Thread: The 'Chinese' Theory: For and Against (/thread-4746.html)

RE: The 'Chinese' Theory: For and Against - JoJo_Jost - 11-02-2026

@stolfi Thank you for the additional information, I understand that too. You know I think very highly of you, so no offense. But of course I also have to defend my theory that it is Bavarian in a certain way ;)

I know you understand that.
-----
I don't know Chinese, so I have to look it up myself. So the following is just a question; I want to understand it.

I listened to the different "bais" in a German-Chinese dictionary. They sound the same, but have different meanings, and if they have different meanings, they are also different characters.

Then I asked Gemini, ChatGPT, and Perplexity. They all say the same!

For example, “bai” is actually pronounced differently, but it is then also written differently, both in the transcription, with different “macrons” above the “a,” and also as Chinese characters. Then it also has a different meaning, like this:

* bāi (1st tone) – 掰
English: to break apart / to snap off (with the hands)
Pitch: constant high pitch, like singing a high note.

* bái (2nd tone) – 白
English: white; also “in vain / for nothing” (in some phrases)
Pitch: sounds like a question: "What?" or "Huh?"

* bǎi (3rd tone) – 百
English: hundred
Pitch: dip your voice low and bring it back up slightly.

* bài (4th tone) – 拜 / 败
拜 (bài) – English: to pay respect; to worship; to visit formally
败 (bài) – English: to lose / to be defeated / failure
Pitch: sharp and short, like a firm command: "Stop!"
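For anyone who wants to compare these syllables mechanically, here is a minimal Python sketch that turns tone-marked pinyin into the numbered notation (bai1 to bai4). The function name `pinyin_to_numbered` is made up for this illustration, and it assumes at most one tone mark per syllable; it uses only the standard `unicodedata` module.

```python
import unicodedata

# Combining marks used for the four Mandarin tones in pinyin.
TONE_MARKS = {
    "\u0304": "1",  # macron  (high level)
    "\u0301": "2",  # acute   (rising)
    "\u030c": "3",  # caron   (dipping)
    "\u0300": "4",  # grave   (falling)
}

def pinyin_to_numbered(syllable):
    """Hypothetical helper: 'bǎi' -> 'bai3'. Assumes one tone mark per syllable."""
    tone = ""
    kept = []
    # NFD splits each accented vowel into a base letter plus combining marks.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # Recompose what is left (e.g. the diaeresis of 'ü') and append the tone digit.
    return unicodedata.normalize("NFC", "".join(kept)) + tone

print([pinyin_to_numbered(s) for s in ["bāi", "bái", "bǎi", "bài"]])
# -> ['bai1', 'bai2', 'bai3', 'bai4']
```

Written this way, the four syllables differ only in the final digit, and 拜 and 败 both come out as bai4.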

And now the question: Is it still pronounced differently despite being spelled the same? I can't judge that.

However, if I only go by the Chinese characters, there are still duplicates that clearly do not correspond to Voynich.

[Attachment: als Zeichen CHina Stolfi.png]

RE: The 'Chinese' Theory: For and Against - kckluge - 11-02-2026

A couple quick comments:

* There was an (inadvertent?) deletion of text at the start of Section 3.1 ("...instances of words, abstract strings o Independently of the precise...")

* Section 3.1 ("...eventually we confirmed that the distribution of the lengths of words from the Voynichese lexicon was indeed characteristic of a group of East Asian languages...") and Section 3.4 ("The average size of an SBJ parag is 36.97 Chinese characters, while the average size of an SPS parag is 33.95. These numbers are surprisingly close, and imply that, if the SPS is a version of the SBJ, then each Voynichese word in the former corresponds roughly to one Chinese character in the latter.") treat spaces as syntactically significant word separators, yet Section 4 ignores them ("(In this count and in the rest of this section we are ignoring word spaces in the SPS, so that kydaiin and daiiny count as occurrences of daiin.)"). I am troubled by the paper's assuming that spaces matter to claim word length and paragraph word count distributions match in Section 3 but then assuming they can be ignored to get matches to "daiin" in Section 4.

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 11-02-2026

(11-02-2026, 02:01 AM)MHTamdgidi_(Behrooz) Wrote: You acknowledge in your article that a quire of four pages is “visibly missing,” but that is not operationalized in your calculations and conclusions, except for some statistical approximation likelihoods that can hardly count as evidence. The same goes for the likelihoods of (re)ordering of the paragraphs.

That four pages are missing from the SPS is a fact, as certain as anything can be.

We can also take as a fact that the Scribe merged a bunch of parags into one giant parag on pages [link not shown] and f111r. That is demonstrated not only by the way-out-of-ordinary size of the resulting parags but mainly by the multiple stars in the margin, next to them -- and the almost perfect matching of stars and parags in the rest of the section.

We should also assume that, besides those two blocks, the scribe also may have suppressed a few parag breaks here and there, or inserted a few spurious ones.

And we also know that the Scribe often ran words together, or split a word in two.

Therefore, if one wants to find out whether the SPS could be a copy/translation/transcription of some other book by comparing the numbers and sizes of paragraphs, one cannot demand exact matches. As in any experimental science, we know that the data has errors and limitations, and we must use statistics to get around them. Any conclusions will be based on approximate numbers and approximate matches. That is the way scientists work, and that does not invalidate their conclusions.

Quote: 360 and 365 are not unique, odd numbers. Either astrologically or astronomically, you can plausibly find literature across cultures in which something was said about them with a paragraph devoted to them, because of Zodiac month counts or days of the year.

Sure. The Zodiac section is incomplete too, but we all believe that it originally had 360 labels (and about that number of stars and nymphs) because we extrapolate from the pages we have, and because we believe that the labels are degrees of angle along the Ecliptic. And before I knew about the SBJ I guessed that the parags of the SPS too were related to days of the year or degrees of arc, and therefore the total count (before the four pages were lost) must have been 360 or 365.

The SBJ is generally said to have 365 recipes, but no one seems to know the reason for that number. Originally it may have had some connection to the days of the year, but that connection seems to have been lost more than 1000 years ago, even before the book itself was lost. AFAIK the present reconstructions make absolutely no mention of astrology, astronomy, or calendar. (And no mention either of mythology, philosophy, ritual, etc.) The book is traditionally organized as three sections of 120 + 120 + 125 recipes, each divided into subsections for remedies derived from Minerals, Herbs, Trees, Animals, Fruits, and Cereals. The three major sections are "high grade" remedies and tonics that are just good and can be taken regularly, "medium grade" that should be taken with discretion, and "low grade" that are toxic and should be taken only in small doses when really needed.

Quote:And if you search and compare enough, you may find statistical matches that may even be more coincidental. This is what coincidences do and 360/365 cannot serve as a smoking gun for a discovery

I never claimed that the similarity in the estimated number of entries was evidence of anything. It is just what led me to investigate the possibility that the SPS could be the SBJ. If the SPS had 100 recipes or 1000, I would not have bothered.

Quote:In the Voynich manuscript there is an enormous number of graphics/images. The Chinese text you are comparing the last section with is only text-based. I was asking, you expect us to assume somehow such a text would be able to explain and inspire the extent and amount of images we find in the Voynich manuscript in such graphic detail?

Not at all. I believe that each section of the VMS was transcribed/translated/etc from a separate book, and there is no connection between the SPS/SBJ and the other sections besides the language and script.

The source book for the Cosmo section may have had diagrams. The source for the Bio section may have had anatomical drawings. The source for Pharma may have had drawings of plant parts. But most of the illustrations, including all the nymphs, are clearly decoration that do not transmit any information. I suppose that the Author asked the Scribe to provide those illustrations because the intended European readers or buyers would expect them.

Quote:You are offering a “solution” for the section of the Voynich manuscript and then expect that it would explain not just other section texts, but the enormous amount of images found there?

I expect that the finding "SPS≈SBJ" will help "decipher" the rest of the manuscript, yes, but only because the language and script seem to be the same.

That will probably not help much with the Cosmo and Zodiac sections, because the SBJ must have very few terms in common with those sections. But it may help more with Bio, Herbal, and Pharma, since the topics should overlap a lot.

Quote:you have relied on a text no one can read which then serves you to compare with a language (Chinese) you don’t speak yourself to establish statistical patterns

I would have been delighted if I had found out that the source book for the SPS was the Divina Commedia and the language was Italian. But I cannot change the facts just because I don't like them...

Quote:You write, "To split imperfect blocks, we set a 'definitive' parag break after every short line, even if the next line cannot get a starlet assigned to it ... two decisions divide each imperfect block into 'tentative parags'."

You should check the page-by-page report ([link not shown]), look at the page images ([link not shown]), and decide whether you think I "cheated" or what.

Note that the existing transcriptions already divide that section into parags. Each transcriber had to make up his own criteria. I merely revised the division trying to make it as accurate as possible.

Quote:This just illustrates the tentativeness of how you have divided the paragraphs and then wish to draw definitive statistical conclusions from it for a definitive discovery claim.

Removing 4 pages from a 22-page booklet will not appreciably change the average paragraph size, or its standard deviation. (Unless those pages are very special, like having 3x as many parags, all very small. Indeed I am betting that the lost pages were not too different from those that survived.) It could change the min and max parag sizes, but fortunately that did not happen -- both the smallest and the largest SBS recipes were in the pages that survived.

Changing the positions of the parag breaks would have no effect on the avg parag size. It could increase or decrease the dev, but if the parag breaks are moved only a line or two the effect on the dev would be small.

Merging or splitting parags would change the avg and dev of the parag sizes. But if such joinings and splittings are few and roughly balanced, the effect on the avg should be small, and the dev would increase only a little.

Joining and splitting words will change the parag sizes, and thus change the avg and dev too; but, again, if they are roughly balanced, the avg size should hardly change, and the dev should have another modest increase.

And those disturbances should not radically change the shape of the histogram of the parag sizes, only broaden it and make the bars smaller overall (because of the lost pages) and a bit more irregular.

In conclusion, in spite of those disturbances, the avg SPS parag size should be substantially correct, the dev should be somewhat greater than it was originally, and the histogram should be somewhat more spread out but still have roughly the original shape.
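The argument can be checked on made-up numbers. The sketch below is only an illustration under stated assumptions: the parag sizes are SYNTHETIC (300 random values, not the real SPS or SBJ counts), and the helper names `drop_block`, `move_breaks`, and `merge_and_split` are invented here. The 4/22 fraction is the only figure taken from the discussion above.

```python
import random
import statistics

def drop_block(sizes, rng, fraction=4 / 22):
    """Remove one contiguous block of parags, mimicking 4 lost pages out of 22."""
    n_drop = round(len(sizes) * fraction)
    start = rng.randrange(len(sizes) - n_drop + 1)
    return sizes[:start] + sizes[start + n_drop:]

def move_breaks(sizes, rng, max_shift=5):
    """Shift each internal break by a few words; total and count stay the same."""
    out = list(sizes)
    for i in range(len(out) - 1):
        shift = rng.randint(-max_shift, max_shift)
        # Clamp so that both neighbours keep at least one word.
        shift = max(-(out[i] - 1), min(shift, out[i + 1] - 1))
        out[i] += shift
        out[i + 1] -= shift
    return out

def merge_and_split(sizes, rng, n_each=10):
    """Merge n_each neighbouring pairs, then split n_each parags (balanced)."""
    out = list(sizes)
    for _ in range(n_each):
        i = rng.randrange(len(out) - 1)
        out[i:i + 2] = [out[i] + out[i + 1]]
    for _ in range(n_each):
        i = rng.choice([k for k, s in enumerate(out) if s >= 2])
        cut = rng.randint(1, out[i] - 1)
        out[i:i + 1] = [cut, out[i] - cut]
    return out

def summary(sizes):
    return round(statistics.mean(sizes), 2), round(statistics.pstdev(sizes), 2)

rng = random.Random(0)
# SYNTHETIC sizes in words per parag; purely illustrative.
original = [rng.randint(10, 60) for _ in range(300)]
dropped = drop_block(original, rng)
moved = move_breaks(original, rng)
mixed = merge_and_split(original, rng)

for name, data in [("original", original), ("pages lost", dropped),
                   ("breaks moved", moved), ("merged/split", mixed)]:
    print(f"{name:13s} n={len(data):3d} (avg, dev)={summary(data)}")
```

Moving breaks, and balanced merging and splitting, keep both the total word count and the number of parags fixed, so the avg is unchanged by construction; only the dev can move. Losing a block of pages changes the avg only as much as that block differs from the rest.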

Therefore, I think that the histogram comparison alone should have made the "SPS≈SBJ" theory quite likely. But of course it depends on one's prior convictions. If one is already hard convinced that the language is Latin, or that the SPS is a list of daily horoscopes, the (min,max,avg,histogram) match will not be enough to change one's views.

BUT all that is moot now. Even if the (min,max,avg,histogram) did not match, the evidence on page 8 should be proof enough that at least that parag of SPS is an almost word-for-word translation or transcription of the "Rooster" recipe.

Quote:What I was saying was, why not ask a Chinese scribe to copy it IN CHINESE, and bring THAT home?

Again: because the Author had learned the spoken language, but had no hope of learning the written one.

To be minimally literate in Chinese (enough to, say, read a newspaper) one must have memorized at least a couple thousand characters. Chinese students reach that level by the end of high school. When I was learning Japanese, after a year I had managed to learn only a hundred or so (and forgot most of them already..)

And I am not making up that scenario. The Jesuits who reached China and Vietnam after 1500 invented phonetic scripts for those languages -- even though those languages already had their own scripts, and some Jesuits managed to become literate in them. For the same reason: to make the language more accessible to the folks back home in Europe.

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 11-02-2026

(11-02-2026, 01:59 PM)oshfdk Wrote: Yes, but there are also dain and daim and other similar strings, which didn't correspond to anything, so they were not included in the graph

Evidently those other strings are not translations/transcriptions of 主 or 主治.

"You are claiming that 'gato' in that manuscript means 'cat', but then later on you decide that 'gata' also means 'cat', but there is also a 'gota' and 'pato' and other similar strings that you did not include in the graph"

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - kckluge - 11-02-2026

(11-02-2026, 05:48 PM)kckluge Wrote: I am troubled by the paper's assuming that spaces matter to claim word length and paragraph word count distributions match in Section 3 but then assuming they can be ignored to get matches to "daiin" in Section 4.

To clarify, I understand that the claim in Section 4 isn't that spaces don't matter, but rather that (1) spaces can be uncertain ("...word spaces in the SPS are quite variable, and there is a substantial number of ambiguous gaps..."), and (2) scribes make errors ("Moreover there is evidence that the Scribe(s) who penned the text sometimes omitted a word space altogether, or inserted one where it did not belong."). Those are both perfectly reasonable points. The problem is that you have to assume the right wrong spaces to get some of the "daiin" matches. Unless I'm mistaken this is the transcribed text of the paragraph in question used in Section 4, with the 8 matches to "daiin" including "the similar words dair and laiin as possible quasi-homophones" highlighted:

f105v 32 38 <%>poar.keeo,daiin.qoair.ar.acphhey.qoeedeody.qokaiin.qotedair.apo,rair,apy-
lsheody.tair.oteey.oteeo.ol.otaiin.okeey.qokaiin.or.aiir.al.dar-
sheeo.daiin.chsd.qokeeey.dair.okaiin.otaiin.chedaiin.olkal.lkl,dain-
doee.okcheeo.ltaiin.otcheedy.chor.aiin.odaiin.chedy.otaiin.al.kaishd-
laiin.sheod.okeeody.qoaiin.ytaiin.otair.chdal,dy,daim.chdaiin.ockhhy-
yshey.ckhy.sheo.qoeeo.lkaiin.chs,okol.tchdy.sheeey.okaiin.ar.aildy-
cheody.oaiir.ain.okshey<$>

So 4 of the 8 require ignoring spaces. Do any of those "missing" spaces look uncertain? What kind of scribal error rate do we have to assume if "missing" spaces involving "daiin" are typical, and how reasonable is that? Also note that this error-prone scribe doesn't miss any of the spaces *after* "daiin/dair/laiin", only before (although given that "aiin" is [almost?] always word-final I'm willing to consider that being an easy rule for the scribe to remember).
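To make the count reproducible, here is a small Python sketch that tallies the matches in the paragraph quoted above. It assumes that ".", "," and the line-end "-" are all treated as word breaks (so a comma, which marks an uncertain space, counts as a space), and it drops the locator and the <%> / <$> markers. The function name `classify` is just for this illustration.

```python
import re

# Paragraph text as quoted above, without the locator and <%> / <$> markers.
PARAG = """
poar.keeo,daiin.qoair.ar.acphhey.qoeedeody.qokaiin.qotedair.apo,rair,apy-
lsheody.tair.oteey.oteeo.ol.otaiin.okeey.qokaiin.or.aiir.al.dar-
sheeo.daiin.chsd.qokeeey.dair.okaiin.otaiin.chedaiin.olkal.lkl,dain-
doee.okcheeo.ltaiin.otcheedy.chor.aiin.odaiin.chedy.otaiin.al.kaishd-
laiin.sheod.okeeody.qoaiin.ytaiin.otair.chdal,dy,daim.chdaiin.ockhhy-
yshey.ckhy.sheo.qoeeo.lkaiin.chs,okol.tchdy.sheeey.okaiin.ar.aildy-
cheody.oaiir.ain.okshey
"""

TARGETS = ("daiin", "dair", "laiin")

def classify(text, targets=TARGETS):
    """Return (whole-word matches, tokens that merely contain a target)."""
    # Assumption: '.', ',', '-' and whitespace all separate words.
    tokens = [t for t in re.split(r"[.,\-\s]+", text) if t]
    whole = [t for t in tokens if t in targets]
    embedded = [t for t in tokens
                if t not in targets and any(x in t for x in targets)]
    return whole, embedded

whole, embedded = classify(PARAG)
print(whole)     # -> ['daiin', 'daiin', 'dair', 'laiin']
print(embedded)  # -> ['qotedair', 'chedaiin', 'odaiin', 'chdaiin']
```

In all four embedded cases the target is a suffix of the token, which is the "only before, never after" pattern noted above.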

Re: "The most common Voynichese word in the SPS is daiin, which occurs 306 times, or 0.93 per parag. (In this count and in the rest of this section we are ignoring word spaces in the SPS, so that kydaiin and daiiny count as occurrences of daiin.) When considering the hypothesis that the SPS is some version of the SBJ, it is natural to investigate whether the daiin may be the Voynichese equivalent of the Chinese character 主." If you *don't* ignore spaces, "daiin" is the 7th most common word at 121 occurrences(*), while "aiin" is the most common at 193 (only ~7 of which follow a word ending in 'd'). If "daiin" is "zhu", what is "aiin" -- if this is some kind of phonetic encoding they presumably should be related -- and does that make sense in context?

(*) using the transcription Stolfi was initially referencing in this thread:
"# The Starrred Parags section from the VMS.
# Extracted from the Landini/Zandbergen/Stolfi interlinear version 1.6e6,
# with some cleanup.
# Takeshi Takahashi's transcription (';H') only.
# Pages [link not shown] to [link not shown] line 30.")

I want to be clear that this is intended as constructive criticism, not piling on.

RE: The 'Chinese' Theory: For and Against - kckluge - 11-02-2026

(11-02-2026, 01:51 PM)Jorge_Stolfi Wrote:
(11-02-2026, 11:08 AM)oshfdk Wrote: the decision to include laiin and dair, but at the same time ignore dain and daim and kaiin and taiin seems completely arbitrary.

It was not arbitrary. After marking the five daiin in the SPS entry and the seven 主 in SBJ entry, I saw that there were a laiin and a dair at positions that closely corresponded to the two unmatched 主.

All the best, --stolfi

Is there a spit-take emoji?

RE: The 'Chinese' Theory: For and Against - oshfdk - 11-02-2026

(11-02-2026, 07:21 PM)Jorge_Stolfi Wrote: Evidently those other strings are not translations/transcriptions of 主 or 主治.

"You are claiming that 'gato' in that manuscript means 'cat', but then later on you decide that 'gata' also means 'cat', but there is also a 'gota' and 'pato' and other similar strings that you did not include in the graph"

Again, this is an explanation, this is not evidence. You can explain most, if not all, strange things with the Chinese theory, but for the Chinese theory to have any footing there should be direct evidence. "what if 主 matches with daiin, but it doesn't match well, but if we also match it with only one of two dairs and one laiin, then it does" - this is just a very convoluted way of saying "主 does NOT match with daiin".

RE: The 'Chinese' Theory: For and Against - ReneZ - 12-02-2026

As is often the case, also here, many of the counter-arguments presented are false/invalid.
In this case, this is mostly because people oversimplify things, add their own assumptions, and then (try) to counter the result of this. Also, many people do not know enough about Chinese. Finally, Stolfi is not saying that the language is Mandarin Chinese, so (as I understand it), while 主 is zhu3 in modern Mandarin, it is not proposed that daiin is a rendition of zhu3.

As a result, while it would be nice, it is not to be assumed that a syllable like zhi3 in Voynichese should be similar to daiin (at all!).

Other problems are that, while xi with different tones are different words, also xi1 (high tone) can have several completely different meanings.

Even if (big if!!) daiin means 'use', this is still at the very beginning of a theory, and the next step is not obvious.

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 12-02-2026

(11-02-2026, 04:25 PM)JoJo_Jost Wrote: But of course I also have to defend my theory that it is Bavarian in a certain way

Well, I do not yet know what is the language, or how daiin was pronounced by the Author. All I can say is that the SPS seems to correspond to the SBJ almost word-for-word. And I was already convinced that Voynichese was a monosyllabic language.

So, I suppose it still could be a word-for-word transcription of the SBJ into Bavarian...

Quote:I listened to the different "bais" in a German-Chinese dictionary. They sound the same

I cannot hear the Chinese tones myself, unless I hear two syllables that differ only in tone said next to each other. And even then I cannot tell which tones -- only that they are different.

In Mandarin there are four tones. The words bái bài bāi and bǎi are distinct and unrelated, like English 'bat', 'bet', 'bit', 'bot', 'but'. Or Portuguese "avô" (grandpa) and "avó" (grandma). Or German "schon" (already) and schön (beautiful)...

Quote:Is it still pronounced differently despite being spelled the same?

You mean 拜 (bài) and 败 (bài)? Like any language, Mandarin has homophones -- words that are pronounced the same way even though they have different meanings and different origins. I gather that in at least some cases the two meanings are written with different Chinese characters. Like English "two", "to", and "too"...

Quote:However, if I only go by the Chinese characters, there are still duplicates that clearly do not correspond to Voynich.

I already discussed the cases of bái 白 and zǐ 女, but the general answer is that, if the same Chinese character is used in two different compounds, the readings of the compound in Voynichese may use different sounds.

As for jī 鸡, in particular, the SPS apparently omitted the last two sentences of the SBJ recipe: jī bái dù, féi zhū 鸡白蠹，肥猪 "[Also to treat] chicken lice and to fatten pigs", which is about a veterinary use and is claimed to be a late addition to the SBJ; and shēng píng zé 生平泽 "It is found in marshlands", a statement that is "required" by the fixed formula of SBJ recipes but makes no sense in this case.

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 12-02-2026

(11-02-2026, 07:28 PM)kckluge Wrote: So 4 of the 8 require ignoring spaces. Do any of those "missing" spaces look uncertain? What kind of scribal error rate do we have to assume if "missing" spaces involving "daiin" are typical, and how reasonable is that?

The fact is that the relative positions of five of the seven occurrences of 主 in the longest SBJ recipe closely match the relative positions of the five occurrences of daiin in the longest SPS parag, either as a "word" or as a suffix. That holds whether you measure the positions in words or in EVA characters. And if we include dair and laiin, the match extends to all seven 主, with one dair left unmatched.
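For readers who want to redo this kind of comparison themselves, here is a minimal Python sketch of what "relative position" can mean. It is not the procedure used in the article: the two sequences are TOY data invented for illustration (not the SPS parag and not the SBJ recipe), and the helper names `relative_positions` and `position_gaps` are hypothetical.

```python
def relative_positions(tokens, targets):
    """Midpoint of each matching token, as a fraction of the sequence length."""
    n = len(tokens)
    return [(i + 0.5) / n
            for i, tok in enumerate(tokens)
            if any(t in tok for t in targets)]

def position_gaps(a, b):
    """Absolute differences between paired positions (same count required)."""
    if len(a) != len(b):
        raise ValueError("need the same number of occurrences on both sides")
    return [abs(x - y) for x, y in zip(a, b)]

# TOY sequences, invented for illustration only.
voynich_toy = ["okeey", "daiin", "chedy", "chedaiin"]   # words
chinese_toy = list("甲主乙丙主")                          # one token per character

v = relative_positions(voynich_toy, ("daiin",))   # matches word or suffix
c = relative_positions(chinese_toy, ("主",))
print(v)
print(c)
print(position_gaps(v, c))
```

Small gaps by themselves say little; to judge whether a match is better than chance one would also have to run the same measurement on other candidate strings and on shuffled text.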

Why are three of the daiin suffixes, instead of isolated words? Why does 主 sometimes match dair and sometimes laiin, but not every dair matches a 主? Those are good questions, but they cannot justify dismissing those 5 or 7 close positional matches as "mere coincidences".

That fact implies that one Voynichese word, as transcribed, is not simply one Chinese character -- even though the number of words (73) and the number of characters (92 - 12 = 80) are quite similar, once one excludes the 4 characters of the "taste and warmth" field and the 8 characters of the last two Chinese sentences 鸡白蠹，肥猪 (which seem to be about veterinary use, and are claimed to be a late addition) and 生平泽 ("It is found in marshlands", part of the fixed SBJ formula but which makes no sense in this entry).

I have one possible explanation for dair: the "n" in the "in" ending is sometimes written with a rounded bottom and slightly above the baseline. Then the "in" can be mistaken for an "r" with a lowered plume:

[Attachment: 2026-02-11-234905-daiin-to-dair.jpg]

All the best, --stolfi