The 'Chinese' Theory: For and Against

+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Theories & Solutions (https://www.voynich.ninja/forum-58.html)
+--- Thread: The 'Chinese' Theory: For and Against (/thread-4746.html)

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 14-02-2026

(13-02-2026, 03:48 PM)MHTamdgidi_(Behrooz) Wrote: with all the twists and turns you are giving to make the back story plausible

Because the explanations that people have been giving for how a European language could be so well encrypted, how come the plants and cosmology are not recognizable, why the Zodiac diagrams have 30 labels each instead of 28/30/31 and why some are split into 15+15, and what are those nymphs doing in those tubs and showers between organs -- those are not "twists and turns", right?

"A small community of people who invented a secret language and script to communicate among themselves"

"A swindler who used an invented script and complicated and laborious method to produce random text, that to Europeans at the time would have looked utterly unlike language or code, with not a single reference to alchemy, in order to sell it to an Emperor who was obsessed with gold-making alchemy."

"A scholar who was afraid that the Inquisition, which he was sure would be created by the Church any time soon, would burn him at the stake for his heretical thoughts, and therefore cleverly disguised them in a book filled with bizarre attention-grabbing illustrations, in a baffling script that looks totally like an attempt by someone trying to hide heretical thoughts from the soon-to-come Inquisition."

And hundreds more...

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - JoJo_Jost - 14-02-2026

I would like to add the following: The positions are clear, and the topic has already been discussed almost exhaustively. From here on, we will probably find ourselves going round in circles more and more.

My own conclusion from the arguments put forward so far is: yes, there is a very interesting statistical signal.

However, in my professional life, I have seen statistical signals of a mathematical nature that were so striking that they seemed to explain a lot – and later turned out to be pure coincidence. Therefore, I am sceptical ‘for professional reasons’. And healthy scepticism is a good thing.

In this sense, it is just as difficult to claim that this is proof as it is to claim that it is not proof. But that is exactly my point:

if, on the basis of this thesis, further interesting patterns emerge when comparing these two texts, this will increasingly support Stolfi's thesis. And I will wait for that.

As I said, I genuinely hope Stolfi has found it. ;)

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 14-02-2026

(13-02-2026, 04:38 PM)oshfdk Wrote: If after hunting for many years you managed to identify the best statistical fit for SPS

I did not hunt for books that fitted the statistics of the SPS.

The closest thing to that was some statistical analysis I did, 20 years ago, on three Chinese texts (the Bible, some Voice of America transcripts, and a 17th century novel), one Vietnamese text (the Bible), and a couple of Tibetan ones (commentaries on commentaries of Buddhist texts), to compare their word and character distributions to those of Voynichese.

Which turned out to be very similar to those of Voynichese; whereas the statistics of European and Semitic texts were not similar at all.

Results which of course were totally ignored (not disputed, just ignored) by all the "Obviously European" voynichologists.

Besides those texts above, the only East Asian book for which I have any statistics at all is the Shennong Bencaojing. I don't remember when it was that I learned about it, and learned that it was supposed to be a collection of 365 recipes. That, and knowing that it was the most famous materia medica at the time, is what made me suspect that the SPS (whose original size had been widely speculated to have been 365 parags) could be a copy of the SBJ. But I did not pursue that at the time.

It was only in the past year that I managed to get a digital transcription of the SBJ, and decided to check that old guess.

Turns out that I was very lucky. The SPS parags that corresponded to the shortest and longest recipes of the SBJ managed to escape the loss of the central bifolio, and the mangling of parags by the Scribe on pages [link hidden] and f111r, and the translation/transcription turned out to be almost word-for-word. As a result, the min, max, and average parag sizes (counting words) of the two texts matched surprisingly well. And the histograms had the same general shape, although the SPS one was broader.
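The size comparison described above (min, max, average and histogram of parag sizes) could be reproduced along these lines. This is only a sketch: the whitespace tokenization, the bin width, and the toy data are assumptions made for illustration, and a Chinese text would presumably be counted per character rather than per space-separated word.

```python
from collections import Counter
from statistics import mean

def parag_sizes(parags):
    """Word count of each paragraph (words split on whitespace, an assumption)."""
    return [len(p.split()) for p in parags]

def summarize(sizes):
    """Return (min, max, mean) of the paragraph sizes."""
    return min(sizes), max(sizes), mean(sizes)

def histogram(sizes, bin_width=10):
    """Map the lower edge of each bin to the number of paragraphs in it."""
    counts = Counter((s // bin_width) * bin_width for s in sizes)
    return dict(sorted(counts.items()))

# Toy stand-in data, not taken from the SPS or the SBJ.
toy = ["a b c", "a b c d e f g h i j k l", "a b c d e"]
sizes = parag_sizes(toy)
print(summarize(sizes))
print(histogram(sizes, bin_width=5))
```

Running the same three functions on both texts, with the same bin width, would give the two histograms whose shapes are being compared.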

I posted those numbers and the histogram to this forum some months ago, but again that finding was ignored by most. Only a couple of flat-earthers bothered to reply, flatly declaring that "there is no resemblance at all".

Then I spent several hundred hours re-transcribing the whole SPS and carefully re-checking the parag boundaries, on the assumption that the discrepancy of the histograms was due to errors in the latter. It turns out that it was mostly wasted effort. Most of the previously marked parag breaks were correct, because they were totally obvious; and those that were not obvious -- including the two large blocks on [link hidden] and [link hidden] -- were impossible to split correctly.

But then it occurred to me to compare the longest recipe of the SBJ and the longest parag of the SPS, hoping to find correspondences between their words.

And again I was very lucky. That longest "recipe" turned out to be seven separate recipes, for seven different products obtained from the rooster; and each began with the same character 主 meaning "main" (six of them in the two character expression "main use"), which happens to be the most common character in the SBJ. And then I found that the occurrences of that character (or phrase) in that longest recipe matched closely the occurrences of daiin, which happened to be the most common word in the SPS, in the longest parag of the latter.

And then I could see that the SPS version had omitted some parts of the SBJ recipe. But fortunately the omitted parts were at the beginning and at the end of the recipe, and thus did not affect the spacings between occurrences of 主 and daiin.
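The spacing comparison can be sketched as follows: find where the key token occurs in each text, take the gaps between consecutive occurrences, and compare the two gap sequences. The function names and the toy token lists are invented for illustration; matching by substring (so that kydaiin counts as daiin) and pairing the gaps in order are assumptions, not a description of the method actually used.

```python
def positions(tokens, target):
    """Indices of tokens that contain `target` (so kydaiin counts as daiin)."""
    return [i for i, tok in enumerate(tokens) if target in tok]

def gaps(pos):
    """Distances between consecutive occurrences."""
    return [b - a for a, b in zip(pos, pos[1:])]

def gap_differences(gaps_a, gaps_b):
    """Absolute differences between two gap sequences, paired in order."""
    return [abs(a - b) for a, b in zip(gaps_a, gaps_b)]

# Toy token lists, invented for illustration only.
sbj_toy = list("主abc主de主fgh")  # one token per character
sps_toy = "daiin ol chey kydaiin ar or daiin shey qok al".split()

sbj_gaps = gaps(positions(sbj_toy, "主"))
sps_gaps = gaps(positions(sps_toy, "daiin"))
print(sbj_gaps, sps_gaps, gap_differences(sbj_gaps, sps_gaps))
```

To judge whether a small set of differences is meaningful, the same comparison would have to be run against other paragraphs and other frequent words as a baseline.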

Had any of those lucky breaks failed to occur, I probably would have given up on that idea.

To me, those matches are as good evidence as one could ask for. They do not prove my proposed scenario (maybe it was the Chinese Dictator who traveled to Europe, and the dictation happened there; or whatever), but there is no escaping the conclusion that the SPS is an almost word-for-word version of the SBJ.

But it seems that some people here will never accept the conclusion. They will deny it even if one day we find a contemporary Chinese painting showing the Author taking dictation of the SBJ, with his signature under a nice couplet in Voynichese script...

Still, the violence of the reaction this time tells me that the evidence is good. But save your flames, I will soon post more...

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - Typpi - 14-02-2026

(14-02-2026, 06:50 AM)Jorge_Stolfi Wrote: And then I found that the occurrences of that character (or phrase) in that longest recipe matched closely the occurrences of daiin, which happened to be the most common word in the SPS, in the longest parag of the latter.

I'll try again and maybe you can tell me what I'm missing.

But why would only one word/phrase correspond? Shouldn't the words around it also match and have a similar structure?

Couldn't I find a ton of books that have the same "matching" word positions in a ton of different languages? Especially if we're only matching one word based on spacing and ignoring the words around it?

You also said you didn't look through many Chinese books.. would this method work on other Chinese books or is this a unique case? Or have you not tested that yet?

RE: The 'Chinese' Theory: For and Against - kckluge - 14-02-2026

Quoting Stolfi's paper:

"...word spaces in the SPS are quite variable, and there is a substantial number of ambiguous gaps which may or may not be word spaces. Moreover there is evidence that the Scribe(s) who penned the text sometimes omitted a word space altogether, or inserted one where it did not belong.... The most common Chinese character in the SBJ is 主 (zhǔ) with the general meaning of “main”, “principal”. It occurs 398 times in our SBJ file, or about 1.11 times per parag on average... The most common Voynichese word in the SPS is daiin, which occurs 306 times, or 0.93 per parag. (In this count and in the rest of this section we are ignoring word spaces in the SPS, so that kydaiin and daiiny count as occurrences of daiin.) When considering the hypothesis that the SPS is some version of the SBJ, it is natural to investigate whether the daiin may be the Voynichese equivalent of the Chinese character 主."

Looking at Stolfi's transcription ([link hidden]), 'daiin' does indeed occur 306 times as a 5-gram ignoring spaces. Taking spaces into account, that breaks down as follows:

Ignoring uncertain spaces:
108 as word by itself (7th most common word)
192 as word suffix
4 as word prefix
2 as "...d aiin..."

Treating uncertain spaces as spaces:
129 as word by itself (9th most common word)
167 as word suffix
2 as word prefix
8 as "...d aiin..."

Qualitatively Stolfi's observations about scribal error are perfectly reasonable. Quantitatively I'm concerned that we're getting into special pleading territory here, especially because Stolfi's numeric comparison with the count for 主 implies that essentially all those prefix/suffix cases are errors to make it work. I also worry that this implies scribal error rates large enough to make any kind of argument involving word counts or lengths questionable at best -- best case scenario here is that only 129/306 = 42% of the instances of the word 'daiin' are properly delimited by spaces (including uncertain ones).
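The arithmetic behind the 42% figure can be checked with a few lines. The counts are the ones listed above; the dictionary keys and the function name are just labels chosen for this sketch.

```python
TOTAL = 306  # occurrences of 'daiin' as a 5-gram, ignoring spaces

BREAKDOWNS = {
    "ignoring uncertain spaces": {
        "standalone": 108, "suffix": 192, "prefix": 4, "split d aiin": 2,
    },
    "treating uncertain spaces as spaces": {
        "standalone": 129, "suffix": 167, "prefix": 2, "split d aiin": 8,
    },
}

def standalone_share(counts):
    """Fraction of occurrences where 'daiin' is a word by itself."""
    return counts["standalone"] / sum(counts.values())

for label, counts in BREAKDOWNS.items():
    # Each breakdown should account for all 306 occurrences.
    assert sum(counts.values()) == TOTAL
    print(f"{label}: {standalone_share(counts):.0%} standalone")
```

Both breakdowns add up to 306, and the standalone shares come out at about 35% and 42% respectively.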

But there is another, bigger problem here -- why, in crowning it as "(t)he most common Voynichese word in the SPS," is 'daiin' treated as privileged? The actual most common word in Stolfi's transcription of the SPS is 'chedy' (175 occurrences) if uncertain spaces are ignored, and it's the second most common word if uncertain spaces are treated as spaces (205 occurrences). If you ignore spaces, 'chedy' occurs 545 times -- well over the 306 times for 'daiin.' So why isn't 'chedy' "(t)he most common Voynichese word in the SPS"? "Because the spacing of five of the 306 instances of 'daiin' fit fairly well with the spacing of five of the seven instances of 主 in one paragraph out of 365 in the SBJ" is not a good answer to that question.

I will, however, grant that the paragraph length histogram comparison is an at least mildly interesting result. I just think Stolfi is way out over his skis with the 'daiin' = 主 claim.

RE: The 'Chinese' Theory: For and Against - kckluge - 14-02-2026

(14-02-2026, 05:04 AM)Jorge_Stolfi Wrote: What I have noted many times is that the per-word entropy (about 10 bits) is compatible with many natural languages, including East Asian ones, under any spelling system or encoding where each word is (almost) always spelled/encoded in the same way. In fact, IIRC, it was rene who first made this observation.

Two points:

1) I suspect that is a consequence of (and mathematically equivalent to) the ranked word frequency distribution being Zipfian, given that entropy is a measure of how not-flat the distribution is -- while that's a property any theory of the nature of the text has to account for, it is a necessary but not sufficient property for the text to be words in a natural language, and

2) as I observed earlier, your argument regarding 'daiin' requires assuming that roughly (at best) only 42% of instances of 'daiin' are written as 'daiin' (as opposed to as a prefix or suffix of another "word"), which does not qualify as an "encoding where each word is (almost) always spelled/encoded in the same way."
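The link between per-word entropy and the rank-frequency curve raised in point 1 can be explored numerically. The sketch below computes the entropy of an observed token list and of an idealized Zipf distribution; the vocabulary size and exponent passed to `zipf_entropy` are free parameters that would have to be chosen to match a given text, and nothing here is fitted to the VMS.

```python
import math
from collections import Counter

def word_entropy(tokens):
    """Shannon entropy (bits per word) of the observed word frequencies."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def zipf_entropy(vocab_size, s=1.0):
    """Entropy (bits) of an ideal Zipf law: p(rank r) proportional to 1 / r**s."""
    weights = [1 / r ** s for r in range(1, vocab_size + 1)]
    z = sum(weights)
    return -sum(w / z * math.log2(w / z) for w in weights)

# Hypothetical vocabulary sizes, for illustration only.
for v in (1000, 5000, 10000):
    print(v, round(zipf_entropy(v), 2))
```

Comparing `word_entropy` of a transcription with `zipf_entropy` for the same number of word types would show how much of the observed value is already implied by a Zipf-shaped distribution.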

RE: The 'Chinese' Theory: For and Against - dashstofsk - 14-02-2026

(14-02-2026, 05:04 AM)Jorge_Stolfi Wrote: So now, to prove that claim SPS=SBJ, we would have to show that some consequences predicted by this hypothesis are unlikely to be the result of random chance, as they would have to be under the SPS≠SBJ hypothesis.

To demonstrate some confidence in the hypothesis that SBJ is quire 20 you would need to show that it shares many of the oddities, anomalies and statistical irregularities. Doing it just for one paragraph isn't enough.

Here are some suggestions for you.

The text in quire 20 is composed of two language clusters (see [link hidden]). If you could do an analysis of the SBJ and show that its pages also show two distinct clusters then that would be something worthy to report.

You could do with 'zhu' in the SBJ something similar to what I have done with daiin in quire 20 ( see my output below ), to randomly place 'zhu' to get a measure of its variability within the SBJ. And then to see if there is some match of this statistic between the two texts. My simulations, after doing 1000 of them, show that the measure of variability of daiin in quire 20 is 2.94 standard deviations away from what would be expected if all the words daiin were placed randomly.

[link hidden]
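A random-placement simulation of this general kind can be sketched as below. The variability measure used here (variance of the per-unit counts of the target word) is an assumption, since the exact statistic behind the 2.94 figure is not stated; the function names and the toy data are likewise invented for illustration.

```python
import random
from statistics import mean, pstdev, pvariance

def count_variability(units, target):
    """Variance of the per-unit counts of `target` (an assumed measure)."""
    return pvariance([u.count(target) for u in units])

def shuffle_z_score(units, target, trials=1000, seed=0):
    """Standard deviations between the observed variability and random placement."""
    rng = random.Random(seed)
    observed = count_variability(units, target)
    flat = [w for u in units for w in u]
    sizes = [len(u) for u in units]
    sims = []
    for _ in range(trials):
        rng.shuffle(flat)  # place all words at random
        it = iter(flat)
        shuffled = [[next(it) for _ in range(n)] for n in sizes]  # same unit sizes
        sims.append(count_variability(shuffled, target))
    sd = pstdev(sims)
    return (observed - mean(sims)) / sd if sd else 0.0

# Toy data: ten units (pages or parags) of ten words, all 'daiin' in one unit.
toy_units = [["daiin"] * 5 + ["x"] * 5] + [["x"] * 10 for _ in range(9)]
print(shuffle_z_score(toy_units, "daiin"))
```

Applying the same function to the SBJ with 主 as the target, and to quire 20 with daiin, would give two z-scores that can be compared directly.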

In quire 20, 20% of the text is made up of 16 words. I showed the list previously [link hidden]. Is it so also with the SBJ?
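The "N words make up X% of the text" figure is straightforward to compute for any tokenized text. A minimal sketch, assuming the text is already split into tokens (words for the VMS, presumably single characters for the SBJ); the toy data is invented.

```python
from collections import Counter

def top_k_coverage(tokens, k):
    """Fraction of all tokens accounted for by the k most frequent word types."""
    top = Counter(tokens).most_common(k)
    return sum(c for _, c in top) / len(tokens)

def words_needed(tokens, share):
    """Smallest number of most-frequent word types covering `share` of the tokens."""
    total = len(tokens)
    covered = 0
    for n, (_, c) in enumerate(Counter(tokens).most_common(), start=1):
        covered += c
        if covered >= share * total:
            return n
    return len(set(tokens))

# Toy data, for illustration only.
toy = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
print(top_k_coverage(toy, 2), words_needed(toy, 0.6))
```

For the comparison asked about here, one would call `words_needed(tokens, 0.20)` on each text and see whether the answers are of similar size.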

daiin is not the most common word in quire 20. The frequencies of ar and chedy are both higher. Is there a match for these within the SBJ?

These are the sort of things you need to do.

RE: The 'Chinese' Theory: For and Against - oshfdk - 14-02-2026

(14-02-2026, 06:50 AM)Jorge_Stolfi Wrote: I did not hunt for books that fitted the statistics of the SPS.

I think you did. Not with a calculator, but still you were looking for a text that would fit the profile of SPS - a large number (preferably ~300) of relatively short paragraphs. You didn't consider huge 1000 page long tomes, you didn't consider short treatises either, because these wouldn't match quite obviously. So you ran a pre filter that removed most books with wildly different statistics from consideration.

Probably you were still dissatisfied with the match and you had to design custom paragraph breaks in SPS to make the data fit better. Otherwise I think you would just be happy with the obvious paragraph breaks.

RE: The 'Chinese' Theory: For and Against - Stefan Wirtz_2 - 14-02-2026

Funny how this is circling around the same (brittle) "fact" again and again:

"it must be written by an Italian!" -- that claim is supported by nothing else but "style" impressions. Which could easily be imitated.

Just as a reminder: "Europe" looked this way around VMS time (a bit later here; Constantinople/Byzantium existed until 1453, of course, and some Balkan affairs happened before):

[Image: f2bb23cf1279afccf2e2e07c8db94042.jpg]

- Europe did not end 2km northeast of Venice.

- Central Asia did not begin somewhere east of the Caspian Sea or the Gobi desert; refer to any T-O map of the 15th century for the border lines between Europe and Asia. (Linguistic Central) Asia reached deep into geographical Europe.

- all "countries" here were and are filled with multiple different languages or dialects; a lot of them with clearly Asian origins.

- as an example, the western Mongolians changed their writing system at least twice, coming to a top-down writing direction

- we know where the VMS ended and may know some intermediate positions, but we don't know where it started its journey.

So who could ever confirm or exclude any place of origin or source language for VMS even only for "Europe"...?

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 14-02-2026

(12-02-2026, 04:48 PM)oshfdk Wrote: 主 appears at the very beginning of each recipe, after the name and the category, with only 2-3 exceptions. If 主 corresponds to daiin, where is the same pattern in the Voynich MS? Is there a guaranteed daiin near the beginning of each paragraph? This is visually the most obvious pattern in SBJ.

Good question. Actually, the honest question would be: "Considering that 83% of SBJ recipes have a 主治 ('main use') within the first 12 characters or so, which word or pattern occurs in a similar percentage of the SPS parags within the first 60 EVA characters or so?" The factor of 5 comes from the fact that, on average, each Chinese character in the SBJ seems to correspond to about 5 EVA characters in the SPS.

If we exclude the problematic blocks of lines of the SPS where the parag breaks are not obvious, we are left with 242 "probably true" parags.

Of these, 51 (21% of the 242) have a daiin (as a single word or part thereof) within the first 60 EVA characters.

Of the remainder, 20 (8% of the 242) have a dair or a laiin (or both) in that range. (Those are the two other words that match the positions of 主 in the Rooster recipe).

Of the remainder, 63 (26% of the 242) have at least one of kaiin, taiin, lair, kair, tair in that range (those are guesses for other possible pronunciations/spellings/translations of 主治, based on those two alternatives above).

In all, these options cover 55% of the 242 "good" parags. Note that these counts do not include split occurrences (like the ~d.aiin in the Rooster translation), nor variants like doiin or daiis, nor possible abbreviations like dam.

Are all those aiin/air words above variant pronunciations/spellings of "main use"? What about the other 28% needed to match the SBJ's 83%? I don't know yet. Will keep looking into it.
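The tiered count above (daiin first, then dair/laiin among the remainder, then the other aiin/air guesses) can be sketched as follows. The way the 60-character window is measured (letters only, a word counting if it starts inside the window) is an assumption, and the toy parags are invented; only the 51, 20, 63 and 242 figures come from the counts reported above.

```python
def has_early_match(words, patterns, window=60):
    """True if a word starting within the first `window` letters contains a pattern."""
    used = 0
    for w in words:
        if used >= window:
            break
        if any(p in w for p in patterns):
            return True
        used += len(w)
    return False

def tiered_counts(parags, tiers, window=60):
    """For each tier in order, count the parags it matches among those still unmatched."""
    remaining = list(parags)
    counts = []
    for patterns in tiers:
        hits = [p for p in remaining if has_early_match(p, patterns, window)]
        counts.append(len(hits))
        remaining = [p for p in remaining if not has_early_match(p, patterns, window)]
    return counts

TIERS = [["daiin"], ["dair", "laiin"], ["kaiin", "taiin", "lair", "kair", "tair"]]

# Toy parags (lists of words), for illustration only.
toy = [["daiin", "x"], ["x", "dair"], ["x", "kaiin"], ["x", "y"], ["x" * 60, "daiin"]]
print(tiered_counts(toy, TIERS))

# Arithmetic check on the reported counts: 51 + 20 + 63 out of 242.
reported = [51, 20, 63]
print(f"{sum(reported) / 242:.0%} covered")
```

With the reported counts, the three tiers together cover 134 of 242 parags, or about 55%, which leaves the 28-point gap to the SBJ's 83%.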

Maybe in the VMS the "main uses" keyword is often omitted, because it is superfluous. Note that the shortest SBJ recipe omits that keyword altogether. After all, the first 主治 serves only to separate the list of diseases from the "taste and flavor" field; if that field is present in an SBJ recipe but is omitted in the SPS version (as it was in the Rooster case), the translation of the following 主治 could have been omitted too.

All the best, --stolfi