The Voynich Ninja
The 'Chinese' Theory: For and Against - Printable Version



RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 10-02-2026

(10-02-2026, 09:57 PM)MHTamdgidi_(Behrooz) Wrote: It seems “guess” work has helped the charts quite a bit. “In the transcription files, the boundaries between those hypothetical parags had to be guessed based on dubious clues like the spacing of the lines; but they are almost certain to be wrong, because the actual breaks must be in the middle of lines. Besides those two large blocks, there are a few smaller suspicious text blocks on other pages, each being probably two or three parags merged into one. Misplacing the parag breaks would not affect the average parag size, but would affect its deviation and the shape of the histogram.” (p. 6). All this sounds like arbitrarily introducing line breaks or modifications,

That is a good point. After people claimed that the histograms were not similar enough for their liking, I spent easily 100 hours carefully re-transcribing the entire SPS and re-checking the parag breaks. But that effort was done without looking at the SBJ, and the doubtful breaks involved only a fraction of the SPS (basically two out of the 22.6 pages). In the end neither the breaks nor the transcription changed much relative to Rene's transcription, and the match did not improve very much.

You can check [link], page by page, and the [link] here. In this directory you will find [link] (~20 MB each) to help in understanding those notes. (I should have used JPG instead of PNG to make them smaller. And [link] is low res, sorry. Will fix these problems eventually...) You will see that, apart from those two pages (f108v and f111r), most of the page breaks are obvious and not "cheatable".

All that work done, I did try to redo the histograms after excluding all the blocks of lines where the parag divisions were not 100% obvious, hence subjective and probably wrong. That left 243 parags instead of the full 330. It improved the histogram match a bit, but not enough to justify the hassle of having to explain the filtering and arguing that it was not cherry-picking, so I left that attempt out of the paper. Here is the SBJ histogram in red, that 243 "good parags only" histogram in grayish green, and the raw 330 all-parags histogram in dark blue. Note that the vertical scale is different. The last royal-blue histogram is the "good parags only" again but measuring the size of a parag by the EVA character count instead of the word count.
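
The remark quoted at the top of this post, that misplacing a parag break leaves the average parag size unchanged but alters the deviation and the histogram, can be checked on a toy case. What follows is only a minimal Python sketch: the word counts are made up (they are not SPS or SBJ data) and `move_break` is a hypothetical helper, not code from the paper.

```python
from statistics import mean, pstdev

def move_break(sizes, i, k):
    """Shift the break between parag i and parag i+1 by k words.

    Positive k moves k words from parag i+1 into parag i.
    The total word count and the number of parags stay the same.
    """
    shifted = list(sizes)
    shifted[i] += k
    shifted[i + 1] -= k
    if shifted[i] <= 0 or shifted[i + 1] <= 0:
        raise ValueError("a break cannot be moved past a neighboring break")
    return shifted

# Made-up parag sizes in words; NOT real transcription data.
sizes = [30, 30, 30, 30]
guessed = move_break(sizes, 1, 12)  # one break misplaced by 12 words

print(mean(sizes), mean(guessed))      # the two means are equal
print(pstdev(sizes), pstdev(guessed))  # the deviations differ
```

Since a misplaced break only moves words between two neighboring parags, the sum and the count are preserved, so the mean cannot change; only the spread and the histogram bins do.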

Quote:All the enormous graphics in the VM are then just made up to make the Chinese text plausible, in oddly new writing alphabet?

Sorry, I don't understand the question.  Which "enormous graphics"?

Quote:Are you saying, the traveler could not find a scribe in the whole of China to simply copy the original in Chinese?

Learning spoken Chinese is not much harder than learning any other language. In fact it may be easier, because Chinese has no inflections for gender, number, person, mood, tense, etc, which are the bane of learners of languages like Latin, German, or Finnish.

Learning the written language is another matter. One must memorize ~4000 characters in order to be basically literate. Chinese (and Japanese) students learn a couple hundred per year, and reach that level only at the end of high school. (Japanese manga books used to display on the cover the number of years of schooling one needed in order to read them.) Some Jesuit missionaries managed to become literate, after years of intense effort, because they knew that they had to in order to pursue their mission of converting the ruling class to Catholicism. Even they found that, in order to make the language more accessible to their Western colleagues, they had to invent a phonetic script for it. That is why they invented pinyin and the modern Vietnamese script, even though those languages already had their own scripts.

Thus, copies of the books in Chinese characters would be quite useless for him after he got home, because he would be unable to read them.  With the phonetic script, he could get the sound of every word, and his knowledge of the spoken language would have been enough to get at least some of the sense out.

Quote:If he did not even read Chinese, how could he know the value of a purely textual manuscript?

If he had asked any doctor in the Chinese area of influence "what is the most important book of medicine you have", the answer would quite probably have been "here, this one, the Shennong Bencaojin". 

(And, by that time, the Chinese already were printing books in large numbers with full-page carved wood blocks. The technical name, IIUC, is "incunabula".  Such books were produced in Europe too for a hundred years or so, using copper plates, until Gutenberg invented moveable type. (Which apparently was already in use in Korea before that.))

Quote:If he did, or he could have someone translate it for him, why not write it in his own language?
 

Because no one, not even him, could have done that.  No one in "China" knew Latin or German or whatever.  He could have translated the most common terms, but very few of the 365 remedy names and the thousands of disease names.  He could only hope that he could somehow assemble a glossary of those before returning.

Quote:If somebody had found something of value in the original Chinese, why not just bring the original and share it in 1400s, so it could become known then centuries earlier?

But that is precisely what I believe he hoped to do!

Europeans reached Southeast Asia by sea a bit after 1500, after which Europe quickly got to know Chinese culture.  By 1700 the Jesuits already had many people who could read Chinese characters and many more who could speak the language. The first Chinese-Latin dictionary, using pinyin for the Chinese, was printed in Rome around 1580 -- by that same Jesuit who managed to become literate and who invented pinyin.

All the best, --stolfi


RE: The 'Chinese' Theory: For and Against - RadioFM - 10-02-2026

Wouldn't it be more productive to pick apart the study itself, rather than Stolfi's post hoc backstory? After all, he arrived at the East Asian / Chinese Origin Theory by studying the VMS corpus and its statistics. His backstory about a Retracer, an Author, etc. is just a putative tale, guesswork made after-the-fact, not the backbone of his work.

@Jorge_Stolfi
Have you made any attempts to match 治 (zhì) -- or whatever token is most common after 主 (zhû) -- with vords after daiin?
You mention spelling variations could be at play in the VMS, but is there a 2-vord digram family that could tentatively match 主治 in the Shennong Bencaojin? Doesn't the VMS display greater unpredictability in terms of the following word after daiin?

Also, since you are making the case of possibly phonetic rendition, and surely have given thought to this beforehand:
- Should dan, dain, daiiin be homophones/allophones of zhì to some extent?
- What do you make of triple reduplicated daiin on f89r2 (picture below)?

   


Although not a partisan of East Asian Origin Theory personally, thank you for putting in the work and publishing your findings! :)


RE: The 'Chinese' Theory: For and Against - oshfdk - 11-02-2026

(10-02-2026, 11:46 PM)RadioFM Wrote: Wouldn't it be more productive to pick apart the study itself, rather than Stolfi's post hoc backstory?

There would be a most common word in most texts. If we take two longest sentences from two texts of similar sizes, it's not strange that each would contain the respective most common word of either text. It would be a very promising find if the count of the most common word in the longest sentence of both texts would be the same non trivial (>2) number, but in this case this didn't happen even after reinterpreting other similar words as daiin. So, I'm not really sure what to pick apart here.

EDIT: Let's look from a different perspective. Suppose the corresponding paragraph of the Voynich MS was a mangled phonetic representation of that Chinese recipe. I guess this would be immediately obvious after matching other repeating characters in this recipe, I think there should have been a few, and the report won't be "daiin = use", but will be a small dictionary, a couple of other phrase matches and likely the gold medal in the VMS studies. This didn't happen for some reason. What is the reason?


RE: The 'Chinese' Theory: For and Against - MHTamdgidi_(Behrooz) - 11-02-2026

@Jorge_Stolfi,

In my view, there are serious problems with your method, so I am offering my peer review.

You acknowledge in your article that a quire of four pages is “visibly missing,” but that is not operationalized in your calculations and conclusions, except for some statistical approximation likelihoods that can hardly count as evidence. The same goes for the likelihoods of (re)ordering of the paragraphs.

You have simply proceeded to show your statistics overlap between the two texts based on those assumptions. Your approximation method then takes over to dismiss their significance (p. 4). If this, if that, if that, then that. I can see that as being grounds for plausible theorizations, but to claim more seems unreasonable.

360 and 365 are not unique, odd numbers. Either astrologically or astronomically, you can plausibly find literature across cultures in which something was said about them with a paragraph devoted to them, because of Zodiac month counts or days of the year. And if you search and compare enough, you may find statistical matches that may even be more coincidental. This is what coincidences do, and 360/365 cannot serve as a smoking gun for a discovery, I am sorry to say. You are counting it toward your results because it fits them.

But we don’t even know whether the author meant 360 or 365, less or more. So your “ifs” can come to the rescue when evidence is lacking. A “10-paragraph introduction” (p. 4) can come to the rescue, “incorrect ordering” can come to the rescue, and so on.

Sorry for not being clear about that question. My question about “enormous graphics” was simply that in the Voynich manuscript there is an enormous number of graphics/images. The Chinese text you are comparing the last section with is only text-based. I was asking: do you expect us to assume that such a text would somehow be able to explain and inspire the extent and amount of images we find in the Voynich manuscript, in such graphic detail?

You are offering a “solution” for one section of the Voynich manuscript and then expect that it would explain not just the texts of the other sections, but also the enormous amount of images found there? That seems implausible to me at face value.

In the linked sources you referred me to, much of it is conjectural, again, based on “ifs”, because you have relied on a text no one can read, which then serves you to compare with a language (Chinese) you don’t speak yourself, to establish statistical patterns. You write, “To split imperfect blocks, we set a 'definitive' parag break after every short line, even if the next line cannot get a starlet assigned to it [my note: so, you decided to give starlets to them using your AFAIK? logic?!]. That line will be the start of an 'unstarred' parag. We also put a 'tentative' parg break before any line that has at least one puff, even if the previous line is not short and no starlet can be assigned to it [my note: This sounds like a AFAIK logic again, to me]. Those two decisions divide each imperfect block into 'tentative parags'.” This just illustrates the tentativeness of how you have divided the paragraphs, from which you then wish to draw definitive statistical conclusions for a definitive discovery claim.

Regarding your traveler theory, you say someone read the text to him and he noted it down in his invented writing, not understanding it (as you say, because there was no one to translate it) his intention being “when I get back I will find a Chinese speaker to understand what’s in this piece others say is magnificent, but I can’t understand.”

What I was saying was, why not ask a Chinese scribe to copy it IN CHINESE, and bring THAT home? Is that harder than inventing a whole alphabet and trying to figure out what the reader is reading, any mistakes or mis-hearings of the delicate words notwithstanding? If he could travel there, he could certainly find a way of doing that, no? How far can we go in inventing ever new odd justifications to make such unreasonable propositions plausible (i.e., lost money traveling?! Seriously?)


RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 11-02-2026

(10-02-2026, 11:46 PM)RadioFM Wrote: Have you made any attempts to match 治 (zhì) -- or whatever token is most common after 主 (zhû) -- with vords after daiin? You mention spelling variations could be at play in the VMS, but is there a 2-vord digram family that could tentatively match 主治 in the Shennong Bencaojin? Doesn't the VMS display greater unpredictability in terms of the following word after daiin?

No, I did not look (much) for such specific correspondences. They may not be there.

First, the SPS may be derived from a translation of the SBJ into some other East Asian language, not from the Chinese original. In English translations of the SBJ, the compound 主治 is translated in various ways, using from one word ("indications" or "uses") to three or more ("used mainly for", "its main uses are", etc.). Translations to other Chinese "dialects" may consistently use the same two syllables, but translations into unrelated languages like Burmese, Thai, Tibetan, or Vietnamese may use one word, or several different words, like in English. So it is quite possible that daiin means 主治 rather than just 主.

Also there are phonetic phenomena that may confuse the issue. In modern Mandarin, when two syllables are spoken in succession, the tone of the second can be modified depending on the tone of the first. (Compare with the alternation of English "a" and "an" depending on the next word, or the optional omission of final "e" in Italian verbs ("cantare->cantar") for complicated reasons.)

Quote:Should dan, dain, daiiin be homophones/allophones of zhì to some extent?

You mean of 主 zhǔ?  The syllables "zhu", "zhù", "zhú", "zhǔ" have completely unrelated meanings.

But an important point is that the pronunciation of Mandarin has changed quite a lot in the last 600 years, to the point that poems that used to rhyme no longer do.  That is a consequence of a totally non-phonetic script.  (In contrast, the pronunciation of Italian seems to have changed little since the 1200s, because the near-phonetic spelling and the classics like the Divina Commedia effectively anchored it.)  And the Chinese "dialects", like Cantonese, have radically different pronunciations for the same Chinese characters.  In fact it seems that there is doubt whether in certain past epochs the people in certain areas spoke Mandarin or some other dialect.

That is to say that merely identifying daiin with 主治 or 主, and the SPS as a version of the SBJ, still leaves us far from identifying the underlying language and the spelling system.

All the best, --stolfi


RE: The 'Chinese' Theory: For and Against - ReneZ - 11-02-2026

As an immediate reaction, without having read any of the background reasoning, I would call this 'interesting'.

I'm not a great fan of the 'Chinese theory', but I fully understand where it is coming from.
Contrary to many (most?) proposed solution ideas, this is actually based on observations and some evidence.

I gather that the proposed correspondence is with the concept zhu3, not the sound. This will make it difficult to find/match related words.

I think it is a bit early to consider this 'proven', which I believe was announced.


RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 11-02-2026

(11-02-2026, 12:00 AM)oshfdk Wrote: There would be a most common word in most texts. If we take two longest sentences from two texts of similar sizes, it's not strange that each would contain the respective most common word of either text. It would be a very promising find if the count of the most common word in the longest sentence of both texts would be the same non trivial (>2) number, but in this case this didn't happen even after reinterpreting other similar words as daiin.

It is not just that the number of occurrences of daiin (5) or daiin|dair|laiin (8) in the longest SPS parag is similar to the number of occurrences of 主治 (6) or 主 (7) in the longest SBJ recipe. The main point is that the relative positions of those occurrences, which are quite irregularly spaced, are remarkably similar in the two parags. The five occurrences of daiin match five of the seven occurrences of 主, while the eight occurrences of daiin|dair|laiin match all seven, with one left over.

As noted in the paper, there is a constant shift of 0.15 μ in all SBJ positions relative to the SPS ones, but in retrospect that shift has a simple explanation: the SPS version of that recipe clearly omits the "taste and warmth" field of the SBJ recipe (the 4 Chinese characters 味甘微温).
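
To make the kind of comparison described above concrete, here is a minimal Python sketch of matching two lists of occurrence positions under a constant shift. It is not the procedure used in the paper: the helper names `match_positions` and `best_shift`, the tolerance, and the candidate shifts are all assumptions, and the positions are invented toy values chosen only so that a shift of 0.15 fits them.

```python
def match_positions(xs, ys, shift, tol):
    """Pair each x with the nearest unused (y - shift) that lies within tol.

    Returns a list of (index in xs, index in ys) pairs.
    """
    pairs = []
    used = set()
    for i, x in enumerate(xs):
        best = None
        for j, y in enumerate(ys):
            if j in used:
                continue
            d = abs((y - shift) - x)
            if d <= tol and (best is None or d < best[0]):
                best = (d, j)
        if best is not None:
            used.add(best[1])
            pairs.append((i, best[1]))
    return pairs

def best_shift(xs, ys, candidates, tol):
    """Return the candidate shift that yields the most matched pairs."""
    return max(candidates, key=lambda s: len(match_positions(xs, ys, s, tol)))

# Invented toy positions; NOT the positions measured in the paper.
xs = [0.10, 0.25, 0.60, 0.70, 0.90]        # occurrences in text A
ys = [0.25, 0.40, 0.50, 0.75, 0.85, 1.05]  # occurrences in text B, one extra
candidates = [k / 100 for k in range(0, 31, 5)]

shift = best_shift(xs, ys, candidates, tol=0.02)
pairs = match_positions(xs, ys, shift, tol=0.02)
print(shift, pairs)
```

In this toy case every position in the first list finds a partner in the second and one entry of the second list is left over, which mirrors the "match all, with one left over" situation described above. Whether such a match is significant would still have to be judged against how often unrelated position lists match equally well.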

It may be that the version that the Author used as source had omitted that field, or maybe he felt that it was unnecessary.  But it is quite possible that those 4 characters were not there in 1400.  According to Wikipedia and other sources, the original text of the SBJ has been lost, and what we have today is a tentative reconstruction by various authors up to the 1600s.  So maybe that field of the recipe was "hallucinated" by those Chinese Frankensteins, by analogy with other entries.  Come to think of it, it does not make sense in that recipe, because it is actually about several remedies from various parts of the "Rooster" -- including its "white excrement".  Did its head and intestines really taste like its poo?  Hm...

Quote:Suppose the corresponding paragraph of the Voynich MS was a mangled phonetic representation of that Chinese recipe. I guess this would be immediately obvious after matching other repeating characters in this recipe, I think there should have been a few, and the report won't be "daiin = use", but will be a small dictionary, a couple of other phrase matches. This didn't happen for some reason. What is the reason?

See my reply to @RadioFM above.  

I honestly thought that the figure on page 8 was enough for a first report.   But it seems that some folks will never be satisfied...

Quote:likely the gold medal in the VMS studies

I am happy to leave the silver and bronze medals to others. :D

All the best, --stolfi


RE: The 'Chinese' Theory: For and Against - JoJo_Jost - 11-02-2026

Hi Stolfi,

The approach is interesting, but there's something I don't understand. If this Zhu is found, shouldn't the other Chinese characters also have a counterpart in the Voynich manuscript?

I have marked the words that appear twice or more with the respective colors. However, I don't see any consistent correspondence in the Voynich manuscript that seems to fit.

Am I making a mistake?

   


RE: The 'Chinese' Theory: For and Against - Antonio García Jiménez - 11-02-2026

This outlandish theory makes a good plot for a novel or a film script. I'm surprised that anyone gives this any credence. What surprises me most is that René, a Germanic mind with a solid Kantian background, finds it interesting.

I think we're all going to go crazy


RE: The 'Chinese' Theory: For and Against - DG97EEB - 11-02-2026

(11-02-2026, 09:50 AM)Antonio García Jiménez Wrote: This outlandish theory makes a good plot for a novel or a film script. I'm surprised that anyone gives this any credence. What surprises me most is that René, a Germanic mind with a solid Kantian background, finds it interesting.

I think we're all going to go crazy

Antonio, science is science. Professor Stolfi is making falsifiable predictions. We can all run models and understand his code and draw conclusions on whether the numbers stack up, agnostic of any back story. I think you should also reflect on Professor Stolfi's academic credentials and contributions to this field over many decades and show him the respect he deserves. He can fight his own battles I'm sure, and I'm certainly not advocating an argument from authority, but if you want your own ideas to be taken seriously, perhaps some mutual respect might be advised.