The 'Chinese' Theory: For and Against - Printable Version
+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Theories & Solutions (https://www.voynich.ninja/forum-58.html)
+--- Thread: The 'Chinese' Theory: For and Against (/thread-4746.html)
RE: The 'Chinese' Theory: For and Against - rikforto - 17-02-2026

dashstofsk, I'm obviously inclined to your conclusion, but does it account for the homonyms Jorge has identified?

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 17-02-2026

(17-02-2026, 01:27 PM)rikforto Wrote: (17-02-2026, 12:17 AM)Jorge_Stolfi Wrote: The problem is that there is no such thing as "the SBJ sensu stricto".

Forget Chinese and the SPS for a moment, and suppose instead that someone finds an "Alchemist Herbal" in Italian (AHI) which has about the same number of plants as the Herbal-A section of the VMS (VHA). He wonders whether VHA may be a version of AHI. He notes the following things:
(The numbers for the VHA are totally made up, but pretend they are true.)

However, the AHI entry has 10 words before the first A, while the VHA entry has only 2 words before the first V. And the AHI entry has 19 words after the last A, while the VHA entry has only 6. Are these discrepancies reason enough to reject the claim that the VHA is a version of the AHI?

Further research shows that there are several versions of the AHI out there, which differ in many details -- including whole sentences or whole entries being added or deleted. (Check [link] for an actual example of this variation.)

I am sure that any paleographer would see the coincidences above as very strong evidence that the VHA is a version of the AHI. After all, the initial decipherments of the Egyptian hieroglyphs, Hittite, Linear B, etc. were based on much more limited evidence.

Quote: here is entry 97 from the middle section: 主治寒濕風痺,黃疸。I've chosen this because it is short, but it gets the essential point across. Classical Chinese is topic-comment, which means the first part of a clause tells us what is being talked about. The first part of the clause here is "主治", ... Another use is then juxtaposed, "for jaundice", 黃疸. Crucially, you cannot omit "主治", as "寒濕風痺,黃疸," has a different topic.

Well, grammatical or not, that is what the SBJ text is. In general, for each remedy there is only one 主治, followed by a list of diseases, conditions, benefits, etc. The SBJ is not meant to be a literary text, like Culpeper's Herbal (which may have been so popular precisely because 75% of the text is entertaining chitchat). It is more like a catalog of drill bits, or a list of ships and their cargos.

But anyway I don't think your claim is correct. I don't think in Chinese you must repeat the topic before every thing you say about it. That is precisely why it is called "topic", not "subject": once you state it, any further statements refer to it by default.
AFAIK Chinese does not have a word for "and", so even in a list of just two things they are just written one after the other. (Instead of "I saw Joe, Jack, Mary, Lou, and Jeff", it would be "I saw Joe, Jack, Mary, Lou, Jeff". Not a big difference...) And there is no special marking for topics (like there is in Japanese), and not every new phrase is a topic for what follows. The parsing of something as "topic" or "comment on the previous topic" is based on the semantics, not on grammar.

Quote: provided the four characters 寒濕風痺 are still read as a unit, they become the new topic, and jaundice becomes the new comment.

The whole phrase is a single disease, "motion impediment caused by cold, dampness, or wind"; literally "cold-pain dampness wind impediment", that is probably "rheumatism". It does not become a new topic in part because the next phrase 黃疸 "jaundice" (literally "yellow skin-disease") cannot be a comment on "rheumatism", so it is naturally parsed as another use. Even without the comma (which I doubt was used in the 1400s).

Quote: If you think it is categorically impossible to definitively identify the text, then I suggest you withdraw your claim of a definitive match in the conclusion of your write-up and recognize why that is drawing so much scrutiny. If you believe you have definitively identified the text, it is not enough to give an explanation for why there are substantial problems with the match, you need to give the explanation, and strong reasons to accept that as the definitive reason.

See above. In this sort of claim, "book" does not mean a specific sequence of words, but an open collection of texts that are obviously versions of some original, even if they have lots of differences. The Shennong Bencaojing is not special. Every old "book", like the Dioscorides Herbal, the books by "Hippocrates", and Euclid's Elements, is a "fuzzy text" in that sense.
Part of the Dead Sea Scrolls fragments (~100 BCE) have been firmly identified with the Hebrew Bible, even though there are altogether thousands of differences between them and the next oldest version of the latter (the Masoretic version of ~1000 CE). Explanations may have been offered for some of those differences, but I doubt that any of them came with the "strong reasons" you demand. Neither the explanations nor such reasons were necessary for the identification to be accepted as certain.

Quote: The apparent omissions arise as part of your process for identifying the text, which means an explanation is that they are because of flaws in that process.

I don't get the point. Obviously explanations are needed only because I identified the SPS text as the SBJ minus three sentences. And I provided plausible explanations.

Again, the first omitted item would have been 味甘微温 = "sweet and slightly warm". That field, specifying the taste and "warmth" (in the sense of Chinese medical theory) of the remedy is the first field in most recipes. But for this particular recipe it makes no sense. The recipe is about seven separate products extracted from the rooster, from quills to the white part of the poo, and it is impossible that they all have the same taste and "warmth". This entry may well have been an addition by the scholars who reconstructed the SBJ in the 1600s -- and added it only because "every entry must have had it".

The second omitted sentence, the next-to-last field of the recipe, lists two veterinary uses of "rooster" (not clear which part, maybe the eggs): 鸡白蠹,肥猪 = "to fight chicken parasites and fatten pigs" (literally "chicken white worm, fat pig"). But the SBJ generally is not concerned with veterinary medicine. I have seen claims that this sentence was probably a marginal note added by some farmer or veterinarian to his copy of the SBJ, which then got copied as part of the text. I gather that this sort of thing happened quite often with European manuscripts.
And the third omitted sentence was the last field 生平泽 = "grows in marshlands". Again it does not make sense for "rooster", since it is a domestic animal that is raised everywhere. I have seen claims by scholars that, like the "taste" field, this field was added later "because every recipe must have had it".

So, it is quite plausible that those three sentences either were not present in the version of the SBJ that was the source of the SPS, or they were omitted by the Author because they made no sense to him, or he did not care for them.

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - rikforto - 17-02-2026

There are a number of misapprehensions here about the text you are analyzing.
Can you source for me the claim that this makes no sense:

"Again, the first omitted item would have been 味甘微温 = "sweet and slightly warm". That field, specifying the taste and "warmth" (in the sense of Chinese medical theory) of the remedy is the first field in most recipes. But for this particular recipe it makes no sense. The recipe is about seven separate products extracted from the rooster, from quills to the white part of the poo, and it is impossible that they all have the same taste and "warmth". This entry may well have been an addition by the scholars who reconstructed the SBJ in the 1600s -- and added it only because "every entry must have had it"."

This doesn't ring immediately true from my first-hand experiences with Korean traditional medicine, which is that the animal, in whole and in part, has the same qualities throughout, but that is a different system and I'm not 100% sure that impression is true anyway.

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 17-02-2026

(17-02-2026, 03:42 PM)dashstofsk Wrote: But in the SBJ the variability of the 'zhǔzhì' word ( 主 治 ) is -0.42. ( Using your bencao-fu.pyj as my source, and which gives 'zhǔ zhì' as one word and not as two words. )

Let me address first the part of your message that is about the SBJ only. I'll leave the parts about the SPS for another message.

The joining of Chinese syllables into two- or three-syllable compounds is a thing that exists only in pinyin, meant to help translation of the text into Western languages, which operate with the compounds rather than isolated syllables. In the Chinese script the characters are written one after another without any spaces or grouping into compounds. Native speakers will "know" when two or more characters have a specialized meaning that is not implied by them in isolation. Just like an English speaker knows that "pen drive" is not how sharpened quills commute to work.
Thus you should use the file "in/bencao-fu.utf" (Chinese characters in Unicode UTF-8) as your SBJ source. If you are doing the analysis in python3, you can read that file by saying "stdin.reconfigure(encoding='utf-8')" at the beginning (or for any input file instead of "stdin"). Then each Chinese character will be read as a single string character. Then, if your python3 source file is in UTF-8 too, you can ask "if ch == '主': ..." or "if line[i:i+2] == '主 治': ..." or "if re.search(r'主 治', line):...". And you can remove Chinese punctuation with "line = re.sub(r'[ 。,]', '', line)" (Beware that the first character inside the brackets is Unicode's IDEOGRAPHIC SPACE, which this 屎 forum editor sometimes decides to delete.)

Now, if you look at that "rooster" recipe, it can be broken down into the following fields:

The translations should be taken with a metric ton of salt (Google translate will say "burns", "cold sores", "fever", etc. for the same character, depending on what is around it.)

As you can see, the SBJ is not consistently using 主 治 = "main use(s)" as the keyword to start a list of uses. In two sub-recipes it says just 主 = "mainly [for?]". Presumably because what follows the key is a "verb-like" character ("kills", "drains") rather than a "noun-like" one like in the other cases. Thus counting just 主 治 is not right. And in the sub-recipe J the SBJ uses neither; perhaps because that entry starts with "can ..."

Advancing on the next message, the VMS column on lines C-I shows the words that I propose correspond to 主 治 and/or 主 in the recipe. The "translation" of both seems to be daiin or a "misspelling" thereof. The "-" means that the word appears as a suffix rather than a word by itself in my transcription file. In the paper I did not propose any VMS key for line J, since the SBJ has no 主 there; but in fact there is an lkaiin apparently at the right spot.
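The file-reading and keyword-counting steps described earlier in this post can be sketched roughly as follows. This is only a minimal sketch: the helper names (`strip_punct`, `count_keys`, `read_lines`) are hypothetical, the punctuation set is an assumption, and it presumes the characters in the file are adjacent once spaces are stripped. The IDEOGRAPHIC SPACE is written as the escape `\u3000` so that no editor can silently delete it. The count separates 主治 from a bare 主 followed by some other character, since the post notes that both occur.

```python
import re

# Assumed punctuation set: IDEOGRAPHIC SPACE (U+3000), ASCII space,
# ideographic full stop, fullwidth comma, ideographic comma, ASCII comma.
PUNCT = re.compile("[\u3000 \u3002\uFF0C\u3001,]")

def strip_punct(line: str) -> str:
    """Remove spaces and Chinese punctuation so characters are adjacent."""
    return PUNCT.sub("", line)

def count_keys(lines):
    """Return (count of 主治, count of 主 not followed by 治)."""
    both = bare = 0
    for raw in lines:
        line = strip_punct(raw)
        both += len(re.findall("主治", line))
        bare += len(re.findall("主(?!治)", line))
    return both, bare

def read_lines(path):
    """Read a UTF-8 file; each Chinese character becomes one str character."""
    with open(path, encoding="utf-8") as f:
        return [ln.rstrip("\n") for ln in f]

# Entry 97 as quoted in this thread, plus the bare 主下 mentioned in the replies.
sample = ["主治寒濕風痺,黃疸。", "主下"]
print(count_keys(sample))
```

With the real file, something like `count_keys(read_lines("in/bencao-fu.utf"))` would give the two totals, assuming that path exists locally.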
All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - dashstofsk - 17-02-2026

(17-02-2026, 04:43 PM)rikforto Wrote: does it account for the homonyms Jorge has identified?

There are not enough occurrences of dair laiin yaiin to normalise the distribution of daiin.

RE: The 'Chinese' Theory: For and Against - Yavernoxia - 17-02-2026

(17-02-2026, 05:43 PM)Jorge_Stolfi Wrote: (17-02-2026, 01:27 PM)rikforto Wrote: (17-02-2026, 12:17 AM)Jorge_Stolfi Wrote: The problem is that there is no such thing as "the SBJ sensu stricto".

I think there’s a basic difference here. In the example you mentioned, we’re still talking about a European language, which is exactly what the paleography seems to point to. The manuscript looks European, shows European influences, uses a European writing style, and, as far as we know, has only ever had European owners. On top of that, this fits neatly with the carbon dating and with what most VMS scholars (Lisa, etc...) already think. So it wouldn’t be surprising if, sooner or later, someone identified a European “source” text, for example an Italian one, as you suggested.

Saying instead that the source was Chinese is a much bigger leap. That would mean rethinking a lot of assumptions and coming up with a whole chain of explanations, coincidences, new discoveries, and, above all, solid proof. At the moment, that scenario just seems much less likely to me.

RE: The 'Chinese' Theory: For and Against - rikforto - 17-02-2026

(17-02-2026, 08:55 PM)Jorge_Stolfi Wrote: As you can see, the SBJ is not consistently using 主 治 = "main use(s)" as the keyword to start a list of uses. In two sub-recipes it says just 主 = "mainly [for?]".
Presumably because what follows the key is a "verb-like" character ("kills", "drains") rather than a "noun-like" one like in the other cases. Thus counting just 主 治 is not right. And in the sub-recipe J the SBJ uses neither; perhaps because that entry starts with "can ..."

They are all verbs. The nominal translation of "治" as "use" is somewhat misleading. [link] gives only the verbal glosses of "govern, regulate, administer". This is usually a pretty good guide to usage---治 mostly has a verbal sense---but Literary Chinese is quite flexible about parts of speech even before adding in translation flexibility, so "governance, regulation, and administration" are all possible readings depending on context, to say nothing of how artless an English translation of "Principally governs" might be.

The other two are also verbs in context. The compound with 主 is common enough to get its own entry [link], but the fact 主下 doesn't get one doesn't mean the usage isn't constructed exactly in parallel; it is, and the apparently missing "治" is because those phrases have a different verb.

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 17-02-2026

(17-02-2026, 02:56 PM)rikforto Wrote: taking recourse to apparent homophony between daiin, dair and laiir should increase, rather than decrease, the entropy, because allowing additional spellings should make the bigrams less, rather than more, predictable, raising the entropy.

Basically I am proposing that there is a non-negligible amount of misspellings in the transcription files, and in the manuscript itself. Is that hard to believe?

Maybe those changes in the "rooster" are not misspellings, but inflections, tone sandhi, or other normal linguistic phenomena. But let's say they are misspellings for this question.
Based on that tiny tiny sample, it would seem that there is 14% probability of initial d being changed to l, and 14% probability that final in changed to r. How much effect could those "random" changes have on the per-character entropy?

In my word model, d and l are both members of the "dealers" set D, together with r and s. So at that initial slot, before the errors, there was already some prob distr between those four characters. If 14% of the d are mapped to l by scribal or dictation error, that may increase or decrease the entropy depending on whether the distr becomes more uneven or more even.

I am too lazy to get the true numbers now, but suppose that the probs without errors are d=1/2, l=1/4, r=1/8, s=1/8. Then that slot would contribute 1/2+2/4+3/8+3/8 = 14/8 = 1.75 bits of entropy to the word entropy. Now suppose that the error rate is such that half of the d become l. Then the probs would become d=1/4, l=1/2, r=1/8, s=1/8. Surprise: the entropy of that distribution is still 1.75 bits, so the word entropy would not change. (But of course now a fraction of that entropy is no longer meaningful information, it has been replaced by noise.)

So you may see that errors may raise or lower the entropy, as computed from the word frequencies. They will always destroy meaningful information, but need not replace it with noise.

Same for the hypothetical mutation of the ending in to r in daiin ⟶ dair.

In fact, when I developed my crust-mantle-core word model (which is actually a "seven layer" model), I defined the set N of "coda" elements as being

 113.500000  0.00091  {n}
 868.250000  0.00697  {m}
1665.500000  0.01336  {in}
  40.000000  0.00032  {im}
3779.000000  0.03032  {iin}
  15.000000  0.00012  {iim}
 159.000000  0.00128  {iiin}
   1.0       0.00001  {iiim}
 487.750000  0.00391  {ir}
 130.500000  0.00105  {iir}
   1.0       0.00001  {iiir}

(The two numbers are counts (fractional because of commas) and relative frequencies among all elements.)
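The dealer-slot example given earlier in this post can be checked numerically. The sketch below is illustrative only: the distribution d=1/2, l=1/4, r=1/8, s=1/8 is the hypothetical one from the post (not measured VMS frequencies), and the helper names `entropy` and `apply_error` are made up for the purpose. It computes the Shannon entropy of the slot, which is 1/2 + 2/4 + 3/8 + 3/8 = 14/8 = 1.75 bits, and then moves a fraction of the d probability to l to see how the entropy responds to different error rates.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a dict mapping symbol -> probability."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)

def apply_error(probs, src, dst, rate):
    """Return a new distribution where a fraction `rate` of src becomes dst."""
    out = dict(probs)
    moved = probs[src] * rate
    out[src] -= moved
    out[dst] = out.get(dst, 0.0) + moved
    return out

# Hypothetical error-free probabilities for the "dealers" slot, as in the post.
before = {"d": 1/2, "l": 1/4, "r": 1/8, "s": 1/8}

print(entropy(before))                              # 1.75
print(entropy(apply_error(before, "d", "l", 0.5)))  # 1.75 (d and l simply swap)
print(entropy(apply_error(before, "d", "l", 0.14))) # a little above 1.75
print(entropy(apply_error(before, "d", "l", 1.0)))  # below 1.75
```

Under these assumed numbers the entropy is unchanged only at the 50% rate where d and l exactly swap; a 14% rate makes the distribution more even and raises the entropy slightly, while a 100% rate makes it more uneven and lowers it.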
I was unhappy with those codas ending in r, because I already had r as a member of the class D, and I could not let r by itself be a coda, like n or m. But that was before I looked at that "rooster" recipe. Now I have the strong suspicion that all the endings ir and iir are quillos ("typos, but with a quill") for iin and iiin, respectively; and that a fraction of the words that end in r should actually end with in...

All the best, --stolfi

RE: The 'Chinese' Theory: For and Against - rikforto - 17-02-2026

(17-02-2026, 10:12 PM)Jorge_Stolfi Wrote: Now suppose that the error rate is such that half of the d become l. Then the probs would become d=1/4, l=1/2, r=1/8, s=1/8. Surprise: the entropy of that distribution is still 1.75 bits, so the word entropy would not change. (But of course now a fraction of that entropy is no longer meaningful information, it has been replaced by noise.)

Putting this another way, then, as long as the same phonotactical *distribution* is maintained, these errors could well cancel each other out. So it, at least, isn't categorically at odds with the Chinese Theory. I will note, though, that this is a pretty narrow path; if the distribution is disturbed---such as if the error rate were either higher or lower, or into other dimensions---the entropy would change. Occam's razor cuts against this explanation, which isn't to say it's impossible, but it is to say it's another highly contingent interpretation buttressing your theory.

My other points, about how the destruction of information quickly renders the identification unfalsifiable, at least under what you've argued so far, stand.

RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 17-02-2026

(17-02-2026, 09:18 PM)Yavernoxia Wrote: as far as we know, [the VMS] has only ever had European owners

We do not know that.
We only know its whereabouts and its owners since ~1600, which is almost 200 years after the C14 date for the vellum. And that is assuming that Jacobus's signature is not a librarian's mistake or a forgery by Wilfrid, hypotheses that I think are rather unlikely but far from impossible. And assuming that the VMS is indeed the book that Baresch and Marci discussed with Kircher; which ditto ditto.

Either way, we do not know where the VMS was or who owned it between 1400 and 1600. The fact that it was in Europe after that time would make the European Language theory a bit more likely than its alternatives; but that a priori advantage has vanished by now.

Quote: In the example you mentioned, we’re still talking about a European language, which is exactly what the paleography seems to point to. The manuscript looks European, shows European influences, uses a European writing style ... . On top of that, this fits neatly with the carbon dating and with what most VMS scholars (Lisa, etc...) already think.

Yes, yes, yes, I know...

Over 100 years of intense scrutiny by the best cryptographers and paleographers have failed to produce even a tiny crumb of result. The reason is that everybody who stood a chance of solving the riddle -- meaning, anybody who knew what a medieval manuscript was -- made the same stupid mistake.

They could see that the material is European, the writing instrument is European, the ink is European (well, it should be), the character shapes look European, the writing direction and the formatting into paragraphs is European, the Zodiac sign icons are European, the nymph dresses, hats, hairdos are European, the castles and T-O maps and cloudbands are European, the wormholes look European, the goulash stains look European, the letters of the marginal scribbles look European...

...and so they concluded that, therefore, obviously, the Author must have been European, the language must be European, and the contents must be a product of European culture.
But that is a gross logical error. A non-sequitur. The conclusion does not follow from the premise. You cannot put a "therefore" there.

All those "European" material and writing details are superficial. Those "European" features in the illustrations are decoration, not contents. They do not say anything about the actual contents or the language. At most, they suggest that the Scribe who provided them was European. (But even that is now looking less and less likely. But we can leave this point for another message.)

In fact, those details that are likely to be part of the contents do not look European at all. Like the fact that the Zodiac starts with Pisces and is divided into 12 sets of 30 things (or 24 of 15), and each of those 360 things has a distinct short name. The structure of the diagrams in Cosmo. The anatomical drawings in Bio. And, above all, the text -- which absolutely does not look like a European language.

Probability theory says that a rational person should modify whatever prior probabilities he had for competing hypotheses according to whether the consequences predicted by those hypotheses are seen to occur or not.

The European Language hypothesis made various specific predictions about word statistics and patterns, that all failed. But most importantly it predicted that 100 years of efforts by the best cryptographers in the world would produce at least some result. That prediction has thoroughly failed.

On the other hand, the Chinese Language theory made several predictions about word statistics and patterns that were generally verified. And, most importantly, it predicted that anyone who tried to crack the enigma by starting from the assumption that the language was European (or Semitic, Turkic, Dravidian, whatever) would of course fail. A prediction that has been extensively verified.

So, please, people, isn't it time to admit that the "therefore" above was wrong?

All the best, --stolfi