The Voynich Ninja
The 'Chinese' Theory: For and Against - Printable Version




RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 16-02-2026

(16-02-2026, 11:34 AM)kckluge Wrote: "if the SPS is a version of the SBJ, then each Voynichese word in the former corresponds roughly to ... 1.089 [Chinese characters]." That implies a bound on how undersegmented the Voynichese can be. If "...the Author often missed word breaks when taking dictation....", then according to this claim he/she didn't miss more than ~8% of them. ... As I pointed out in my earlier post, treating uncertain spaces as spaces results in only 129 of the occurrences of 'daiin' standing alone as a word, with all but 8 of the remaining 177 occurrences being a word suffix or prefix. Which implies a far higher rate of undersegmentation if we assume 'daiin' is typical.

There are many confusing facts to account for.

First, just to clarify, the word counts in the paper consider commas as word spaces, like "." and "-" (line break).

Second, we don't know the language and what sort of translation the source book was.  If the Author was in, say, a Cantonese-speaking area, we may hope that the Chinese characters were basically those of the digital file, and the Dictator read each Chinese character as one Cantonese syllable.  But if the thing happened in Vietnam, the text as read by the Dictator may not have been one-Vietnamese-syllable-for-one-Chinese-character-in-my-file, because the grammar of the two languages is completely different.  However, the close match of the word counts makes this possibility rather unlikely.

Moreover, the SPS version of the "Rooster" recipe almost certainly omitted the "taste and warmth" field of the Chinese version, and the "veterinary uses" and "where it grows" fields at the end.  If similar omissions occur in many other recipes too, they will affect the ratio above (average words per parag / average chars per recipe).  Thus the actual correspondence of VMS words per Chinese chars may be closer to 1 than to 1.08.

On the other hand, that 8% is the net discrepancy in the word count.  The transcription of the SPS may have bogus spaces as well as missing ones. It could be 18% missing word spaces and 10% bogus ones.  For example, in other SPS parags one finds "otchod.aiin" and "dalchd.aiin" that seem to be improperly joined and split "daiin".
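A minimal sketch of that arithmetic, assuming one true word per Chinese character and treating the missing and bogus rates as fractions of the true word breaks. The function name and the example rates are illustrative assumptions, not measured values.

```python
def chars_per_word(missing_rate: float, bogus_rate: float) -> float:
    """Chinese characters per transcribed word, assuming one true word per character.

    Each missed word space merges two words (one word fewer); each bogus
    space splits a word (one word more). Both rates are hypothetical inputs.
    """
    observed_words_per_char = 1.0 - missing_rate + bogus_rate
    return 1.0 / observed_words_per_char


# Different (missing, bogus) pairs with the same net 8% deficit give the same ratio.
for missing, bogus in [(0.08, 0.00), (0.18, 0.10), (0.28, 0.20)]:
    ratio = chars_per_word(missing, bogus)
    print(f"missing={missing:.0%} bogus={bogus:.0%} -> {ratio:.3f} chars per word")
```

The point of the sketch is only that the word-count ratio constrains the difference of the two rates, not each rate separately.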

Anyone who has tried to transcribe the VMS knows that the spaces marked with a comma are only a subset of the dubious spaces, and that many glyph spaces that do not seem dubious at all may in fact be missing word spaces, or vice-versa.

Quote:Alan Farne's thesis ... "There were two surprising conclusions from the analysis of these manuscripts: first, the scribes of both direct copies neither added nor omitted any words."

But those scribes were copying from other clean-copy manuscripts written by other scribes for sale or commission, where word spaces were clearly distinct from glyph spaces (otherwise they would necessarily have made many wrong calls when trying to determine whether there was a space or not).

And one must wonder how ignorant those Scribes were of Greek, really.  They were not fluent, sure, seeing that they made many spelling errors.  But maybe they knew enough to parse words even if the spaces were ambiguous?  Greek used to have special diacritics on initial vowels, and a special form for final sigma, and things like that...

Whereas the VMS Scribe was copying from the Author's draft. Even if it was a second draft, not the raw dictation, it would probably have been much more loose with spacing.

Quote:And requires an explanation if we are asked to believe it is not.

I am not asking people to believe.  I am presenting the evidence as best as I can so that they can draw their own conclusions.

To me, the conclusion "SPS≈SBJ, daiin≈主" is much more certain and well-founded than, say, the 1400s date, or the Five Scribes theory.  But I don't need others to believe that.  I am not posting here to boost my TikTok views or promote some book.  If some people do not agree now, well, maybe they will later, as I get more matches.

All the best, --stolfi


RE: The 'Chinese' Theory: For and Against - rikforto - 16-02-2026

(16-02-2026, 01:34 PM)Jorge_Stolfi Wrote: Second, we don't know the language and what sort of translation the source book was.  If the Author was in, say, a Cantonese-speaking area, we may hope that the Chinese characters were basically those of the digital file, and the Dictator read each Chinese character as one Cantonese syllable.  But if the thing happened in Vietnam, the text as read by the Dictator may not have been one-Vietnamese-syllable-for-one-Chinese-character-in-my-file, because the grammar of the two languages is completely different.  However, the close match of the word counts makes this possibility rather unlikely.

Moreover, the SPS version of the "Rooster" recipe almost certainly omitted the "taste and warmth" field of the Chinese version, and the "veterinary uses" and "where it grows" fields at the end.  If similar omissions occur in many other recipes too, they will affect the ratio above (average words per parag / average chars per recipe).  Thus the actual correspondence of VMS words per Chinese chars may be closer to 1 than to 1.08.

I am going to reiterate some issues I raised previously:
  1. Formally, this is a problem for claiming a match. Insofar as this is what your paper proves, it does not show that the SPS matches the SBJ; it shows the SPS matches an unknown plaintext which itself does not match the SBJ. Having to posit that the plaintext is mangled beyond matching the SBJ to explain how only a few words appear to line up is tantamount to saying it does not match the SBJ. It is not enough to establish plausibility here; I think all of your critics grant the plaintext could be something other than the SBJ sensu stricto. It is also plausible that the reason for the extremely contingent and narrow alignment your paper identifies is that they are not, in fact, related texts and the alignment is mere coincidence. What proof do you have that allows us to actually reject the null hypothesis that the SBJ and SPS do not match?
  2. With that question in mind, there are substantive reasons I expect that we can't, beyond the difficulties of formal proof. All the languages you mention are topic-comment languages, which means that the topic cannot be omitted without altering the meaning of the text substantially. To a point, I grant that the translation could have been fairly "free", supplying a new or synonymous topic, and hence the question of point 1 is how to know it. But it bears saying that even if you persuade me that's what happened, that is not the simpler explanation. The simpler explanation is that the topic-comment structure, as well as the relentless parallelism, of the original was preserved in translation to any of the posited topic-comment languages. So if I'm being asked to evaluate how plausible I find it that the SPS is the SBJ in translation without rejecting the null hypothesis, I don't find that to be a particularly natural assumption about how the text came to look the way it does.

For my part, I would be much more persuaded if we moved beyond discussing potential scenarios that could explain the discrepancy you acknowledge and focused on what does explain the difference.


RE: The 'Chinese' Theory: For and Against - Typpi - 16-02-2026

(16-02-2026, 08:04 PM)rikforto Wrote: Having to posit that the plaintext is mangled beyond matching the SBJ to explain how only a few words appear to line up is tantamount to saying it does not match the SBJ.

I agree with this assessment, too much freedom, fit isn't good without some type of modification.

I'm guessing since my last 2 questions got ignored that I'm correctly reading how this theory works and not missing something big?

It seems like we're missing the forest for the trees with these statistics, when the answer is a lot more obvious...


RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 17-02-2026

(16-02-2026, 08:04 PM)rikforto Wrote: Formally, this is a problem for claiming a match. Insofar as this is what your paper proves, it does not show that the SPS matches the SBJ; it shows the SPS matches an unknown plaintext which itself does not match the SBJ. ... I think all of your critics grant the plaintext could be something other than the SBJ sensu stricto.

The problem is that there is no such thing as "the SBJ sensu stricto".  While the book has almost always been important in Chinese medicine, from what I read it seems that there is no surviving copy older than 1600 or so, and the existing versions are later reconstructions from incomplete fragments.

It is like the situation of several classics from Western Antiquity. When one says "the Dioscorides herbal", "the Alchemists herbal", or "the Epic of Gilgamesh", one is actually speaking of a collection of variant texts, including translations in other languages.  Actually the situation of the SBJ seems to be quite a bit worse than that of Dioscorides.

Quote:All the languages you mention are topic-comment languages, which means that the topic cannot be omitted without altering the meaning of the text substantially. To a point, I grant that the translation could have been fairly "free", supplying a new or synonymous topic, and hence the question of point 1 is how to know it. ... The simpler explanation is that the topic-comment structure, as well as the relentless parallelism, of the original was preserved in translation to any of the posited topic-comment languages.

Sorry, I don't understand this point.  I agree that a translation into another language like Vietnamese or Tibetan, that had tried to produce text that was grammatically and stylistically good for this language, would have messed up the alignment considerably.  Like the translation of the SBJ into English does.  But, if the SPS is indeed a translation of the SBJ (as opposed to a reading in any of the Chinese "dialects"), it may be a stilted translation that tried to keep the Chinese word order and grammar, while translating only some Chinese syllables.

I attended a couple of Japanese Buddhist ceremonies where at some points the priest and/or the attendees are supposed to recite a prayer in Tibetan (text provided).  The prayer is indeed a string of isolated syllables that is totally unlike Japanese.  But I could see that it was not Tibetan either, since the phonetics and spelling were Japanese, again totally unlike Tibetan.  And probably many syllables had been completely changed/inserted/deleted, even way back when the prayer was imported into Japan.  This sort of thing may have happened with "translations" of the SBJ into other monosyllabic languages.

In Western languages, translation is a binary thing: a text is either definitely Latin or definitely German, never something in between. (There are a few so-called "macaronic" texts that mix two languages, usually Latin and a vernacular.  A medieval example is the popular theatrical poem [link].  Two recent delightful examples are the poems [link] and [link].  But those were intentionally satirical or comical.)  However, the nature of those East Asian monosyllabic languages allows a continuum from mere transliteration of the Chinese text -- one Chinese character to one syllable, which is some approximation in the local phonetics of the Chinese pronunciation -- to a proper free translation, with the vocabulary, word order, grammar, and style of the target language.  IF the SPS is a translation of the SBJ into Vietnamese, Burmese, Tibetan, or whatever, it seems to be closer to the former than to the latter.

Quote:For my part, I would be much more persuaded if we moved beyond discussing potential scenarios that could explain the discrepancy you acknowledge and focused on what does explain the difference.

I don't understand this either.  What discrepancy are you referring to?  The apparently missing texts at the beginning and the end of the SPS version?  I gave simple and satisfactory explanations for those omissions.

Any explanation for any discrepancy, in this case or in any other claim about the VMS by anyone else, would have to be conjectural.  Why is "ed" very common in some parts and absent in others? Why are p and f never followed by e, whereas CPh and CFh often are?  Why are there 30 labels in every Zodiac diagram,  never 31?  

If you demand explanations that are 100% certain, you'd better find yourself another hobby, because you will never get them here...

All the best, --stolfi


RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 17-02-2026

(16-02-2026, 08:37 PM)Typpi Wrote: I agree with this assessment, too much freedom, fit isn't good without some type of modification.

Not "too much freedom", not at all.

Again, the longest entry in the SBJ is that "Rooster" recipe, which is actually seven separate sub-recipes for seven parts of the bird: (1) unspecified (presumably the meat), then (2) head, (3) fat, (4) intestines, (5) gizzard lining, (6) quills, (7) eggs.  Sub-recipes 1 and 3-6 start with 主治  = "main use(s)", entries 2 and 7 only with 主 = "mainly"  (then 杀鬼 = "kills demons" and 下血闭 = "drains blocked blood", respectively). Those are all the occurrences of 主 in that recipe.

The longest parag of the SPS has five occurrences of daiin (as words or suffixes), which are claimed to correspond to the starts of sub-recipes 1,2,4,5, and 7.  There is one occurrence of laiin (as a word), which is claimed to correspond to the start of sub-recipe 6. There are also two occurrences of dair; one (as a word) is claimed to correspond to the start of sub-recipe 3, the other (as a suffix) would be in the middle of sub-recipe 1, the longest one.

The evidence supporting those claims is that the distances between those words in the SPS are very close to the distances between the occurrences of 主 in that SBJ recipe.  This is true whether one counts words or EVA characters in the SPS, whether one considers only the occurrences of daiin (which give four distances) or those seven occurrences of daiin, dair, and laiin (which give six distances).  And this in spite of the fact that those distances vary by more than a factor of three.
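The comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the token lists are made-up toy data (not the SPS or SBJ texts), the function names are hypothetical, markers are matched as whole tokens only (suffix matches are ignored), and gaps are normalized by their sum so that a constant words-per-character factor does not matter.

```python
def gaps(tokens, markers):
    """Number of tokens strictly between consecutive marker occurrences."""
    positions = [i for i, tok in enumerate(tokens) if tok in markers]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def normalized(gap_list):
    """Gaps as fractions of their total, to factor out overall text length."""
    total = sum(gap_list)
    return [g / total for g in gap_list]


def max_abs_difference(gaps_a, gaps_b):
    """Largest difference between corresponding normalized gaps."""
    if len(gaps_a) != len(gaps_b):
        raise ValueError("both texts must have the same number of gaps")
    pairs = zip(normalized(gaps_a), normalized(gaps_b))
    return max(abs(x - y) for x, y in pairs)


# Toy data: "X" and "M" stand in for the marker words, "w" and "c" for everything else.
toy_a = "X w w w X w X w w w w w w X".split()
toy_b = "M c c c c M c M c c c c c c c M".split()

gaps_a = gaps(toy_a, {"X"})
gaps_b = gaps(toy_b, {"M"})
print(gaps_a, gaps_b, max_abs_difference(gaps_a, gaps_b))
```

A small maximum difference only says that the two gap patterns have a similar shape; how surprising that is depends on a separate null model.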

See my reply to @dashstofsk for an estimate of how unlikely it is that those numerical matches are mere chance coincidences.  Even ignoring the other coincidences (both parags are the longest in their files, both words are the most common, the average parag lengths, etc.)

Quote:my last 2 questions got ignored

I am sorry, I haven't been able to keep up.  Let me check again...

(I could blame that oversight on one of the many "original features" of this MyBB forum software.  Suppose you check "Unread new posts" on the home page and see that there are 5 new ones on the Chinese thread. You read post 1 (the earliest one), and decide to reply. When you send the reply, the thread view is repositioned to show your reply at the end of the thread.  Unless you remember that there were more new posts, and manually back-scroll/back-page to them, you will miss posts 2-5.  And yet the software marks them as "read"...)

All the best, --stolfi


RE: The 'Chinese' Theory: For and Against - Jorge_Stolfi - 17-02-2026

(14-02-2026, 07:42 AM)Typpi Wrote: But why would only one word/phrase correspond? Shouldn't the words around it also match and have a similar structure?

Check the text of that recipe.  There are 69 distinct Chinese characters, but the ones that repeat, with their general meanings, are

      7 主 = "main, mainly"
      5 治 = "use"
      3 白 = "white"
      3 鸡 = "chicken, fowl"
      2 下 = "low(er), bottom, under"
      2 中 = "middle, center"
      2 子 = "child"
      2 寒 = "chill, cold"
      2 杀 = "kill, abate, terminate"
      2 温 = "warm"
      2 热 = "hot, burn, fever"
      2 神 = "spirit, magic, god, supernatural"
      2 血 = "blood"

So, should there be repeated Voynichese words matching these repetitions?  

For 主 the answer is yes: its seven occurrences match five occurrences of daiin as word or suffix, one occurrence of laiin as word, and one occurrence of dair as word. Both laiin and dair are very close to daiin in "ink distance".

The five occurrences of 治 are all part of the compound 主治  = "main use(s)". Which may have been transcribed as just "mainly" hence just daiin. Or maybe some of the words that follow daiin in that parag are the Voynichese for 治.

The two occurrences of 寒 = "chill" occur next to each other in the condition 伤寒寒热 that Google translates as "typhoid fever with alternating chills and fever".  That is, 寒寒 could perhaps be translated as "shiver".  (Repetition is considered bad in English and students are taught to avoid it.  But it is a not uncommon device in Chinese to mean repeated action, intensification, and other things.  It is used in other languages too, like in Italian "piano piano" = "little by little".)  In the SPS parag, at about the same position, there are the words ytaiin.otair which again are very close to each other in "ink distance".

The character 鸡 jī = "chicken" occurs in the name of the recipe 丹雄鸡 dān xióng jī = "rooster" (literally "red male chicken") and in the compound 鸡子 jī zǐ = "egg" (literally "chicken child").  In the SPS, the first one would be poar.keeo; by position, the second one would be somewhere in CKhy.Sheo.qoeeo.lkaiin.Chs.  So 鸡 jī "chicken" in Voynichese could be eeo.

The character 神 shén = "god, magic, supernatural" occurs in the compound 通神 tōng shén which Google struggles to translate but may mean "clears the spirit" or something of the sort.  Its correspondent in the Voynichese text is not clear but may be somewhere  in  ol.otaiin.okeey.qokaiin.or. The other occurrence is in 神物 shén wù = "magical substance", and, by position, should be in oaiir.ain.okShey.   There are several possibilities, but maybe 神 is okee and okSh, the plume and ligature of the latter being due to pronunciation variation, like stress, tone sandhi, etc.

Of the two occurrences of 子 = "child", one is in the compound 鸡子 = "egg".  The other is in the compound 女子 that means "woman" (literally "woman child", that is "girl" or "young woman") in Mandarin.  But, according to Google Translate, in Cantonese (the language spoken in Guangzhou and Hong Kong) the term for "woman" is another compound, written 女人 (literally "woman person").  So if the Dictator was Cantonese, he probably read 女子 as 女人, but still read 鸡子 as 鸡子.  That is a possible explanation for why there does not seem to be the same Voynichese word repeated in those two places.

Of the three occurrences of 白 bái = "white", one is in the "veterinarian uses" sentence at the end, which apparently is missing in the SPS.  The other two are in the disease 赤白沃 , literally "red white irrigate", which Google translates as "leucorrhea"; and in 屎白 = "white part of the poo" (literally "excrement white").  So again the two compounds may have been different phrases in the dictation.  But, by position, the white part of chicken poo may be kaiShd in Voynichese.

And so on.  Some of the repeated characters seem to correspond to similar words of Voynichese, some don't.  But there are many possible reasons for the latter.

This recipe offers a few other possible cribs besides daiin.  For instance, I would bet that Sheeo means 头 = "head".  It occurs a couple dozen times in the SBJ, including as the name of a medicine 白头公 bái tóu gōng = "white-haired man" (?!?)  But that guess would have to be confirmed by analyzing other recipes.

Quote:Couldn't I find a ton of books that have the same "matching" word positions in a ton of different languages?

See again the computations in my response to @dashstofsk.  First you would have to find a book with 350 to 450 paragraphs, with about 30-35 words per parag on average.  Found one?  The List of Lunch Specials at the Vagharshapat Buddhist Monastery for the Year 1278? Will do.  Now check whether the most common word in that book, call it X, occurs at least seven times in its longest paragraph.  Check?  Now pair up seven of those occurrences with seven of the eight occurrences of D = daiin|dair|laiin in the longest parag of the SPS, namely f105v.32-38, any way you wish; and count the words between those occurrences of X.  Do they match the counts between the corresponding occurrences of D in that parag?
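For readers who want to try this kind of check, here is a rough Monte Carlo sketch. It is not the computation from the reply mentioned above; it swaps in a deliberately crude null model (marker positions drawn uniformly at random within one paragraph). The target gaps, paragraph length, and tolerance are made-up illustrative numbers, and all function names are hypothetical.

```python
import random


def gaps_from_positions(positions):
    """Number of words strictly between consecutive (sorted) marker positions."""
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def matches(candidate, target, tolerance):
    """True if every candidate gap is within a relative tolerance of the target gap."""
    return all(abs(c - t) <= tolerance * max(t, 1) for c, t in zip(candidate, target))


def chance_match_rate(n_words, target_gaps, tolerance, trials, seed=0):
    """Fraction of random marker placements whose gaps all match the target gaps."""
    rng = random.Random(seed)
    n_markers = len(target_gaps) + 1
    hits = 0
    for _ in range(trials):
        positions = sorted(rng.sample(range(n_words), n_markers))
        if matches(gaps_from_positions(positions), target_gaps, tolerance):
            hits += 1
    return hits / trials


# Hypothetical numbers: six gaps between seven markers in a 150-word paragraph.
example_gaps = [30, 12, 45, 9, 20, 15]
rate = chance_match_rate(n_words=150, target_gaps=example_gaps, tolerance=0.1, trials=2000)
print(f"estimated chance-match rate under this null model: {rate}")
```

A rate of zero in a few thousand trials only bounds the probability from above; a serious estimate would also have to account for the freedom in choosing which occurrences to pair up.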

Well, if they do, congratulations: you just discovered that the SPS is a lunch menu, and the Voynichese language is probably 13th century Armenian, which had to be securely encrypted because one of the few sources of entertainment in the monastery was betting on what would be on the table on each day.  Or maybe Georgian, if the cook happened to be a Kyrgyz who moved to Vagharshapat after the Mongols destroyed the Buddhist orphanage at Zestaponi, where he had been living since his parents divorced over a dispute over horses and neither of them wanted to keep him, so they handed him over to a merchant who happened to pass through Özgön on his way home from Shangdu, who promised to take the boy, who reminded him of his nephew Marco, to some nice orphanage in Venice, but the boy ran away when they were crossing the Qvirila and was found wandering in the woods by a Mingrelian shepherd who had just converted to Buddhism and did odd jobs at the monastery in his spare time.

Quote: Especially if we're only matching one word based on spacing and ignoring the words around it?

As you see, besides 主 there are few repeated characters in that recipe that could be matched.  

Quote:You also said you didn't look through many Chinese books.. would this method work on other Chinese books or is this a unique case? Or have you not tested that yet?

There are several other old Chinese books on subjects that could have been used as source for the VMS.  In fact, I believe that the other sections of the VMS will be found to be other important Chinese books on astrology, anatomy, botany, etc.; and the language will be the same as that of the SPS.

But I don't know anything about those other books, much less how many parags they have and what is their average words per parag.  I knew that the SBJ had 365 recipes because that is always mentioned in any article about it.  I doubt that any other book comes anywhere close to that.

All the best, --stolfi


RE: The 'Chinese' Theory: For and Against - Typpi - 17-02-2026

Thanks Stolfi, I understand it better now.

Appreciate the response.


RE: The 'Chinese' Theory: For and Against - rikforto - 17-02-2026

(17-02-2026, 12:17 AM)Jorge_Stolfi Wrote: The problem is that there is no such thing as "the SBJ sensu stricto".

This is indeed a problem! That is my point here, and I don't see how we can definitively identify the text if we don't even know what the text is, and I don't see an explanation of how you are overcoming this problem.

(17-02-2026, 12:17 AM)Jorge_Stolfi Wrote: Sorry, I don't understand this point. ... monosyllabic languages allow a continuum between mere transliteration of the Chinese text -- one Chinese character to one syllable, which is some approximation in the local phonetics of the Chinese pronunciation -- to a proper free translation, with vocabulary, word order, grammar, and style of the target language.  IF the SPS is a translation of the SBJ into Vietnamese, Burmese, Tibetan, or whatever, it seems to be closer to the former than to the latter.

For illustrative purposes, here is entry 97 from the middle section:
主治寒濕風痺,黃疸。
I've chosen this because it is short, but it gets the essential point across. Classical Chinese is topic-comment, which means the first part of a clause tells us what is being talked about. The first part of the clause here is "主治", which is probably best translated as "primary indication" or "mainly governs", but if you'll allow a more labored translation that captures the semantics, it means, "As for its main use...". The next four characters, 寒濕風痺, establish the main use, namely "for weather-related arthralgia." (Full disclosure: I am relying entirely on the Chinese Text Project's glosses for the medical terms; why this means exactly this is all beyond my ken.) Another use is then juxtaposed, "for jaundice", 黃疸.

Crucially, you cannot omit "主治", as "寒濕風痺,黃疸," has a different topic. As mentioned, I am a little uncertain about the medical terms, but provided the four characters 寒濕風痺 are still read as a unit, they become the new topic, and jaundice becomes the new comment. Something in the ballpark of, "As for it being weather related arthralgia, it is jaundice." I would also be inclined to analyze a four-character string like that as a clause, but I cannot find a good candidate for a verb among the four. This may be my own shortcomings with the language, but zooming out from the particular example I think I have made my point. "主治" isn't omittable under the hypothesis that the SPS is essentially a Literary Chinese text, a fact which is true irrespective of the reading scheme. It shifts the semantics of the sentence away from use.

Aside from the aforementioned issues with identifying a text when we don't actually know what it is, positing that it exists in translation does not much help with this problem. Languages in the Mainland Southeast Asian Language Area are typically topic-comment, meaning a felicitous translation would naturally include the original topic and the translator would be expected to be sensitive to it. All of the languages you've mentioned are topic-comment---some, albeit, with marking---and the Southern Chinese ones you apparently think are the most likely would be the most inclined to preserve this construction in translation and be rendered illegible by omitting the topic. The observation that translation is not binary doesn't help here either, as compromises with the original would increase, not decrease, my expectation that the topic is retained in translation. It will emerge momentarily that I find an explanation insufficient to claim definitive identification, but bear in mind as you read that, I also don't think this alternative is a particularly obvious or likely explanation considering the languages in play, and certainly not without further proof to that specific point.

(17-02-2026, 12:17 AM)Jorge_Stolfi Wrote: Any explanation for any discrepancy, in this case or in any other claim about the VMS by anyone else, would have to be conjectural... If you demand explanations that are 100% certain, you'd better find yourself another hobby, because you will never get them here...

If you think it is categorically impossible to definitively identify the text, then I suggest you withdraw your claim of a definitive match in the conclusion of your write-up and recognize why that is drawing so much scrutiny. If you believe you have definitively identified the text, it is not enough to give an explanation for why there are substantial problems with the match, you need to give the explanation, and strong reasons to accept that as the definitive reason. (The "100% certain" formulation is yours, but "conjectural" is definitionally well below "definitive", I hope we can all agree!) The missing parts are not small---they are grammatically important and part of the information provided as part of the SBJ tradition. The apparent omissions arise as part of your process for identifying the text, which means an explanation is that they are because of flaws in that process. For the identification to be definitive, we have to be able to reject that explanation, not merely entertain speculation to the contrary. I've long held, despite some misgivings that we've discussed, that it is conceivable definitive proof may emerge for non-European or even Southeast Asian origins of the text. I don't think this is it, and more importantly, I don't yet see you making a definitive argument for it.

新年快乐, by the way!


RE: The 'Chinese' Theory: For and Against - rikforto - 17-02-2026

Also, as I look at how I might prove the negative here, I don't think the identification between 主 and multiple Voynichese spellings maintains the Chinese Theory? My understanding was that the idea was that particular statistics of Voynichese could be explained by linking them to features of whatever language the text ended up encoding. The texts that originally suggested the entropy might be low enough in East Asian Languages did so because features appear in a certain order. Pinyin, for instance, has initials, medials, vowels, medials, finals, and tone number. After the initials, the size of these categories are quite small, which would account for the relatively low bigram entropy. Other Romanizations, both in Chinese and other languages, have a similar quality and similar sources of entropy.

All else equal, taking recourse to apparent homophony between daiin, dair and laiin should increase, rather than decrease, the entropy, because allowing additional spellings should make the bigrams less, rather than more, predictable.
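To illustrate the direction of that effect, here is a toy calculation of conditional bigram entropy, treating each EVA letter as one symbol. The word lists are artificial (one spelling versus three variant spellings in equal proportion) and the function name is hypothetical; it shows the tendency in this toy case only, not a result about the actual manuscript.

```python
import math
from collections import Counter


def conditional_bigram_entropy(words):
    """H(next character | current character) in bits, over bigrams inside words."""
    pair_counts = Counter()
    first_counts = Counter()
    for word in words:
        for a, b in zip(word, word[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    total = sum(pair_counts.values())
    entropy = 0.0
    for (a, _b), n in pair_counts.items():
        # weight by joint probability, measure surprise of the conditional probability
        entropy -= (n / total) * math.log2(n / first_counts[a])
    return entropy


one_spelling = ["daiin"] * 30
three_spellings = ["daiin"] * 10 + ["dair"] * 10 + ["laiin"] * 10

h_one = conditional_bigram_entropy(one_spelling)
h_three = conditional_bigram_entropy(three_spellings)
print(f"one spelling: {h_one:.3f} bits, three spellings: {h_three:.3f} bits")
```

In this toy case the extra spellings add a third possible successor after "i", which is what pushes the conditional entropy up.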

I cannot fully close the case with this as all things may not be equal, but short of identifying the source of that inequality, I do not see how you can maintain Voynichese homophonic spelling as a solution under the Chinese Theory?


RE: The 'Chinese' Theory: For and Against - dashstofsk - 17-02-2026

daiin cannot be 'zhǔ'. The outputs given below show that the probability of it being so is very very small.

In quire 20 the variability of the 3 character string daiin when it appears either on its own or as part of a longer word is 8.79 standard deviations from what would be expected if the words were placed randomly. The variability of daiin when it appears on its own is 2.94 standard deviations. (I am using the GC transliteration and counting 101-&, 101-6, 101-7 and 101-8 characters as all being d. And iin is one character in GC.)

But in the SBJ the variability of the 'zhǔzhì' word ( 主 治 ) is -0.42. ( Using your bencao-fu.pyj as my source, and which gives 'zhǔ zhì' as one word and not as two words. )

This means that 'zhǔzhì' in SBJ is distributed within the bounds of what would be expected.  daiin, however, either on its own or as a part of a longer word is not distributed as expected.
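The exact statistic behind these figures is not spelled out in the post. As one possible reading, the sketch below measures how unevenly a word is spread across fixed-size blocks of text and expresses that as a number of standard deviations from what random shuffling produces. The block size, shuffle count, toy token lists, and function names are all assumptions made for illustration.

```python
import random
import statistics


def block_count_variance(tokens, target, block_size):
    """Variance of the number of `target` occurrences per block of tokens."""
    counts = [tokens[i:i + block_size].count(target)
              for i in range(0, len(tokens), block_size)]
    return statistics.pvariance(counts)


def dispersion_z_score(tokens, target, block_size, shuffles=500, seed=0):
    """Observed block variance, in standard deviations from the shuffled baseline."""
    rng = random.Random(seed)
    observed = block_count_variance(tokens, target, block_size)
    shuffled = list(tokens)
    baseline = []
    for _ in range(shuffles):
        rng.shuffle(shuffled)
        baseline.append(block_count_variance(shuffled, target, block_size))
    return (observed - statistics.mean(baseline)) / statistics.stdev(baseline)


# Toy texts: the same 20 occurrences, either clumped together or evenly spaced.
clumped = ["daiin"] * 20 + ["x"] * 180
even = (["daiin"] + ["x"] * 9) * 20

z_clumped = dispersion_z_score(clumped, "daiin", block_size=20)
z_even = dispersion_z_score(even, "daiin", block_size=20)
print(f"clumped: {z_clumped:.2f}, evenly spaced: {z_even:.2f}")
```

Under this reading, a large positive value means the word clusters more than chance would predict, and a value near zero or slightly negative means it is spread about as evenly as random placement, or more so.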
