15-04-2026, 11:11 PM
(15-04-2026, 02:52 PM)rikforto Wrote: Can you elucidate how you arrived at cribs beyond frequency?
For the first crib, 主 = daiin, I thought of comparing the longest entry in the SBJ, namely "red rooster", with the longest parag in the SPS (excluding parags that seemed to be two or more parags smashed together), namely f105v.32. The two were extreme isolated outliers in the entry/parag length distributions, so either f105v.32 was the match of the "rooster", or there would be no match. But fortunately f105v.32 had survived the loss of the four pages and the big parag mash-ups.
The SBJ entry had 92 hanzi while the SPS parag had "only" 74 words. But the "rooster" entry is exceptional also in that it is 9 separate sub-entries, and the first 7 of these use 主. Comparing the entry to the parag I noticed that there were 5 occurrences of daiin in the latter that could be matched to 5 of those 7 主, in such a way that the gaps between them, scaled by the right factor, were remarkably similar. And, moreover, the other two 主 could be paired to a laiin and a dair, with equally consistent spacings.
It turned out that the SPS parag omitted some fields of the "rooster" entry, and the last of the 9 sub-entries entirely. But fortunately those omissions were at the ends of the entry, and did not affect the spacing between the seven 主s.
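A minimal sketch of the spacing check described above, assuming the comparison amounts to fitting one scale factor and one offset between the positions of 主 in the entry and the positions of the candidate words in the parag. The function name `fit_scale` and the position lists are hypothetical; the numbers are made up for illustration and are not taken from the SBJ or from f105v.32.

```python
def fit_scale(src_pos, tgt_pos):
    """Least-squares fit tgt ~ a * src + b.

    Returns (a, b, max_abs_residual). A small residual relative to the
    paragraph length means the gaps agree once scaled by the factor a.
    """
    n = len(src_pos)
    if n != len(tgt_pos) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_s = sum(src_pos) / n
    mean_t = sum(tgt_pos) / n
    var_s = sum((s - mean_s) ** 2 for s in src_pos)
    if var_s == 0:
        raise ValueError("source positions must not all be equal")
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(src_pos, tgt_pos))
    a = cov / var_s
    b = mean_t - a * mean_s
    resid = max(abs(t - (a * s + b)) for s, t in zip(src_pos, tgt_pos))
    return a, b, resid


# Hypothetical data: index of each 主 in the entry (counted in hanzi) and
# index of each candidate word in the parag (counted in words).
zhu_positions = [3, 14, 26, 37, 50, 61, 73]
candidate_positions = [2, 11, 20, 29, 39, 48, 57]

a, b, resid = fit_scale(zhu_positions, candidate_positions)
print(f"scale={a:.3f} offset={b:.2f} worst mismatch={resid:.2f} words")
```

Because only the gaps between occurrences enter the slope, material missing from the start or end of the entry shifts the offset but does not disturb the fit, which is consistent with the remark about the omissions being at the ends.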
The other two cribs were found basically by the same method, but now by comparing other entry-parag pairs that were identified with the help of the first crib. I noticed that 气 qì was a very common character. It always occurred as a word by itself (rather than part of a compound) and had a specialized meaning that was unlikely to have multiple "translations" into Voynichese. That made it a good candidate for a crib. And indeed, comparing the pairs that I had, the positional correlation between 气 and Chedy was rather obvious.
The last crib, 久服 = qokaiin, was found the same way. But this one is still not entirely certain, because it is not clear whether qokaiin is 久, 服 or the compound; and it may be that the thing is sometimes translated as qokeedy instead (like English translations of 久服 sometimes say "prolonged intake", sometimes "long-term intake", etc.).
Quote:This broad method has been attempted in many languages, and it is always possible to identify potential cribs by assuming high frequency words correspond.
That works best on languages that have high-frequency function words, like articles, prepositions, copulas, etc. Unfortunately Chinese does not have such things; and most words in the SBJ are names of diseases -- which do repeat, but not often enough to be useful with the limited pairings I have. Also, some common Chinese characters like 不 bù = "not" are hard to use as cribs because it is possible that they are translated into Voynichese in several different ways. For example, 不饥 = bù jī, literally "not hungry", is listed as an effect of long-term use of Chinese onions; in the English translation I got, it is rendered as "prevents hunger".
Quote:As an aside, you may also want to check if the Shennong Bencaojing has a Zipfian distribution of characters. Usually Chinese characters are not distributed that way, which is another reason I have doubted a Chinese source
It is an interesting question, but, whatever its result, it will not affect the "SPS=SBJ" claim. For me, it would be like checking whether the table of Catholic patron saints for the days of the year has a Zipf-like word distribution.
Again, statistics -- like word and character frequencies, entropy, distance correlations -- are not properties of a language. They are properties of a specific text, or of a corpus (collection of specific texts). English does not follow Zipf's law. Most English novels and newspaper articles do. But one probably can construct a grammatically correct and meaningful English text of 10003 words that repeats every word exactly seven times.
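To make the point concrete, here is a small sketch of the statistic only, not of a meaningful English text: it builds 10003 tokens from 1429 placeholder words (7 × 1429 = 10003), each used exactly seven times, and confirms that the rank-frequency curve is flat rather than Zipf-like. The helper name `flat_text` and the `w0`, `w1`, ... tokens are assumptions made for illustration.

```python
from collections import Counter


def flat_text(n_distinct, repeats):
    """Return a token list in which every distinct token occurs `repeats` times."""
    vocab = [f"w{i}" for i in range(n_distinct)]  # placeholder tokens, not real words
    return [w for w in vocab for _ in range(repeats)]


words = flat_text(1429, 7)
counts = Counter(words)

# Frequencies sorted from most to least common; a Zipf-like text would show
# a steep decay here, while this text shows the same count at every rank.
ranked = sorted(counts.values(), reverse=True)
print(len(words), len(counts), ranked[0], ranked[-1])
```

The token order is irrelevant to the frequency distribution, so the same flat curve would hold for any arrangement of these tokens.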
All the best, --stolfi
So let me share a few ideas.