Jorge_Stolfi > 27-06-2026, 03:51 AM
(27-06-2026, 01:06 AM)ReneZ Wrote:
(26-06-2026, 04:12 PM)Jorge_Stolfi Wrote: But that is not true for the Chinese script -- because it is not phonetic! In Chinese writing, each symbol (or pair of symbols) represents directly a concept.
It is not that simple. There is a large group of characters which consist of a radical plus a sound element. At some point in time, for some version of the language, these sounds helped define the character set.
Jorge_Stolfi > 27-06-2026, 04:35 AM
(27-06-2026, 02:34 AM)pfeaster Wrote: This is an interesting case, but let's consider another one.
Quote:For your "and", the percentage of [s] at end of preceding words is 26.1% -- still higher than average, though not by as much.
Quote:Neither your example nor mine shows quite the kind of word profile we actually find around word-break combinations with skewed statistics in the Voynich Manuscript, in my experience. Importantly, I don't recall seeing anything like your "vertues" or my "they" that suggests whole words are driving the patterns as opposed to individual morphological elements -- glyphs, bigrams, and such.
Quote:But even if we look just at Culpeper, that 36.1% for [s] before [are] seems to demonstrate that the kind of statistic we're considering here can be meaningful and revealing.
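For readers who want to reproduce this kind of figure, here is a minimal sketch of the statistic under discussion: the share of words ending in a given character immediately before a target word, compared with the share over the whole text. The function name `final_char_share` and the sample text are invented for illustration; the text is a toy, not Culpeper, and the baseline is assumed to be taken over all words.

```python
def final_char_share(words, target, char):
    """Return (share before `target`, baseline share) of words ending in `char`."""
    # Words that immediately precede an occurrence of `target`.
    before = [w for w, nxt in zip(words, words[1:]) if nxt == target]
    # Baseline: share of all words in the text that end in `char`.
    baseline = sum(w.endswith(char) for w in words) / len(words)
    # Conditional share; 0.0 if `target` never occurs after another word.
    conditional = (
        sum(w.endswith(char) for w in before) / len(before) if before else 0.0
    )
    return conditional, baseline


# Toy text invented for this example (an assumption, not a real corpus).
text = (
    "the roots are hot the leaves are dry they are good "
    "the herb is cold the seeds are small"
)
words = text.split()
cond, base = final_char_share(words, "are", "s")
print(f"before 'are': {cond:.1%}, overall: {base:.1%}")
```

The gap between the two numbers is the deviation from independence that both sides accept; the disagreement in the thread is about what to do with it next.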
rikforto > 27-06-2026, 05:52 AM
(26-06-2026, 08:13 PM)Jorge_Stolfi Wrote: The hanzi 人 does not represent any specific string of sounds.
The way this is true cannot bear the load resting on it.
JoJo_Jost > 27-06-2026, 06:01 AM
Jorge_Stolfi > 27-06-2026, 01:40 PM
(27-06-2026, 06:01 AM)JoJo_Jost Wrote: @ Stolfi, The problem I see is that you have to make more and more assumptions to support this theory.
Quote:If you look at the structure of the “edy” families, it is broader and deeper.
For example, I could remove all standalone “shedy”s, and the effect would roughly remain the same, precisely because there are other cases with a glyph preceding them: “dshedy,” “olshedy,” even “qokshedy,” and other variants.
Quote:And here, too, you need yet more assumptions to argue this within the context. So you can make the assumption that qo = “and,”
Quote:Explaining all these structures within the context of a language in this natural form, however, will be difficult—very, very difficult...
JoJo_Jost > 27-06-2026, 02:35 PM

pfeaster > 27-06-2026, 04:43 PM
(27-06-2026, 04:35 AM)Jorge_Stolfi Wrote: My point is that the statistics of the character "-s" before "are" do not tell us anything except that there is some deviation from statistical independence. That alone is neither meaningful nor revealing. We only begin to understand why the deviation exists by looking at the words that caused it.
(27-06-2026, 01:40 PM)Jorge_Stolfi Wrote: In your example, it was still the case that the words that were enhanced were highly correlated with the ending "-s". But the feature that actually determined the enhanced frequency was not that letter per se, but whether the first word was plural or singular (or a second-person pronoun). It so happened that, in English, words that end with "-s" are often plural, and vice-versa. But there are many exceptions, like "they", "men", "women", "teeth", "feet", "people", "children", "mice", "dice", "sheep", "fish", "deer", "phenomena", "fungi", ...
If one focuses on the statistics of characters, like the final "-s" and initial "a-", one will never realize that a certain set of words -- including many that do not end in "-s" -- are the real cause of the statistical anomaly. One would get forever stuck puzzling over "why is the ending -s attracted to a leading a-?"
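The word-level view argued for in the quote above can be sketched as follows: list the words that actually precede the target, split by whether they carry the final character. The helper `preceding_word_breakdown` and the sample sentence are hypothetical, made up only to show how exceptions such as "they" and "men" surface in such a breakdown.

```python
from collections import Counter


def preceding_word_breakdown(words, target, char):
    """Count the words preceding `target`, split by whether they end in `char`."""
    before = Counter(w for w, nxt in zip(words, words[1:]) if nxt == target)
    with_char = {w: n for w, n in before.items() if w.endswith(char)}
    without_char = {w: n for w, n in before.items() if not w.endswith(char)}
    return with_char, without_char


# Toy sentence invented for this example (an assumption, not a real corpus).
sample = (
    "the roots are hot they are good the men are old "
    "the leaves are dry they are cold"
).split()
ending_s, not_ending_s = preceding_word_breakdown(sample, "are", "s")
print("end in -s:      ", ending_s)
print("do not end in -s:", not_ending_s)
```

In this toy case the second group outweighs the first, which a character-only count would hide. Whether anything comparable holds for a real text is exactly what such a listing would have to show.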
Jorge_Stolfi > 28-06-2026, 02:27 AM
(27-06-2026, 04:43 PM)pfeaster Wrote: Here's a playful analogy: In a base-10 system, all multiples of five end with the digit 0 or 5. For example, 12345 and 43210 are both multiples of five. My position: Investigating that final digit looks like it might be worthwhile. ...
Quote:And another playful analogy: Consider a hypothetical text of unknown character in an unknown script and language. The characters found most often at the ends of words in general are @ (12%), $ (10%), # (8%), & (7%), and ^ (6%). But if we limit our study to the words that appear before one particular word -- ~+*/% -- the character most often found at the end of those words is, by far, ^ (60%), although no specific word or words predominate among them; we just find a lot of different words that end in ^.
My position: Golly, that's remarkable! What kinds of explanation(s) could we find that would be consistent with a pattern like that? Could this give us a clue as to the structure of a language? Or the mechanism of a cipher?
Your position: That statistic alone tells us nothing. Not only that; the type of statistic is all wrong as well. We instead need to be looking at the words that most frequently precede ~+*/%. Anything else is just going to cause confusion.
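One way to check the "no specific word or words predominate" condition in the hypothetical above is to compute the final-character share twice: once over tokens (every occurrence counts) and once over types (each distinct preceding word counts once). This is only a sketch; the function name `token_and_type_share`, the target `zz`, and the symbol strings are all invented for illustration.

```python
def token_and_type_share(words, target, char):
    """Return (token share, type share) of words ending in `char` before `target`."""
    before = [w for w, nxt in zip(words, words[1:]) if nxt == target]
    if not before:
        raise ValueError(f"{target!r} never follows another word")
    types = set(before)  # each distinct preceding word counted once
    token_share = sum(w.endswith(char) for w in before) / len(before)
    type_share = sum(w.endswith(char) for w in types) / len(types)
    return token_share, type_share


# Made-up "words" in a made-up script; "zz" stands in for the target word.
toy = ["ka^", "zz", "mo^", "zz", "ti^", "zz", "ru@", "zz", "ka^", "zz"]
token_share, type_share = token_and_type_share(toy, "zz", "^")
print(f"token share: {token_share:.0%}, type share: {type_share:.0%}")
```

If a single frequent word were driving the pattern, the token share would be high while the type share stayed near the baseline. When both are high, many different words share the ending, which is the situation the hypothetical describes.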
Quote:If we were trying to "decipher" Culpeper (and had no knowledge of the English language), the observation that words ending in [s] are so much more common than usual before [are] would actually be a useful crib in practice -- more useful, I'd say, than any statistics about the specific words that precede [are]. It may reveal nothing in itself, I suppose, but for someone seeking a linguistic solution, I believe it would suggest the hypothesis that [s] is a morphological marker that correlates somehow with the word [are].
Quote:But noticing the pattern (and pondering possible explanations for it) would be a step in the right direction, and enough steps in the right direction might eventually lead to a solution.
Quote:On the other hand, with an analytical/isolating language like Chinese, I suppose no such clues should exist -- which I suspect may be why you're eager to rule them out in the case of the Voynich Manuscript.
Quote:this researcher you posit who gets "forever stuck" strikes me as rather a naive kind of straw man.
Quote:I could equally posit a researcher who somehow identifies and collects hundreds of English plurals without ever noticing that they tend overwhelmingly to end in [s] -- remaining convinced a priori that "statistics about characters tell us nothing."
Quote:If you could have an equally revealing data point about Voynichese -- no more revealing, no less revealing -- would you want to have it?
JoJo_Jost > 28-06-2026, 04:39 AM
JoJo_Jost > 28-06-2026, 06:29 AM