(Yesterday, 10:52 AM)Antonio García Jiménez Wrote: The use of the terms word and text is not irrelevant, as it influences our perception.
That is true. However, the word "word" can mean very different things depending on the language.
The Chinese generally thought that each syllable (= each Chinese character) was a "word", until Westerners found that concepts that were one word for them usually mapped to two syllables in Chinese. Then, when transcribing Chinese with Latin letters, they would often run those two syllables together. As in "Beijing", "wushu", "wonton", "jiaozi"... AFAIK that is also the case for Tibetan, Vietnamese, and other "monosyllabic" languages.
A compound term is two or more words that have a specific meaning that cannot be inferred from the separate components. In many languages, some compound terms are often written as a single word, with or without hyphens: "typewriter", "windshield", "inkwell"... while other compounds are written separately: "pit stop", "solar sail", "gas pump", ...
In Italian the postfix oblique pronouns and some modifiers are traditionally written attached to the verb: "portiamocela" = "let's take it there", "ditecelo" = "tell it to us". In Spanish and Portuguese they may be hyphenated to the verb. In other languages (like English) they are separate words.
The definite article "al-" in Arabic script is written attached to the noun. In Gaelic, IIUC, the definite article is a suffix attached to the word. In most Romance languages it is a separate word, but in Italian it is sometimes attached with an apostrophe: "l'uomo".
In Romance languages one usually turns an adjective into an adverb by inflecting it as feminine singular and attaching the suffix "-mente", "-ment", or similar; equivalent to English "-ly". "lento" -> "lentamente". However, phonetically that suffix behaves like a separate word, and should logically be written as such: "lenta mente" or "lenta-mente".
And don't ask about Turkish or Hungarian...
So, using the word "word" to mean "sequence of Voynichese glyphs separated by thin spaces and delimited by wider spaces" should not lead us astray, if we keep aware that "word" can mean different things in different languages.
Besides, the Voynichese "words" defined this way have properties that are shared with what one considers "words" in most languages. They behave as atomic units with respect to line breaks (so far we have not found evidence of hyphenation in the VMS). They have a rather rigid structure, and labels are often a single "word" with that same structure. The number and frequency of such "words" follow Zipf's law. The entropy per word is within the range of natural languages with their traditional scripts. And so on...
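For anyone who wants to check the Zipf and entropy claims on their own transcription, here is a minimal Python sketch of how one might do it. It is only an illustration under stated assumptions: it assumes a transcription file where "." marks a word space, the function names (split_words, rank_frequency, word_entropy) are made up for this post, and the sample line is invented rather than copied from the manuscript.

```python
import math
from collections import Counter

def split_words(line, word_sep="."):
    """Split one transcribed line into "words" at the word-space marker.

    Assumption: word_sep is whatever character the transcription uses
    for a wide space; "." is just the default chosen for this sketch.
    """
    return [w for w in line.strip().split(word_sep) if w]

def rank_frequency(words):
    """Return (rank, word, count) triples, most frequent word first.

    Under Zipf's law the count falls off roughly as 1/rank.
    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, w, c) for rank, (w, c) in enumerate(ranked, start=1)]

def word_entropy(words):
    """Shannon entropy of the word frequency distribution, in bits per word."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

if __name__ == "__main__":
    # Invented sample line, for illustration only.
    sample = "daiin.chedy.daiin.ol"
    words = split_words(sample)
    for rank, word, count in rank_frequency(words):
        print(rank, word, count)
    print("bits per word:", word_entropy(words))
```

On a real text one would feed in all the lines, plot count against rank on log-log axes, and compare the entropy figure with the same computation on texts in known languages. A sample this tiny says nothing by itself.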
All the best, --stolfi