The Voynich Ninja

Pages: 1 2 3 4 5 6 7

(25-06-2025, 03:04 PM)oshfdk Wrote: I think no matter what algorithm one can propose for randomly generating a lot of pseudo text, it's probably quite trivial to adapt this mechanism for actual encoding, just by adding a bit of constraints on the randomness

I must profess to being largely ignorant about complex cypher algorithms. I can understand why an item of correspondence might need to be encoded so that no third person would be able to read it. Such a correspondence would need to be decyphered only once and read only once. But I cannot understand why a manuscript of this size should need anything similar. If it is intended to be a reference manual for its owner then it will just be too awkward to have to decypher at each reading.

(25-06-2025, 03:04 PM)oshfdk Wrote: I think it makes sense to test how this would work for non-European languages

I don't think the word-splitting and matrix method would work on normal languages, European or non-European. It works on the VMS because of the curious property of the gallows characters. Gallows words make up about half of the words in the VMS and the majority are k t words. Most of these are placed mid-word and my tables show that they can split words into two independent parts. The words appearing in the matrices make up 30% of all the language B words. No other character in the VMS is able to do anything similar. And this is evidence of artificial construction.
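To make the word-splitting idea concrete, here is a minimal Python sketch of how such a prefix/suffix table could be tallied. It is not the procedure used for the actual tables. It assumes an EVA-style transliteration in which the gallows are the single letters k and t, it only keeps words with exactly one such letter, and it does not treat bench gallows (ckh, cth) specially. The function names and the sample words are made up for illustration.

```python
from collections import Counter

# Assumption: EVA-style transliteration, gallows written as single letters.
GALLOWS = ("k", "t")

def split_at_gallows(word, gallows=GALLOWS):
    """Return (prefix, gallows_char, suffix) if the word contains exactly
    one gallows letter, otherwise None."""
    hits = [i for i, ch in enumerate(word) if ch in gallows]
    if len(hits) != 1:
        return None
    i = hits[0]
    return word[:i], word[i], word[i + 1:]

def prefix_suffix_matrix(words, gallows_char):
    """Count how often each (prefix, suffix) pair occurs around gallows_char."""
    counts = Counter()
    for w in words:
        parts = split_at_gallows(w)
        if parts is not None and parts[1] == gallows_char:
            counts[(parts[0], parts[2])] += 1
    return counts

# Illustrative word list only, not a sample from the manuscript.
sample = "qokeedy okeedy qokain okain otedy chedy".split()
matrix = prefix_suffix_matrix(sample, "k")
for (prefix, suffix), n in sorted(matrix.items()):
    print(f"{prefix:>4} k {suffix:<6} {n}")
```

Rows of the resulting table are prefixes and columns are suffixes; a real run would need a full transliteration file and a decision on how to handle words with several gallows.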

Also I don't know much about non-European languages. I will have to leave it to other people to try this method on those languages.

(25-06-2025, 03:04 PM)oshfdk Wrote: my tables show that they can split words into two independent parts.

Could these be nulls? Or encoded space characters?

(25-06-2025, 05:32 PM)davidma Wrote: Could these be nulls? Or encoded space characters?

It is actually me you quoted.

Are you asking if the gallows characters could be nulls? All of the one-leg, two-leg, bench varieties? Highly unlikely. About 50% of words are gallows words. Their curious forms are one of the defining features of the manuscript and they probably have a higher purpose than just being nulls.

(25-06-2025, 04:46 PM)dashstofsk Wrote: Also I don't know much about non-European languages.

Mandarin Chinese words have a fixed structure A B C where
A is zero, one, or two consonant sounds (e.g. b, d, h, k, ʃ, m, tz, ts, tʃ etc.)
B is one, two or three vowels (e.g. a, e, u, ü, ai, uo, iao, ...)
C is empty or a nasal consonant (n or ng)

The core B can be pronounced in one of four tones (pitch patterns). That is usually indicated by a diacritic on one of the vowels. But another spelling system instead uses a fourth segment D that is a digit 1-4 appended to the word. Words do not inflect -- there are no such things as the case, number, gender, mood, tense, and person concepts of IE languages.
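As a rough illustration of the A B C D slots, here is a small Python sketch that cuts a syllable into onset, vowel core, nasal ending and tone digit. It assumes numbered Hanyu Pinyin spelling (so the consonant letters differ from the phonetic symbols listed above), treats y and w as onsets, and ignores special cases such as "er". The pattern and the function name are my own simplification, not a complete description of Mandarin.

```python
import re

# Assumption: numbered Hanyu Pinyin, e.g. "zhong1". Simplified on purpose.
SYLLABLE = re.compile(
    r"^(zh|ch|sh|[bpmfdtnlgkhjqxrzcsyw])?"  # A: optional onset
    r"([aeiouü]+)"                          # B: one or more vowel letters
    r"(ng|n)?"                              # C: optional nasal ending
    r"([1-4])?$"                            # D: optional tone digit
)

def split_syllable(syllable):
    """Return (A, B, C, D) as strings, with '' for an empty slot,
    or None if the syllable does not fit the simplified pattern."""
    m = SYLLABLE.match(syllable.lower())
    if m is None:
        return None
    return tuple(g or "" for g in m.groups())

for s in ["zhong1", "xiao3", "ma", "an4"]:
    print(s, split_syllable(s))
```

The point is only that every slot can be filled or left empty independently, which is what makes the structure look so regular on paper.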

Many other languages of East Asia have a similar word structure, except that the choices in each slot (including the tone) can be very different.

In Semitic languages like Arabic, Hebrew, and Ge'ez, the standard word structure is
P V0 C1 V1 C2 V2 C3 V3 S where C1,C2,C3 are consonants that determine the basic meaning, V0,V1,V2,V3 are either empty or vowels that determine the inflection, and P, S are either empty or certain short prefixes (like the Arabic definite article "al-") and suffixes. There are three long vowel sounds and three short ones; in Arabic both are "a", "i", or "u". In traditional Arabic and Hebrew spelling the short vowels are omitted and the prefix and suffix are attached to the word. When Arabic is rendered in Latin letters, all vowels are written (but the crucial long-short distinction may be lost) and the article "al" is usually hyphenated, as in "al-Andalus".

All the best, --stolfi

Thai syllables are similar to the Chinese words above:

A is one or two consonants, but the pairs are very limited. Essentially, the second one can be L or R following a limited set of other consonants
B is a vowel or diphthong
C is empty or a consonant

Syllables may not be complete words. W and Y count as consonants even if at the end of words they sound like vowels.
There are five tones, of which one is 'neutral'. Mandarin also has this but there it is quite rare.
Special for Thai is that vowel length is meaningful, and there are a great many minimal pairs distinguished only by vowel length.

@dashstofsk Whenever you respond to my posts, you always act as if it’s been proven that the VMS is a hoax (if I’ve understood you correctly; correct me if I’m wrong).

So I took the time to look at what you’ve written and found this thread. There are methodological flaws here, but let’s set that aside for now:

It has been shown by Patrick Feaster (2020) that over 90 percent of word boundaries can be explained by 7 simple rules. This is strong evidence that the word boundaries were artificially inserted as an additional “cipher level.”
How exactly this was produced - whether as part of a generator or truly artificially using rules in a cipher - is open to debate in this context.

But it is also a strong indication that the spaces in the VMS are NOT real word boundaries at the morphological level, which is what you’re arguing here. In fact I would say this with certainty, because there is no natural language whose words could be split using just 7 rules. So there’s already a major catch here.

But there is more, and you can test it yourself: if you ignore the spaces in Latin and split at random points (keeping the same word-length distribution), the assumed dependency between word beginnings and ends sometimes even collapses to values lower than those in the VMS! (The same happens in Middle High German.) So if you assume that the tokens are not “words,” your whole neat theory falls apart.

So, according to your logic, would Latin without spaces be “even more artificial” than VMS? The test does not measure a property of the construction. It measures a property of the cutting.

Thus, your findings say something entirely different from what you claim. They demonstrate the conspicuousness of the VMS spaces, not the artificiality of the words.
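For anyone who wants to repeat the re-cutting test, here is a minimal Python sketch of one possible reading of "split at random points, keeping the same word-length distribution": remove the spaces, shuffle the original word lengths, and cut the letter stream with the shuffled lengths. The function name, the seed parameter and the sample line are placeholders, and other ways of drawing the cut points would be equally defensible.

```python
import random

def resegment(text, seed=0):
    """Re-cut a text at random points while keeping the exact multiset
    of word lengths of the original."""
    words = text.split()
    lengths = [len(w) for w in words]
    rng = random.Random(seed)      # fixed seed so a run can be repeated
    rng.shuffle(lengths)           # same lengths, new order
    stream = "".join(words)        # the text with spaces ignored
    tokens, pos = [], 0
    for n in lengths:
        tokens.append(stream[pos:pos + n])
        pos += n
    return tokens

# Placeholder sample; a real test needs a long text.
sample = "arma virumque cano troiae qui primus ab oris"
tokens = resegment(sample, seed=1)
print(" ".join(tokens))
```

The resulting pseudo-words can then be fed into whatever beginning/end dependency measure is being compared.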

I still stand by all that I wrote.

Okay, I guess that says it all. I'm out...

(13-05-2026, 06:52 AM)JoJo_Jost Wrote: But there is more, and you can test it yourself: if you ignore the spaces in Latin and split at random points (keeping the same word-length distribution), the assumed dependency between word beginnings and ends sometimes even collapses to values lower than those in the VMS! (The same happens in Middle High German.) So if you assume that the tokens are not “words,” your whole neat theory falls apart.

So, according to your logic, would Latin without spaces be “even more artificial” than VMS? The test does not measure a property of the construction. It measures a property of the cutting.

Thus, your findings say something entirely different from what you claim. They demonstrate the conspicuousness of the VMS spaces, not the artificiality of the words.

I really don't understand your argument here. @dashstofsk's result shows that space-separated chunks of Voynichese that contain k appear to be built by combining a set of prefixes and a set of suffixes almost uniformly, with no strong preference for any prefix+suffix combinations. As if you had a bag of prefixes and a bag of suffixes and every time you needed to write a *k* word, you would fetch a random prefix and a random suffix. So, if prefix qo is twice as frequent as prefix ol, then for any suffix there will be roughly twice as many qo- words as ol- words. As I understand it, the origin of spaces and whether these chunks are words is irrelevant here. The actual spaces are there in the manuscript, and most of them are very clear, so we definitely can split the text by these spaces and repeat @dashstofsk's result.
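To spell out the "two bags" picture, here is a small Python sketch that computes the counts expected if prefixes and suffixes were drawn independently, given a table of observed pair counts. The numbers below are invented toy values chosen so that qo is exactly twice as frequent as ol. They are not counts from the manuscript, and the function name is hypothetical.

```python
from collections import Counter

def expected_under_independence(counts):
    """Given observed counts keyed by (prefix, suffix), return the counts
    expected if prefix and suffix were chosen independently:
    row_total * column_total / grand_total."""
    total = sum(counts.values())
    row, col = Counter(), Counter()
    for (prefix, suffix), n in counts.items():
        row[prefix] += n
        col[suffix] += n
    return {(p, s): row[p] * col[s] / total for p in row for s in col}

# Invented toy counts, NOT manuscript data.
observed = {
    ("qo", "eedy"): 40, ("ol", "eedy"): 20,
    ("qo", "ain"): 20,  ("ol", "ain"): 10,
}
expected = expected_under_independence(observed)
for pair in sorted(observed):
    print(pair, observed[pair], round(expected[pair], 1))
```

In this toy table the observed and expected counts coincide exactly, which is the idealised case; on real data the comparison only makes sense with some measure of how far the observed table departs from the expected one.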

I don't think the experiment with Latin splits at random points would produce a similar result. But if you wish, I can run it easily, if you suggest which character or combination to use instead of k and how to split words.

I don't see how those tables allow a diagnosis of something "artificial".

Someone may assume that k and t
- could be consonants of a natural language
- are each preceded by one or two vowels, drawn from the most frequent or from all the vowels of a natural language
- which may themselves follow 1-2 consonants in initial position
- or are preceded by one or two consonants out of a limited, language-appropriate set

- and are followed by the "second" syllable in an equivalent structure.

Assuming these "separating" characters are consonants, I would see that table more as an instrument for identifying good candidates for vowels…

Why couldn't the k just turn out to be, for example, a "T", and the t an "L"?
You could easily combine vowels and some "fitting" consonants with them.
And for "syllable" combinations that are possible but simply do not occur in the VMS, a given natural language would be a better explanation than some artificial lorem ipsum.

I think this tries to answer the wrong question, as many "calculating" analyses do.
The question should be:
which natural languages come with such a limited set of word-initial and word-final characters, a necessary alphabet of not much more than 22 letters, and a tendency to use slight variations within words to express distinctions?


dashstofsk

davidma

dashstofsk

Jorge_Stolfi

ReneZ

JoJo_Jost

dashstofsk

JoJo_Jost

oshfdk

Stefan Wirtz_2