The Voynich Ninja

Full Version: The Voynich-Ms as a concatenation of abbreviations
Pages: 1 2 3 4 5 6 7
Interestingly, we have just seen the spelling "powder / bulver". I think it is rather exceptional for the same person to switch between spellings, even though I see such variation again and again between different people.

To Google:
You have to understand what's happening. Google works with 4 dictionaries and 6.5 million comparison sentences.
These come from scans of various books and can be found in Google Books. They are from different times and regions. They may not be 100% reliable, but they are a good indication. I would be more interested in where Google found it, so that one could draw more of a conclusion about the origin.
I don't claim that everything is Latin, but Latin is similar.
Perhaps "ust" is a form of "usti" ad ust = to dry.

So the substitutions p/b, t/d, s/z, i/y, v/f are applicable; I just must not jump from one to the other as it suits me, even though I have already seen that done.
@Rene
The Uni-Eindhoven telephone answering machine starts with "Guede Dach". I don't know if I spelled it right the Dutch way; for me it means "guten Tag / good day", but still I only hear "d".
(27-05-2023, 10:21 AM)Aga Tentakulus Wrote: @Helmut
I don't know if you are addressing me.
But I distinguish between abbreviations like "RIP or PN" and "d' or v' and 9".
The abbreviation theory will not work for the whole VM text. It's just too much for that. Ergo no.

That's no reason to go off topic, please keep this about Helmut's theory.
[attachment=7374]
@Tavie
How would you rate the 98 here?
The theory is very interesting and very similar to my own personal thoughts about the text, although I am the first to admit I don't have enough knowledge to be able to work out a proper transcription and expansion of the abbreviations.

I think it is entirely possible that the text is written in a personal shorthand that mixes known abbreviations with a private system. Most probably entire syllables are concatenated down.

A similar system is the Rossignols' "Grand Cipher" in use in French administration during the 18th century, although I suspect this is not nearly as sophisticated as the G.C.

There are of course several important indicators here which we need to work through. Let us set up a thought experiment where an educated person created their own syllabary shorthand. If glyphs can stand for both common words/phrases and syllables, then we start to explain away a lot of the repetition we see in the text.

ReneZ pointed out an important check (namely the t and d phoneme post) which is similar to something I've been working on over the last couple of years. There doesn't need to be any real consistency in the spelling: in a syllabary shorthand, what we need to look for is a consistent pronunciation.

The real problem? Without knowing the dialect, it's nigh on impossible to decipher. However, if several different scribes worked on the project, then there has to be internal coherence that we are missing.
(28-05-2023, 07:09 PM)davidjackson Wrote: A similar system is the Rossignols' "Grand Cipher" in use in French administration during the 18th century, although I suspect this is not nearly as sophisticated as the G.C.

Do you really mean such a system?

Quote: Rossignol, however, developed a coding system in which syllables and whole words were also replaced by numbers ... Decryption was only possible with code tables on which the words, arranged in alphabetical order, and the corresponding numerical values were recorded.

After all, a cryptanalyst succeeded in deciphering it in 1893. He guessed that a particular sequence of repeated numbers, 124-22-125-46-345, stood for les ennemis ("the enemies") and from that information was able to unravel the entire cipher.
I am quite happy, Helmut, to finally read more details of the method you propose.

Over the years, I have been presented with many dozens of proposed solutions, and when reading these, I am always looking for 'anything special' that makes one stand out. I look for 'anything promising' that at least begins to explain the unusual text features, but also for the 'typical mistakes' that show that the proposer of the solution is not aware of these features.

Your proposal is certainly different from most, and does not include the 'usual mistakes' which appear when the proposed method is basically a character substitution with various levels of freedom added to allow for more words to be translated.
On the other hand, I am also not seeing the 'promising bit', which is "any reason why the Voynich MS text behaves as it does".

Here, I would not refer to Currier's "line as a functional unit" because that is defined so vaguely that I don't see a way to measure it in practice. The recent work of Patrick Feaster does a much better job, as it is quite specific and presents several examples that require an explanation.

Apart from that, and more in general, the low entropy of character pairs (bigrams) and the observed word structure are not explained by your method, but seem to have 'just occurred'. This does not necessarily mean that it is wrong, but it is certainly a shortcoming.

I also agree with the points of nablator (and others), that there is an issue with the high frequency of text phrases. Even for a bigram, the high frequency of Eva-dy is unusual. If this is to be expanded to a word (e.g. dictum), then this word appears way too frequently.
Again, this is not proof that it is wrong, but it makes the proposal much less convincing.

Adding text by expanding abbreviations means that one is adding information that previously was not part of the text.
This can only be done if it is done consistently, and it is confirmed by presenting a meaningful text. I am afraid that this last confirmation is missing, so we cannot be sure that the expansion of abbreviations is correct.
I don't know how many members of this forum studied Latin at school. I did study it and, although I have been forgetting it, it helped me to lose the reverential respect that is usually held for it.

Every time I read that the script can be Latin, or a sequence of Latin abbreviations as proposed here, I never cease to be amazed. I always ask myself and others the same question: Can anyone believe that we know more Latin today than the educated people at Emperor Rudolf II's court in Prague knew?

Whenever I have said the same thing, someone has replied that in the 17th century they would have already forgotten the system of abbreviations and ligatures of the 15th century. And with this type of approach I am already unable to follow the debate.
There's a distinction between Latin scripts and Latin language. We are writing in a form of Latin script right now. Medieval manuscripts in the vernacular were also written in a variety of Latin scripts.

There is not a doubt in my mind that the VM script is most closely related to Latin script and numerals. The numerals are of course not of Latin origin, but they appear in the form used in medieval Latin scripts. But this tells us nothing about the underlying language, if there is any.

So anyway, the "script" part of Helmut's theory is not controversial.
(28-05-2023, 07:09 PM)davidjackson Wrote: Let us set up a thought experiment where an educated person created their own syllabary shorthand. If glyphs can stand for both common words/phrases and syllables, then we start to explain away a lot of the repetition we see in the text.

I don't see how a syllabary shorthand should result in the 1% word repetitions we see in the Voynich ms. Lindemann and Bowern (2021) compared historical texts in modern "normalized" transcriptions and more faithful transcriptions that actually included the abbreviations used by scribes. They found that "abbreviated Latin has higher conditional character entropy than unabbreviated Latin": since Voynichese has a particularly low conditional entropy, it diverges from ordinary texts in the opposite direction from actual medieval abbreviated texts. If anything, the increased entropy of actual abbreviated texts suggests fewer repetitions, rather than more of them.
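For readers who want to try the entropy comparison on their own transcriptions, here is a minimal sketch of one common way to estimate conditional character entropy, as the bigram entropy minus the entropy of the first character of each bigram. The function names are made up for illustration, and this is not necessarily the exact procedure used in the paper quoted above.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def conditional_char_entropy(text):
    """Estimate H(next char | previous char) in bits from bigram counts.

    Uses the chain rule: H(Y|X) = H(X, Y) - H(X), where X is the first
    character of each adjacent pair and Y the second.
    """
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return entropy(pairs) - entropy(firsts)
```

A perfectly predictable string such as "abababab" scores 0 bits, and the value rises as the next character becomes harder to guess from the previous one. Results on real texts depend heavily on the transcription alphabet and on how spaces are handled.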


Moving from actual evidence to speculation, one can imagine a very "lossy" abbreviation system where several plain-text words are mapped into the same shorthand word. An extreme example is only keeping the first and last character of each word, so that 'the', 'tree' and 'tease' are all abbreviated as 'te'.
Applying this system to Mattioli (Latin) and Genesis from the King James Bible (English) results in a rate of repeated words similar to Voynichese in the case of Mattioli: ~1%. The "abbreviated" English Genesis stops at about half that rate.
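The first-and-last-character shorthand described above is easy to sketch. The repetition measure below is an assumption: it counts the fraction of adjacent word pairs that are identical, which may not be exactly how the percentages quoted above were computed.

```python
def abbreviate(word):
    """Keep only the first and last character (one-letter words are unchanged)."""
    return word if len(word) < 2 else word[0] + word[-1]

def repetition_rate(words):
    """Fraction of adjacent word pairs in which both words are identical."""
    if len(words) < 2:
        return 0.0
    repeats = sum(a == b for a, b in zip(words, words[1:]))
    return repeats / (len(words) - 1)

sample = "the tree and the tease".split()
short = [abbreviate(w) for w in sample]
print(short)                    # ['te', 'te', 'ad', 'te', 'te']
print(repetition_rate(sample))  # 0.0
print(repetition_rate(short))   # 0.5
```

The toy sentence is chosen to exaggerate the effect; on a real text the rate would have to be measured over many thousands of words.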

But one should also note that repetitions in the Voynich manuscript exceed the number observed in a randomly scrambled version of the text: i.e. these repetitions do not appear to be accidental. In the case of the possible occurrence of "the tree" encoded as "te te", the pattern is purely accidental: there is nothing that makes the occurrence of two t*e words particularly likely. On the contrary, in the case of Voynichese, there is something that makes identical (and very similar) words appear consecutively. This plot shows actual repetitions (Y) vs repetitions in the scrambled text (X). Voynichese samples appear above the y=x line.

[attachment=7388]
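The comparison against a scrambled text can be reproduced roughly as follows. This is only a sketch: the number of shuffles, the fixed seed and the function names are assumptions, not the settings behind the plot above.

```python
import random

def count_repeats(words):
    """Number of positions where a word is immediately followed by itself."""
    return sum(a == b for a, b in zip(words, words[1:]))

def shuffled_baseline(words, trials=200, seed=0):
    """Mean number of immediate repeats over random shuffles of the same words.

    Shuffling keeps the word frequencies but destroys the word order, so the
    result is the repetition count expected by chance alone.
    """
    rng = random.Random(seed)
    pool = list(words)  # copy, so the caller's list is left untouched
    total = 0
    for _ in range(trials):
        rng.shuffle(pool)
        total += count_repeats(pool)
    return total / trials
```

A sample lands above the y=x line when count_repeats(words) is larger than shuffled_baseline(words), i.e. when something other than word frequency alone is putting identical words next to each other.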

Finally, this extreme shorthand also results in a very low MATTR (moving-average type-token ratio). With a window of 1000 words, Voynichese shows a value close to 0.5, lower than Latin but higher than English, perfectly compatible with several European languages. But the extreme abbreviation that can result in Voynichese-like repetitions also reduces MATTR1000 to below 0.2, well below Voynichese and actual written texts.

[attachment=7387]
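For anyone who wants to check MATTR values on their own transcription, here is a minimal sketch of the usual definition: the type-token ratio (distinct words divided by window length) averaged over every window of consecutive words. How texts shorter than one window are handled is my own assumption.

```python
from collections import Counter

def mattr(words, window=1000):
    """Moving-average type-token ratio over all windows of `window` words."""
    n = len(words)
    if n == 0:
        return 0.0
    if n <= window:
        # Assumption: a text shorter than one window gets its plain TTR.
        return len(set(words)) / n
    counts = Counter(words[:window])
    type_sum = len(counts)  # distinct words in the first window
    for i in range(window, n):
        # Slide the window by one word: drop the oldest, add the newest.
        old = words[i - window]
        counts[old] -= 1
        if counts[old] == 0:
            del counts[old]
        counts[words[i]] += 1
        type_sum += len(counts)
    return type_sum / (window * (n - window + 1))
```

Calling mattr(words, window=1000) on the original and on the abbreviated word list of the same text shows how much vocabulary variety a given shorthand throws away.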

I know that this lossy shorthand is not what Helmut is proposing: it is just my guess at something that might seriously increase repetition (though not above the rate of a scrambled text). I am very curious to understand how the system discussed in this thread is supposed to produce the observed rate of repetition without significantly reducing MATTR.