The Voynich Ninja
What if Voynich plaintext is very heavily shortened? - Printable Version



What if Voynich plaintext is very heavily shortened? - Rafal - 12-03-2023

This is my first post here, so hello to everybody :)

I've been reading about the Voynich manuscript for some time and, like many people here, I'm coming to the conclusion
that it can't be a simple substitution cipher, letter for letter.

It also seems quite probable to me that we shouldn't treat each Voynich letter independently but rather work
with common groups of letters like or, ol, ain etc.

If we do so, however, another problem emerges. If we assume that groups of Voynich letters encode single letters of the plaintext,
then the supposed words in the plaintext become really short, usually 2 or 3 letters.

If we treat a space as a space, then text made of such words would be very improbable, unless we assume that the words
are heavily shortened.

My question is: would you accept a theoretical solution like the one below, and under what conditions?

D scn ol e prd lc e scn prr
i lbt or amp stn obd ol rcn omp cnd cng un
aq ur dmd inf tm lu cqt lnt agt

I created it by radically shortening this text taken from "De materia medica":

De Sicyonio oleo. E praedictis licet et Sicyonium sic parare.
In lebetem oris ampli stanno obductum olei recentis omphacini candidique congium unum,
aquae vero dimidium infundito, tum igne levi coquito, leniter agitando.

And another question: if the original text is shortened like this, would anybody other than its creator be able to read it?
I realise that people in the Middle Ages used abbreviations much more than we do today, but everything has its limits, I guess.


RE: What if Voynich plaintext is very heavily shortened? - Koen G - 13-03-2023

I don't think anyone could retrieve the information from a text shortened in this way.

However, medieval people did memorize much more than we do today, and it was not uncommon (especially for certain professions) to commit entire works to memory.

Hence, one possibility is that the information in the Voynich is just a framework for someone to remember the full information they have memorized. Which would really be a worst case scenario for us, because in that case it is possible that the key to the text was in the mind of someone who died half a millennium ago.


Now on the other hand, if a text were shortened to the point of being impossible to read, why add the two additional steps of a verbose cipher and different glyphs on top? It feels like double overkill.


RE: What if Voynich plaintext is very heavily shortened? - nablator - 13-03-2023

Welcome to the forum.

There is a website that helps you find Latin words when some letters are missing. You can try it and see for yourself how many possibilities there are.

The text was never mangled so badly in manuscripts. They didn't just drop letters randomly and they used different abbreviation signs as well.
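
To get a feel for how many readings a shortened word allows, here is a minimal Python sketch. It assumes that shortening only deletes letters and keeps the remaining ones in order, which is a simplification (it ignores abbreviation signs and spelling variants such as u/v). The names `is_subsequence`, `candidates` and `TOY_LEXICON` are hypothetical, and the tiny word list is only for illustration; a real test would need a full Latin lexicon.

```python
def is_subsequence(abbr: str, word: str) -> bool:
    """Return True if the letters of abbr occur in word in the same order."""
    remaining = iter(word)
    # `ch in remaining` advances the iterator, so letter order is enforced.
    return all(ch in remaining for ch in abbr)


def candidates(abbr: str, lexicon: list[str]) -> list[str]:
    """List every lexicon word that could have been shortened to abbr."""
    abbr = abbr.lower()
    return [w for w in lexicon if is_subsequence(abbr, w.lower())]


# Hypothetical toy lexicon, not a real dictionary.
TOY_LEXICON = [
    "sicyonio", "sicyonium", "scena", "scientia",
    "secundum", "sanctus", "oleo", "olim",
]

if __name__ == "__main__":
    for abbr in ("scn", "ol"):
        print(abbr, "->", candidates(abbr, TOY_LEXICON))
```

Even with eight words, "scn" already matches several of them; with a full dictionary the list of candidates would be far longer.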


RE: What if Voynich plaintext is very heavily shortened? - MarcoP - 13-03-2023

Hi Rafal, in addition to what Koen and Nablator wrote, I'll add that Nick Pelling has been discussing similar ideas for many years:

Nick Pelling Wrote: ...Voynichese combines scribal abbreviation with strong elements of verbose cipher, along with other cipher tricks...

A verbose cipher accounts for the low character entropy and abbreviation accounts for the resulting insufficient average word length. One of the other tricks that Nick considers is the presence of nulls (which was indeed common in ciphers from that time). The gradual shift between Currier A and Currier B could suggest that the encoder(s) had some freedom in mapping plain-text words to cipher words (the idea of nulls could fit here). I think this implies that a single plain-text word can be encoded by different cipher words (e.g. Herbal A 'chey' could be equivalent to Herbal B 'chedy') with abbreviation also implying that a single cipher word can encode different plain text words (Sicyonio, Sicyonium -> scn, in your example).
If one assumes that, in a small enough subset of the text, each plain-text word is encoded consistently, it should still be possible to detect some grammatical patterns, like function words; this does not seem to be the case (Gaskell and Bowern).

Gaskell and Bowern Wrote: A notable feature of the VMS that has to our knowledge only been discussed by one other publication [20] is positive autocorrelation of word lengths. Word lengths in most meaningful texts are negatively autocorrelated: that is, long words tend to be interspersed with short words (long-short-long-short). By contrast, the VMS exhibits positive autocorrelation (long-long-short-short). Positive autocorrelation is only observed in a limited number of natural languages, but is common in gibberish (Figure 3).

The negative auto-correlation in a language like Latin is of course due to how shorter function words tend to alternate with longer content words.

De Sicyonio oleo. E praedictis licet et Sicyonium sic parare.
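
As an illustration of the measure being discussed, here is a minimal sketch of a lag-1 autocorrelation of word lengths in Python. It is not taken from Gaskell and Bowern; the estimator (covariance of neighbouring lengths divided by the total variance) and the function names are assumptions, and a ten-word sentence is far too short for a meaningful statistic.

```python
import re


def word_lengths(text: str) -> list[int]:
    """Lengths of alphabetic tokens; punctuation and digits are ignored."""
    return [len(w) for w in re.findall(r"[A-Za-z]+", text)]


def lag1_autocorrelation(values: list[int]) -> float:
    """Lag-1 autocorrelation: covariance of neighbours over total variance."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0  # all words equally long: nothing to correlate
    cov = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return cov / var


if __name__ == "__main__":
    latin = "De Sicyonio oleo. E praedictis licet et Sicyonium sic parare."
    print(word_lengths(latin))
    print(lag1_autocorrelation(word_lengths(latin)))
```

For this sentence the value comes out negative, in line with the long-short alternation described above; a long-long-short-short sequence such as 1, 1, 1, 5, 5, 5 gives a positive value instead.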

Word-length autocorrelation is only a special case of a more general Voynichese phenomenon: the fact that spatially close word tokens tend to be similar. Though cases like f75r.37-39 are clearly extreme, the general phenomenon is a very prominent feature of Voynichese. It appears in both computer-generated (Timm and Schinner) and human-generated (Gaskell and Bowern) gibberish, but explaining it under the assumption of a word-to-word mapping with a natural language like Latin is hard.

<f75r.37,+P0>    qokedy dy sheety qokedy qokeedy qokechdy lol
<f75r.38,+P0>    qokeedy qokeedy qokedy qokedy qokeedy ldy
<f75r.39,+P0>    yshedy qokeedy qokeedy olkeedy otey koldy
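
One possible way to put a number on "spatially close tokens tend to be similar" is the average edit distance between neighbouring words. The sketch below is only an assumption about how this could be measured, not the method used in the cited papers; `levenshtein` and `mean_adjacent_distance` are hypothetical helper names, and each transliteration character is treated as one glyph.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def mean_adjacent_distance(tokens: list[str]) -> float:
    """Average edit distance between each token and the next one."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        raise ValueError("need at least two tokens")
    return sum(levenshtein(x, y) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    line_38 = "qokeedy qokeedy qokedy qokedy qokeedy ldy".split()
    print(mean_adjacent_distance(line_38))
```

Running the same function on a Latin line of similar length gives a baseline to compare against.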


RE: What if Voynich plaintext is very heavily shortened? - nablator - 13-03-2023

(12-03-2023, 11:51 PM)Rafal Wrote: It also seems to me quite probable that we shouldn't treat each of Voynich letters independently but rather work with common groups of letters like or, ol, ain etc.

The idea of tokens formed by common groups of glyphs is a very natural one, however it can be observed that their frequency is extremely inconsistent, ranging from 0 to a lot, and therefore difficult to reconcile with a simple interpretation as a plaintext letter or group of letters.
[link] has a lot of or (in 31% of words)
[link] has a lot of al (in 37% of words)

The frequency of the most common ending, -dy, drifts continuously from 0 to nearly 50% of words.
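
Percentages of this kind can be computed with a few lines of Python. The sketch below is a minimal illustration, not the script behind the figures above: the function names are hypothetical, and the sample uses only the three f75r lines quoted earlier in the thread, whereas a real comparison would run over whole pages of a full transliteration.

```python
def share_ending_with(words: list[str], suffix: str) -> float:
    """Fraction of words that end with the given suffix."""
    if not words:
        return 0.0
    return sum(w.endswith(suffix) for w in words) / len(words)


def share_containing(words: list[str], chunk: str) -> float:
    """Fraction of words that contain the given glyph group anywhere."""
    if not words:
        return 0.0
    return sum(chunk in w for w in words) / len(words)


# Sample tokens copied from the f75r lines quoted in this thread.
LINES = {
    "f75r.37": "qokedy dy sheety qokedy qokeedy qokechdy lol".split(),
    "f75r.38": "qokeedy qokeedy qokedy qokedy qokeedy ldy".split(),
    "f75r.39": "yshedy qokeedy qokeedy olkeedy otey koldy".split(),
}

if __name__ == "__main__":
    for locus, words in LINES.items():
        print(locus,
              f"-dy: {share_ending_with(words, 'dy'):.0%}",
              f"ol: {share_containing(words, 'ol'):.0%}")
```

Grouping the tokens by page instead of by line would show how such a share drifts through the manuscript.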

   


RE: What if Voynich plaintext is very heavily shortened? - Koen G - 13-03-2023

(13-03-2023, 10:52 AM)nablator Wrote: The idea of tokens formed by common groups of glyphs is a very natural one, however it can be observed that their frequency is extremely inconsistent, ranging from 0 to a lot, and therefore difficult to reconcile with a simple interpretation as a plaintext letter or group of letters.

Voynichese as a whole is not a consistent system though, so wouldn't this argument work against most cipher-based solutions? We're always dealing with one subsystem shifting into another, or certain preferences dominating different sections.


RE: What if Voynich plaintext is very heavily shortened? - nablator - 13-03-2023

(13-03-2023, 11:01 AM)Koen G Wrote: Voynichese as a whole is not a consistent system though, so wouldn't this argument work against most cipher-based solutions? We're always dealing with one subsystem shifting into another, or certain preferences dominating different sections.

Most cipher-based solutions, yes. Some proposed solutions are:
1. Some parts are nulls. Pig Latin and other verbose expansions are possible. In steganography most of the text is filler.
2. Variable settings cause differences in statistics.
3. There are multiple plaintexts for the same chunk of ciphertext (stateful system).
4. Meaning, if there is any, is spread over larger chunks of ciphertext than glyphs and glyph bigrams (trigrams, or words, in which case Voynichese is not a cipher but a code).

My preference is 2+3+4. :)


RE: What if Voynich plaintext is very heavily shortened? - Koen G - 13-03-2023

Something like that would make sense to me. Like a Voynichese "word" corresponds to a plaintext letter, but there are different ways to express each letter. 

Whatever the case may be, I think attempts to retrieve full or abbreviated plaintext words from Voynichese words are doomed to fail. There just isn't enough information in there. (Unless there is a missing codebook).


RE: What if Voynich plaintext is very heavily shortened? - Aga Tentakulus - 13-03-2023

It can't be the word length. There are enough examples in other books, for example in German, and this goes on for 200 pages.
[link]
The same goes for Latin.
[link]

If you read and understand the books, word and sentence repetitions are more than normal.
"and a handful of sugar and a handful of flour and a handful of sultanas and a handful of salt......."

Or 7 times "one quentin of sugar", but with quentin spelled four different ways.

Pointing out exceptional cases just to support a theory takes no skill.
The work of Timm/Schinner has not been impressive for a long time.


RE: What if Voynich plaintext is very heavily shortened? - Juan_Sali - 13-03-2023

(13-03-2023, 10:52 AM)nablator Wrote: The idea of tokens formed by common groups of glyphs is a very natural one, however it can be observed that their frequency is extremely inconsistent, ranging from 0 to a lot, and therefore difficult to reconcile with a simple interpretation as a plaintext letter or group of letters.
[link] has a lot of or (in 31% of words)
[link] has a lot of al (in 37% of words)

The frequency of the most common ending, -dy, drifts continuously from 0 to nearly 50% of words.

The analysis of bigrams is not enough, as some of them are too common to be a plain letter. The next step is an analysis of trigrams involving those most common bigrams. The result is a set of n-grams (with n from 1 to 3, maybe more in a few cases) of a size over 100, enough for a homophonic cipher and enough to explain the inconsistent frequencies.
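
A starting point for such an analysis could be a simple inventory of all 1- to 3-grams inside words. The Python sketch below is only an illustration under stated assumptions: `ngram_inventory` is a hypothetical name, each transliteration character is counted as one glyph, and it counts every n-gram rather than selecting the set of over 100 units described above.

```python
from collections import Counter


def ngram_inventory(words: list[str], max_n: int = 3) -> Counter:
    """Count every n-gram (1 <= n <= max_n) occurring inside the words."""
    counts: Counter = Counter()
    for word in words:
        for n in range(1, max_n + 1):
            # Slide a window of width n over the word; n-grams never
            # cross a word boundary.
            for i in range(len(word) - n + 1):
                counts[word[i:i + n]] += 1
    return counts


if __name__ == "__main__":
    sample = "qokeedy qokeedy qokedy qokedy qokeedy ldy".split()
    inventory = ngram_inventory(sample)
    print(len(inventory), "distinct n-grams")
    print(inventory.most_common(5))
```

Choosing which of these n-grams to treat as cipher units is the hard part and is not attempted here.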