The Voynich Ninja

Pages: 1 2

Claire Bowern explains her views in a webinar: [link]

The first 30 minutes are about the history of the Voynich Manuscript.
The linguistic stuff starts at minute 33.
At minute 49 there is a part in which she talks about the experiment of writing language-like nonlanguage.

(14-07-2021, 03:14 AM)Barbrey Wrote: Is it possible that there actually are two dialects being encoded here? Latin, for instance, seems to have been modified by every language group in Europe. Doesn't the difference between A and B, for instance, seem to argue for encoding two different original works (or written by different dialect-speaking scribes) in closely related 'dialects'? And could linguists perhaps derive some clues from the very frequent 89, or eva dy, in B as opposed to A, that seems to be a common ending in one but not the other?

There are key differences in the text between sections and between scribes, which suggests that both the content and the individual are at least partly responsible for the variety in the text. If you listened to an American talk about car maintenance and an English woman talk about healthcare, you can imagine how different the words would be.

But those two speakers would at least share a core common vocabulary, which remains steady across all English speakers and all topics. As Marco says, that doesn't really happen in the Voynich text. The same words do often appear, but sometimes at vastly different frequencies. And the detail of word structure can vary, as bigram statistics attest.

It's almost as if we can definitely say a) that the whole text used the same script, and b) that the general structure of words is the same, but nothing beyond that. You could suggest that the Voynich is built from text in different languages but enciphered in the same way.

I found the webinar by Claire Bowern very interesting. Thankfully Claire is giving more details about the experiment to produce language-like gibberish. In the paper the following description was given: "We tested this point in an undergraduate class and found that beyond about 100 words, the task of writing language-like nonlanguage is very difficult. It is too easy to make local repetitions and words from other languages." (Bowern & Lindemann 2021, p. 289).

In the webinar Claire Bowern describes again that after writing some amount of text it becomes "Hard to get ideas for words". It might be hard to invent new words but it is "easy to repeat words". It is therefore reasonable to repeat words already written instead of inventing new ones. Claire now describes that the students started to recycle syllable shapes. For example one of the students used a lot of words similar to "kadaya" and also words similar to "gebuni". This observation does in fact match the repetitive text structure observed in the Voynich Manuscript. One observation for Currier A is that numerous word types similar to the most frequent types "daiin" and "chol" exist. It is even possible to point to text samples like "kol chol chol kor chal sho chol shodan" on f1r.P3.16 (see [link]) or "shol chol shoky okol sho chol shol chal shol chol chol shol ctaiin shos odan" on folio [link].
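To make "similar to" a bit more concrete, here is a minimal sketch that counts how many words of the second sample line are within one edit of "chol". It assumes plain Levenshtein edit distance as the similarity measure, which the post itself does not specify, and it treats the EVA transcription as ordinary characters. The function name `edit_distance` and the threshold of one edit are illustrative choices only.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character insertions,
    deletions or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


# Sample line quoted in the post (EVA transcription).
line = "shol chol shoky okol sho chol shol chal shol chol chol shol ctaiin shos odan"
words = line.split()

# Threshold of 1 edit is an assumption made for illustration.
near = [w for w in words if edit_distance(w, "chol") <= 1]
print(f"{len(near)} of {len(words)} words are within one edit of 'chol'")
```

A single line is of course far too small a sample to prove anything; the same count would have to be run over whole sections to say how typical such clusters are.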

(15-07-2021, 07:12 PM)Emma May Smith Wrote: It's almost as if we can definitely say a) that the whole text used the same script, and b) that the general structure of words is the same, but nothing beyond that. You could suggest that the Voynich is built from text in different languages but enciphered in the same way.

Thanks Emma May, I was wondering about this too, as I'm sure many have done: has someone (or several people) enciphered original source material, possibly in different languages? I would actually find this promising, particularly for those of us working on the images. For instance I have tentatively identified one image that originally would have been accompanied by text in Old French. Others have made compelling cases for images that would have originally been accompanied by Latin, Greek, Hebrew, Arabic and even Middle English text. I'm not suggesting all of these were used, of course, but there were some educated polyglots in those days who knew many languages.

So my question would be: would that person have translated into one common language before enciphering? And the answer, given what you've said, is likely not!

But do correct me if I'm wrong, or if what I've said here seems highly unlikely to you.

(16-07-2021, 12:04 AM)Torsten Wrote: I found the webinar by Claire Bowern very interesting. Thankfully Claire is giving more details about the experiment to produce language-like gibberish. In the paper the following description was given: "We tested this point in an undergraduate class and found that beyond about 100 words, the task of writing language-like nonlanguage is very difficult. It is too easy to make local repetitions and words from other languages." (Bowern & Lindemann 2021, p. 289).

In the webinar Claire Bowern describes again that after writing some amount of text it becomes "Hard to get ideas for words". It might be hard to invent new words but it is "easy to repeat words". It is therefore reasonable to repeat words already written instead of inventing new ones. Claire now describes that the students started to recycle syllable shapes. For example one of the students used a lot of words similar to "kadaya" and also words similar to "gebuni". This observation does in fact match the repetitive text structure observed in the Voynich Manuscript. One observation for Currier A is that numerous word types similar to the most frequent types "daiin" and "chol" exist. It is even possible to point to text samples like "kol chol chol kor chal sho chol shodan" on f1r.P3.16 (see [link]) or "shol chol shoky okol sho chol shol chal shol chol chol shol ctaiin shos odan" on folio [link].

Torsten, what is your own opinion on this? Could a complex numerical cipher account for that kind of repetition? Or could no numerical cipher account for it? I don't quite understand why we're told Voynichese has markers of a true language pattern, when the kind of results Claire reports above seem to me to clearly argue against a true underlying language. As does the entropy, and honestly just looking at the word patterns and structure, even if we take bigrams into account. So it has to be ciphered, but for me that raises the question: what is the basis for saying Voynichese has markers of a true language? Do they mean an enciphered true language, or was that conclusion based on these vords as if they were language? Because if the latter, I don't buy it. It is all quite confusing.

(15-07-2021, 01:28 PM)Barbrey Wrote: I did join the recent webinar and by Twitter afterwards asked Claire if a numerical cipher might change entropy. She said not normal substitution, but I asked if something a bit more complex might work. So say o was 1, but ox was 14, and or, 15. She seemed to think this might be better but the convo ended.

An example of a numeric cipher that reproduces some of the features of Voynichese is [link]. You start with a short dictionary of frequent words mapped to Roman numerals; when you find a word that is not in the dictionary, you add it, mapping it to the next number. When the word is already in the dictionary, you simply write the corresponding number.

For instance:
c plinii natvralis historiae liber secvndvs
is ciphered as
CCLI CCLII CCLIII CCLIV CCLV CCLVI
(here there are no frequent words, and all words are mapped to incremental numbers)
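The procedure can be sketched in a few lines of Python. This is only an illustration of the description above, not the linked author's implementation: the function names `to_roman` and `encipher` are made up here, and the seed dictionary is a hypothetical list of 250 placeholder words, chosen only because the example starts numbering new words at CCLI (251).

```python
def to_roman(n):
    """Convert a positive integer to a Roman numeral (subtractive notation)."""
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in numerals:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


def encipher(text, seed_words):
    """Replace each word by the Roman numeral of its dictionary number."""
    # Frequent words get the numbers 1..len(seed_words).
    codebook = {word: i for i, word in enumerate(seed_words, start=1)}
    next_number = len(seed_words) + 1
    output = []
    for word in text.split():
        if word not in codebook:
            # Unknown word: add it with the next free number.
            codebook[word] = next_number
            next_number += 1
        output.append(to_roman(codebook[word]))
    return " ".join(output)


# Hypothetical seed dictionary: 250 placeholder entries (an assumption).
seed = [f"w{i}" for i in range(1, 251)]
sample = "c plinii natvralis historiae liber secvndvs"
print(encipher(sample, seed))
```

A word that occurs again later in the text simply gets its earlier number back, which is where exact word repetitions in the ciphertext would come from.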

Roman numerals have a low character entropy (they have a rigid structure) and consecutive numbers often differ by just one character: both features are also observed in Voynichese. Several other features of Voynichese are not reproduced by this system, but it is very interesting research, in particular because it is so simple.
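For anyone who wants to check the entropy point on their own samples, here is a rough sketch. It computes the plain single-character Shannon entropy and the conditional entropy of a character given the previous one (the measure usually quoted for Voynichese). The function names are illustrative, and the two strings are just the short example above, so the figures it prints are only indicative; a real comparison needs much longer texts.

```python
import math
from collections import Counter


def char_entropy(text):
    """Shannon entropy in bits per character, from single-character counts."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def conditional_entropy(text):
    """Entropy in bits of a character given the character before it."""
    pairs = Counter(zip(text, text[1:]))   # counts of adjacent character pairs
    firsts = Counter(text[:-1])            # counts of the first member of each pair
    total = sum(pairs.values())
    h = 0.0
    for (a, _b), n in pairs.items():
        # n / total is P(a, b); n / firsts[a] is P(b | a).
        h -= n / total * math.log2(n / firsts[a])
    return h


latin = "c plinii natvralis historiae liber secvndvs"
roman = "CCLI CCLII CCLIII CCLIV CCLV CCLVI"
for label, sample in (("Latin", latin), ("Roman numerals", roman)):
    print(label,
          round(char_entropy(sample), 2),
          round(conditional_entropy(sample), 2))
```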

(15-07-2021, 01:28 PM)Barbrey Wrote: That maybe would help with the repeats. Daiin, for instance, might be 22, equal to some letter, say T, so Daiin Daiin would just be TT in the middle of a word.

When you drop the assumption that Voynich words correspond to plain-text words, the number of possible cipher systems explodes. A word-to-letter correspondence of course would be a many-to-one mapping (since there are thousands of Voynich word types and only a few tens of letters).

Something that I find a strong hint for a word-to-word correspondence is that [link]. These labels appear to correspond to individual objects. E.g. the 30 stars on f72v1 (Libra) are labelled with 30 words; about 90% of all labels consist of a single word.

(16-07-2021, 05:45 AM)Barbrey Wrote:
(15-07-2021, 07:12 PM)Emma May Smith Wrote: It's almost as if we can definitely say a) that the whole text used the same script, and b) that the general structure of words is the same, but nothing beyond that. You could suggest that the Voynich is built from text in different languages but enciphered in the same way.

Thanks Emma May, I was wondering about this too, as I'm sure many have done: has someone (or several people) enciphered original source material, possibly in different languages? I would actually find this promising, particularly for those of us working on the images. For instance I have tentatively identified one image that originally would have been accompanied by text in Old French. Others have made compelling cases for images that would have originally been accompanied by Latin, Greek, Hebrew, Arabic and even Middle English text. I'm not suggesting all of these were used, of course, but there were some educated polyglots in those days who knew many languages.

So my question would be: would that person have translated into one common language before enciphering? And the answer, given what you've said, is likely not!

But do correct me if I'm wrong, or if what I've said here seems highly unlikely to you.

I couldn't guess as to whether the "multiple languages but same encoding" hypothesis is correct. Other hypotheses, such as dialectal changes or encoding changes, could be equally valid. It's just that we might struggle to move beyond the whole question of what's causing the difference between scribes and sections, and how qualitatively different they are. We can say that [daiin] is common in one section but uncommon in another, but what does that ultimately mean?

I think that the best route into the question is to look for those words which sit outside the general structure. Even though [daiin] is less common in Currier B than A, the actual [aiin] construction appears to be equally valid in both. But some words are simply unusual: they have bigrams or structures which aren't found in more than a handful of words. They could represent: a) mistakes, b) words which couldn't be encoded properly, or c) words which are unusual in the underlying language. If they're in groups b) or c), then we have a lot to learn from them. A good example would be words with two gallows (in any configuration): they're not that rare (several hundred exist), but they are definitely unusual.

Anyway, we're quite off topic, so I'll stop here.

I think it makes more sense that there's a single language and varied encoding, because it seems harder to explain why there would be multiple languages, and easier to explain why encoding would be varied (shorter ciphertexts tend to be harder to crack). Of course, there could be both multiple languages and multiple encodings.

Since in the past people mostly wrote as they spoke, I must also consider the following.
Example:
In 2006, linguists studied a dialect. The locals assumed that it was an Italian dialect. The linguists explained that many of its words come from Slavic.
So we have a dialect whose original language is Romance (Latin), with a portion of Slavic and Italian, and this in a German-speaking area.
If I take that as a starting point, the encoding is not necessarily the hardest thing.

And that is exactly where some of the indications from the VM point.

Torsten

Emma May Smith

Torsten

Barbrey

Barbrey

MarcoP

Emma May Smith

byatan

Aga Tentakulus