The Voynich Ninja
Transliteration and interpretation issues - Printable Version




Transliteration and interpretation issues - ReneZ - 14-05-2020

(13-05-2020, 11:49 PM)-JKP- Wrote: EVA converts the c-shapes to e-shapes for mnemonic purposes. Mnemonic text systems almost always depend on converting the VMS glyphs to something that is closer to natural language so it can be more easily remembered. To put it simply: EVA CHANGES Voynichese shapes to be closer to natural language. Many computational attacks assume the vowel-like shapes are vowels.

While one has to be careful doing numerical analysis using Eva transliterations, this argument is not valid, I am afraid.

Transliteration is not about shapes. It is about rendering the handwriting in a computer-readable form.

We simply don't know if e was meant to represent a consonant. Saying that Eva changes it means that one might assume that it was. We don't even know if it is a complete plain text symbol or phoneme. It could just represent a minim in another script, or a diacritic. We don't know if ee is the same as two e's and eee is the same as three e's.
If e is a consonant, these strings are very hard to explain.
(Note that Dutch allows from 1-4 e's in a row, but I am of course not saying that it is Dutch).

Secondly, this particular analysis makes no assumption about whether any symbol is a vowel or a consonant. All are treated equally.


RE: Transliteration and interpretation issues - -JKP- - 14-05-2020

(14-05-2020, 06:27 AM)ReneZ Wrote:
(13-05-2020, 11:49 PM)-JKP- Wrote: EVA converts the c-shapes to e-shapes for mnemonic purposes. Mnemonic text systems almost always depend on converting the VMS glyphs to something that is closer to natural language so it can be more easily remembered. To put it simply: EVA CHANGES Voynichese shapes to be closer to natural language. Many computational attacks assume the vowel-like shapes are vowels.

While one has to be careful doing numerical analysis using Eva transliterations, this argument is not valid, I am afraid.

Transliteration is not about shapes. It is about rendering the handwriting in a computer-readable form.


Transliteration is about shapes. Why were most of the transliteration shapes chosen to be similar to Voynichese glyphs? If it was not about shapes, there are many better systems that are good for computers and not good for people. I'm a software developer, I know what kinds of characters are conducive to computational attacks. The selection of plaintext characters for EVA obviously took alphabetic shapes into consideration. There's no point in pretending it didn't.

But it's inconsistent in that regard.

The Voynichese c-shape was changed to "e" instead of "c"? Plaintext e has the same "value" as a transliteration character as plaintext c, so I assume the change was for mnemonic reasons? For ease of typing? Because the shape was easier for humans to remember?

Unfortunately this hinders research.

People make assumptions about Voynichese based on the transliteration. If a c-shape is changed to an e-shape (both shapes have equal value as plaintext), it clearly DOES affect human perception. It shouldn't, but it does. The fault lies with the researchers, but that doesn't mean we can't try to do things that reduce this kind of misperception.

My criticism was not of EVA, my criticism was of the researchers' assumptions about Voynichese based on their use of EVA transcription systems.


Whenever I point this out, you react as though I am criticizing the EVA system. Primarily I am criticizing the researchers. The thread was not about EVA, it was about flawed computational attacks on Voynichese (of which there are many).


I don't think we should be ignoring the fact that the transliteration system affects human perception of Voynichese. I usually only bring up subjects like this because I think it can be done better, not because I like to complain.


RE: Transliteration and interpretation issues - -JKP- - 14-05-2020

ReneZ Wrote: We simply don't know if e was meant to represent a consonant. Saying that Eva changes it means that one might assume that it was. We don't even know if it is a complete plain text symbol or phoneme. It could just represent a minim in another script, or a diacritic. We don't know if ee is the same as two e's and eee is the same as three e's.


Of course we don't know if e was meant to represent a consonant. We don't even know if it is meant to represent a letter, but what we DO know is that the choice of the "e" shape for the Voynichese "c" shape affects people's PERCEPTIONS of Voynichese.


Every time I have criticized this choice, it has been in the context of computational attacks, an environment in which a high proportion of researchers ASSUME "e" is a vowel. I've also seen people say with strong conviction on the forum that they think it is a vowel. By transliterating "c" with a vowel, we create the ILLUSION that Voynichese is comprised of words with a natural consonant-vowel balance.


Yes, I know, I know... they shouldn't assume that. I don't assume that, but go back through all the computational attacks and what do you find? A huge proportion make exactly that assumption, just as a huge proportion assume VMS tokens are words.


RE: Transliteration and interpretation issues - ReneZ - 14-05-2020

(14-05-2020, 07:11 AM)-JKP- Wrote: it has been in the context of computational attacks, an environment in which a high proportion of researchers ASSUME "e" is a vowel

I am not aware of any examples of this in computational attacks. However, it may be the case for 'interpretative' analyses, which is why I put this in the subject of this post.

But should d and y then be interpreted as numbers?
Essentially all other transliteration alphabets use 8 and 9 for these.

v101 uses 1 for ch and 2 for Sh ...

etc etc


RE: Transliteration and interpretation issues - Koen G - 14-05-2020

Isn't there a problem with glyph-level (as opposed to "word" level) attacks that simply use EVA as input? For example those that detect consonant-vowel alternation patterns?

That said, I'm not sure if we would have advanced any further if the mainstream transliteration system was one that's unpronounceable.
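As an illustration of such a glyph-level attack, here is a minimal sketch of Sukhotin's algorithm, a classic vowel/consonant classifier that works purely from adjacency counts. The thread does not name a specific method, so the choice of algorithm is an assumption, as are the function name and the short word list (common EVA word forms picked for illustration, not a real line of the manuscript).

```python
from collections import defaultdict

def sukhotin_vowels(words):
    """Return the symbols that Sukhotin's algorithm classifies as vowels."""
    symbols = []               # symbols in order of first appearance
    adj = defaultdict(int)     # symmetric adjacency counts
    for word in words:
        for ch in word:
            if ch not in symbols:
                symbols.append(ch)
        for a, b in zip(word, word[1:]):
            if a != b:         # the diagonal of the matrix stays at zero
                adj[(a, b)] += 1
                adj[(b, a)] += 1
    # Row sums of the adjacency matrix; every symbol starts as a consonant.
    sums = {s: sum(adj[(s, t)] for t in symbols) for s in symbols}
    vowels = set()
    while True:
        candidates = [s for s in symbols if s not in vowels]
        if not candidates:
            break
        # max() returns the first maximal item, so ties are broken by
        # order of first appearance, never by the label itself.
        best = max(candidates, key=lambda s: sums[s])
        if sums[best] <= 0:
            break
        vowels.add(best)
        for s in candidates:
            if s != best:
                sums[s] -= 2 * adj[(s, best)]
    return vowels

# Illustrative sample only (assumption): common EVA word forms.
sample = ["daiin", "chedy", "shedy", "qokeedy", "chol", "ol", "qokaiin", "okeey"]
# The same text with EVA "e" replaced by an arbitrary label.
relabelled = [w.replace("e", "#") for w in sample]

vowels_eva = sukhotin_vowels(sample)
vowels_relabelled = sukhotin_vowels(relabelled)
```

In this sketch the two runs give the same partition up to the renaming, because only the adjacency counts enter the calculation. What the input does fix is the segmentation: whether "ch" or "ee" counts as one unit or two changes the counts, and that is a property of every transliteration alphabet, not only of EVA.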


RE: Transliteration and interpretation issues - ReneZ - 14-05-2020

(14-05-2020, 09:23 AM)Koen G Wrote: Isn't there a problem with glyph-level (as opposed to "word" level) attacks that simply use EVA as input? For example those that detect consonant-vowel alternation patterns?

Yes, there is, but the same problem exists with all other transliterations.

This is also why I have lately used several alphabets in parallel, as this makes it possible to show the impact of the choice of alphabet.
The best way to handle this problem in general is to realise (and state) that results are based on certain assumptions or conditions.

A typical case is the word length distribution. The result depends on the alphabet used, so it should be stated.
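A minimal sketch of this dependence, assuming a simple longest-match tokenizer: the same words are measured once character by character, as in basic Eva, and once with "ch" and "sh" counted as single glyphs (v101, for instance, uses one character for each). The function names and the sample word list are hypothetical, chosen only for illustration.

```python
from collections import Counter

def tokenize(word, units=()):
    """Split a word into glyph units, preferring the longest matching unit."""
    ordered = sorted(units, key=len, reverse=True)
    glyphs, i = [], 0
    while i < len(word):
        for unit in ordered:
            if word.startswith(unit, i):
                glyphs.append(unit)
                i += len(unit)
                break
        else:
            # No multi-character unit matches here: take one character.
            glyphs.append(word[i])
            i += 1
    return glyphs

def length_distribution(words, units=()):
    """Count how many words have each length, measured in glyph units."""
    return Counter(len(tokenize(w, units)) for w in words)

# Illustrative sample only (assumption): common EVA word forms.
sample = ["daiin", "chedy", "shedy", "qokeedy", "chol", "ol"]

per_character = length_distribution(sample)
merged = length_distribution(sample, units=("ch", "sh"))

print(sorted(per_character.items()))  # [(2, 1), (4, 1), (5, 3), (7, 1)]
print(sorted(merged.items()))         # [(2, 1), (3, 1), (4, 2), (5, 1), (7, 1)]
```

The text is identical in both runs; only the definition of a unit differs, which is why the alphabet should be stated alongside the result.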

Word-level statistics (e.g. repeating strings) also depend on the transliteration used. In v101 you will find fewer than in extended Eva, while in Currier or basic Eva you will find more.
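The effect can be sketched with a toy example. Assume a fine-grained alphabet that distinguishes two variants of one glyph, written here as "a" and "A", and a coarser alphabet that merges them. The symbol "A", the word list, and the function name are hypothetical and do not belong to any real transliteration alphabet.

```python
def adjacent_repeats(words):
    """Count positions where a word is immediately followed by itself."""
    return sum(1 for a, b in zip(words, words[1:]) if a == b)

# Hypothetical fine-grained transliteration (assumption): "A" is a
# variant form of "a" that the coarser alphabet does not distinguish.
fine = ["dain", "dAin", "chol", "chol", "dAin", "dAin"]

# Coarser alphabet: both variants map to the same symbol.
merge = str.maketrans("A", "a")
coarse = [w.translate(merge) for w in fine]

print(adjacent_repeats(fine), len(set(fine)))      # 2 3
print(adjacent_repeats(coarse), len(set(coarse)))  # 3 2
```

The more distinctions an alphabet makes, the more word types it yields and the fewer exact repeats it reports for the same stretch of text.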


RE: Transliteration and interpretation issues - MarcoP - 14-05-2020

(14-05-2020, 06:40 AM)-JKP- Wrote: The Voynichese c-shape was changed to "e" instead of "c"? Plaintext e has the same "value" as a transliteration character as plaintext c, so I assume the change was for mnemonic reasons? For ease of typing? Because the shape was easier for humans to remember?

Unfortunately this hinders research.

In my opinion, research makes progress when:
  • skilled people work on the subject;
  • they share their results.

EVA is neutral for the first point and very effective for the second.

EVA obviously is not a problem for skilled researchers. They understand what it is; then they use it properly (e.g. Timm and Schinner) or go for another option (e.g. Lisa Fagin Davis).

On the other hand, the existence of a standard makes it easy to share results. In particular, the fact that EVA is pronounceable and easy to learn (because Voynich characters are mapped into similar Latin characters) makes it accessible not only to skilled people but to everyone, including myself, who could never memorize any other transliteration system (I must continually refer to a table when reading, say, something in Currier's system). EVA makes it possible to reach as wide an audience as possible. This forum is itself evidence of how nicely things can work in that respect.

If someone who is not committed enough or smart enough to understand EVA makes a wrong assumption, this really has no impact on the advancement of the field. The fact that unskilled people cannot contribute is totally unavoidable.


RE: Transliteration and interpretation issues - Mark Knowles - 14-05-2020

(14-05-2020, 09:59 AM)ReneZ Wrote: This is also why I have lately used several alphabets in parallel, as this makes it possible to show the impact of the choice of alphabet.

...

Word-level statistics (e.g. repeating strings) also depend on the transliteration used. In v101 you will find fewer than in extended Eva, while in Currier or basic Eva you will find more.

Yes, I think the key is not necessarily putting all one's eggs in one basket. It seems to me that different alphabets have different advantages and disadvantages in different contexts, so one alphabet (or set of alphabets) may be preferable in one situation and another in a different one.


RE: Transliteration and interpretation issues - Anton - 14-05-2020

(14-05-2020, 06:27 AM)ReneZ Wrote: Note that Dutch allows from 1-4 e's in a row

Four e's in a row? Dutch beats Russian, we have only three to my knowledge. But Russian can have six consonants in a row, and that word is anecdotic.

German is the consonants champion, with such words as Schmetterlingsschwimmen.


RE: Transliteration and interpretation issues - Koen G - 14-05-2020

Dutch is similar when it comes to consonants. A famous example is "angstschreeuw" (scream of fear). Apparently less common words like "slechtstschrijvend" can get even more consonants. With the previous spelling rules, "koeieuier" was the word with the longest sequence of vowels. Since 1996 the most is six, "zaaiuien". 

It must be noted that a diaeresis or a hyphen is always used to break up sequences of more than two of the same vowel: either a diaeresis, as in "tweeën", or "-", as in "zee-eend". So a transcription by someone who does not know the writing system may not result in four identical characters in sequence.