(09-08-2020, 01:12 PM)Anton Wrote: Also, I think that Hungarian is by no means a Slavic language.
It's circled, along with nearby non-Slavic Greek and Albanian, meaning that it's near the Slavic languages on the plot, but isn't actually Slavic.
EDIT: Darrin's already confirmed this.
Yes, I just was slow in interpreting those circles...

Sure. I can use a simple set of linux sed commands to make the substitutions on Voynich and re-run the scenarios above. Will let you know how it goes.
Note that the letters in all samples have been shifted to lower case, so what do you want to do about capital C in scenario 1 and 3?
I converted everything to lower case first, then ran the comparison. Did the same thing for all languages so they all use 26 letters and a space. My base text is the Takeshi Takahashi (TT) transliteration.
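For anyone who wants to reproduce that preprocessing step, here is a minimal sketch of one way to reduce a text to lowercase a-z plus a space. The helper name normalize_text is hypothetical, and treating every non-letter (digits, punctuation, non-ASCII bytes) as a space is an assumption, not necessarily what was done for the plots above.

```shell
#!/bin/sh
# Sketch only: reduce any text to lowercase a-z plus single spaces.
# normalize_text is a hypothetical helper, not the code used in this thread.
# Step 1 folds A-Z onto a-z.
# Step 2 turns every byte that is not a-z or a newline into a space.
# Step 3 squeezes runs of spaces into one space.
normalize_text() {
    LC_ALL=C tr '[:upper:]' '[:lower:]' |
    LC_ALL=C tr -c 'a-z\n' ' ' |
    LC_ALL=C tr -s ' '
}

# EVA capitals are folded as well, so Sh and sh become indistinguishable.
printf 'Sh cTh, ok\n' | normalize_text   # prints: sh cth ok
```

Because the fold is applied blindly, any distinction that a transliteration encodes through capital letters is lost at this step.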
I skipped over ligatures and capitalization in my first pass. Just used the basic EVA letters in table 3. That probably mangled the structure a bit.
I just noted "Often, the characters Sh, cTh, cKh, cPh and cFh are simply written as sh, cth, ckh, cph and cfh respectively."
How would you suggest dealing with ligatures to get everything into lower case?
dv, I need to think about this.
Converting EVA to lowercase collapses the character set, which will no doubt influence the outcome. EVA mapping is different from uppercase and lowercase in a traditional alphabet.
EVA-Sh is quite different from EVA-sh. It's not simply a lowercase version of the same thing, as it would be in English. And it's not a simple variation of the same shape (that's an artifact of the font design: the "cap" over EVA-Sh is often disconnected from the glyph beneath it, so it's possible it is three chars in Voynichese).
EVA-s often stands alone and has certain positional characteristics that are not specifically linked to EVA-h. In contrast, EVA-S doesn't typically stand alone and it is positionally linked (as a ligature) to EVA-h.
Give me an hour to mull this over (or maybe someone else has a suggestion). Even if you don't try the scenarios I suggested (I hope you will), you DO need to fix this problem in your original analysis.
Hello Darrin,
the hypervectors you use look a lot like the PRN codes used in Global Navigation Satellite Systems (GNSS). These are designed exactly for the purpose of being 'orthogonal', or 'uncorrelated' in GNSS terminology.
I understand that in your application the letter 'a' is always represented by the same hypervector, in all languages.
This is a bit of a problem, of course, especially when dealing with the Voynich MS, but also for languages like Arabic, Hebrew and even Russian.
While JKP doesn't like that the Voynich character EVA-e is represented by an 'e', this is exactly as valid (or invalid) as using a 'c' to represent it.
For your Voynich experiment, I would also recommend using a file in either the Cuva or the FSG alphabet. The former is already alphabetic, the latter can probably be made alphabetic.
An FSG file can be found on the same page to which you already put a link.
A Cuva file can be generated easily using 'sed'. I enclose one that would do it.
s/iiin/NN/g
s/iin/M/g
s/in/N/g
s/iii/M/g
s/ii/N/g
s/i/I/g
s/cth/TS/g
s/ckh/KS/g
s/cfh/FS/g
s/cph/PS/g
s/sh/Z/g
s/ch/S/g
s/eeee/UU/g
s/ee/U/g
s/e/E/g
s/a/A/g
s/f/F/g
s/k/K/g
s/l/L/g
s/m/J/g
s/n/I/g
s/o/O/g
s/p/P/g
s/r/R/g
s/t/T/g
s/y/Y/g
s/d/D/g
s/g/G/g
s/j/Q/g
s/q/H/g
s/s/C/g
s/c'/C/g
s/'//g
s/c/E/g
s/h/E/g
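The order of the rules matters, because longer sequences have to be replaced before the shorter ones they contain. The sketch below illustrates this with only the first six rules of the list; the function names are hypothetical and the sample words are just short test strings.

```shell
#!/bin/sh
# Sketch only: a six-rule subset of the substitutions, to show rule ordering.
# Both function names are hypothetical.

# Longest sequences first, in the same order as the list above.
i_rules_ordered() {
    sed -e 's/iiin/NN/g' -e 's/iin/M/g' -e 's/in/N/g' \
        -e 's/iii/M/g' -e 's/ii/N/g' -e 's/i/I/g'
}

# The same rules with the single-letter rule moved to the front:
# every i is consumed before the longer patterns get a chance to match.
i_rules_misordered() {
    sed -e 's/i/I/g' -e 's/iiin/NN/g' -e 's/iin/M/g' \
        -e 's/in/N/g' -e 's/iii/M/g' -e 's/ii/N/g'
}

printf 'aiin ain aiiin\n' | i_rules_ordered      # prints: aM aN aNN
printf 'aiin ain aiiin\n' | i_rules_misordered   # prints: aIIn aIn aIIIn
```

To run the complete list, one option is to save it to a file (say cuva.sed, a made-up name) and call sed -f cuva.sed on the transliteration file. The rules are case sensitive, so the uppercase output of an earlier rule is not touched by a later one.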
I've been giving this some thought.
Converting to lowercase would clobber (overwrite/obliterate) 11 characters, many of which are quite important. In four instances it means overwriting common characters (lowercase) with rare characters (uppercase). Actually, it depends on how you are doing it. If you are simply adding the uppercase to the lowercase then it would be adding a few common characters that don't exist... but... that doesn't solve the problem in the case of the benched chars EVA-P and EVA-T. They would be lost, subsumed into EVA-p and EVA-t (the same with EVA-k) and the distinction between benched and unbenched might be important (they are not rare chars).
The EVA-u char is very rare. It doesn't have to be included in an analysis of the overall text. That makes one slot available that is not going to skew results. The same for EVA-z. It occurs only three or so times in the whole manuscript. That's two slots. EVA-O (uppercase) isn't needed either.
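If you want to check how rare a given character is in your own copy of the transliteration before freeing up its slot, something like the following would do. It is only a sketch: the function names are hypothetical, and it assumes the input is plain transliteration text with locus tags and comments already stripped, and that it has not yet been converted to lowercase.

```shell
#!/bin/sh
# Sketch only; both helpers are hypothetical names.

# count_char CHAR: count occurrences of one literal character on stdin.
# CHAR must be a single plain character (tr would read 'a-z' as a range).
# The final tr removes the padding that some wc versions print.
count_char() {
    LC_ALL=C tr -cd "$1" | wc -c | tr -d ' '
}

# char_table: every character with its count, most frequent first.
# Relies on grep -o, which GNU and BSD grep provide.
char_table() {
    LC_ALL=C grep -o . | sort | uniq -c | sort -rn
}

printf 'daiin chedy\n' | count_char d   # prints: 2
```

Both helpers are case sensitive, so the counts for an uppercase EVA character and its lowercase counterpart come out separately.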
Would you like me to post an alternate mapping that would enable you to collapse to lowercase without these problems?