The Voynich Ninja


(09-08-2020, 01:12 PM)Anton Wrote: Also, I think that Hungarian is by no means a Slavic language.

It's circled, along with nearby non-Slavic Greek and Albanian, meaning that it's near the Slavic languages on the plot, but isn't actually Slavic.

EDIT: Darrin's already confirmed this.

Yes, I just was slow in interpreting those circles... Rolleyes

(09-08-2020, 09:57 PM)dvallis Wrote:

Here you go.

Full details of my implementation are at this link: [link not available]

Thank you. This is very helpful.

Okay... I could follow this through and replicate it but... I am always strapped for time and I frequently have to work weekends (I am running a business and global commerce means customers message me every hour of the day and night) so... since you already have this set up, could you try this? The results might be helpful...

Take your plaintext VMS file and make the following adjustments (this more closely simulates the shapes of VMS glyphs)...

Scenario #1:

Change EVA-c (the long-cee) to Capital C (this makes space for the next transformation so it doesn't over-write the first one)
Change EVA-e to lowercase c
Change EVA-v to the caret symbol
Change EVA-n to v

These are minimal substitutions that don't alter anything in terms of representing VMS glyphs, but they might provide a feel for how much the transliteration system itself influences where VMS falls in the graph (it may only move slightly, but it will help to visualize this mathematically). Think of it as a simple baseline check.
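A minimal sketch of Scenario #1, assuming the transliteration is held as a plain Python string and that the EVA letters named above are the literal characters c, e, v and n in the file. The names `SCENARIO_1` and `apply_scenario_1` are hypothetical. A translation table applies all four changes in one pass, so no substitution can overwrite the result of another, whatever order they are listed in.

```python
# Scenario #1 as a single-pass character translation.
# Assumption: the EVA letters are the literal characters c, e, v, n.
SCENARIO_1 = str.maketrans({
    "c": "C",  # EVA-c -> capital C
    "e": "c",  # EVA-e -> lowercase c
    "v": "^",  # EVA-v -> caret
    "n": "v",  # EVA-n -> v
})


def apply_scenario_1(text: str) -> str:
    """Apply all four substitutions simultaneously."""
    return text.translate(SCENARIO_1)


if __name__ == "__main__":
    # Made-up sample string, used only to show the effect.
    print(apply_scenario_1("chedy qokeey daiin"))  # Chcdy qokccy daiiv
```

If the file is lowercased afterwards, the capital C would merge with the new lowercase c, so this step would have to run after any case folding or use a different placeholder.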

Scenario #2:

I don't think we can ignore the fact that there are 9 VMS glyphs that could potentially represent numbers as they were drawn in the medieval period (o i s q l r v d y). This is without counting the possibility of numbers in the style of Roman numerals (constructed with minims, etc.). But your study is primarily linguistic, so we'll set that aside for now.

Scenario #3:

This is more theoretical, but since a large number of solvers have described the VMS text as Latin, and since it includes many Latin characters of the same shapes and positions, here is something that might be worth trying just to see where it ends up on the graph. It rests on the assumption that the VMS glyphs that are shaped as abbreviations actually are abbreviations and need to be expanded in order to represent all the letters.

Change EVA-c (the long-cee) to Capital C (this makes space for the next transformation so it doesn't over-write the first one)
Change EVA-e to lowercase c
Change EVA-v to the caret symbol
Change EVA-n to v
(The above are the same as scenario #1)

Change EVA-r to er
Change EVA-m to ris
Change EVA-g to cis
Change EVA-q to con
Change EVA-s to cir
Change EVA-p to pro
Change EVA-f to per
Change EVA-y to um

This is a simplistic mapping (and even if the VMS has abbreviations, they are not necessarily mapped the same way). In actual fact, Latin abbreviations (which were similar in most western languages) have more than one expansion (they are very flexible), but since the expansions tend to be similar in terms of which letters they represent, this might tease out a pattern.
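A rough sketch of Scenario #3, under the same assumptions as before (plain string input, EVA letters as literal characters; all function names are hypothetical). Because several expansions introduce letters that are themselves on the list (for example "ris" contains s, which is mapped to "cir"), a chain of sequential replacements can re-expand its own output. The second function is included only to show that difference.

```python
# Scenario #3: Scenario #1 plus the abbreviation expansions, in one pass.
SCENARIO_3 = str.maketrans({
    "c": "C", "e": "c", "v": "^", "n": "v",  # same as Scenario #1
    "r": "er", "m": "ris", "g": "cis", "q": "con",
    "s": "cir", "p": "pro", "f": "per", "y": "um",
})

# The same rules as an ordered list, for the sequential comparison.
ORDERED_RULES = [
    ("c", "C"), ("e", "c"), ("v", "^"), ("n", "v"),
    ("r", "er"), ("m", "ris"), ("g", "cis"), ("q", "con"),
    ("s", "cir"), ("p", "pro"), ("f", "per"), ("y", "um"),
]


def apply_scenario_3(text: str) -> str:
    """Expand every glyph exactly once, based on the original text."""
    return text.translate(SCENARIO_3)


def apply_sequentially(text: str) -> str:
    """Chained replacements in the listed order (comparison only)."""
    for old, new in ORDERED_RULES:
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    print(apply_scenario_3("dam"))    # daris
    print(apply_sequentially("dam"))  # daricir (the s from "ris" is expanded again)
```

Chained sed substitutions behave like `apply_sequentially`, so the rule order (or a set of temporary placeholder characters) would need some care there.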

Once again, thank you for the fuller explanation of your methodology. I have always been interested in the many aspects of data visualization (and was fractal-obsessed in my university years) but have not had much time to pursue it.

Sure. I can use a simple set of linux sed commands to make the substitutions on Voynich and re-run the scenarios above. Will let you know how it goes.

Note that the letters in all samples have been shifted to lower case, so what do you want to do about capital C in scenario 1 and 3?

(10-08-2020, 03:25 AM)dvallis Wrote:
Sure. I can use a simple set of linux sed commands to make the substitutions on Voynich and re-run the scenarios above. Will let you know how it goes.

Note that the letters in all samples have been shifted to lower case, so what do you want to do about capital C in scenario 1 and 3?

Some of the EVA correspondences are mapped to upper-case. How did you deal with those when you ran your original analysis? EVA-p (p) and EVA-P (P) are different.

I converted everything to lower case first, then ran the comparison. Did the same thing for all languages so they all use 26 letters and a space. My base text is the Takeshi Takahashi (TT) transliteration.
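For illustration, here is one way the "26 letters and a space" normalization could look. This is a guess at the general idea, not the actual preprocessing: the thread does not say how punctuation, digits or accented letters were handled, and the function name `normalize` is hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text, then keep only a-z and single spaces."""
    lowered = text.lower()
    # Assumption: every run of non-letters (punctuation, digits,
    # whitespace) is replaced by one space.
    letters_only = re.sub(r"[^a-z]+", " ", lowered)
    return letters_only.strip()


if __name__ == "__main__":
    print(normalize("Sh.ol, cThy!"))  # sh ol cthy
```

Note that the call to `lower()` is exactly where EVA-Sh and EVA-sh become indistinguishable.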

I skipped over ligatures and capitalization in my first pass. Just used the basic EVA letters in table 3. That probably distorted the structure a bit.


I just noted "Often, the characters Sh, cTh, cKh, cPh and cFh are simply written as sh, cth, ckh, cph and cfh respectively."

How would you suggest dealing with ligatures to get everything into lower case?

dv, I need to think about this.

Converting EVA to lowercase collapses the character set, which will no doubt influence the outcome. In EVA, uppercase and lowercase do not relate to each other the way they do in a traditional alphabet.

EVA-Sh is quite different from EVA-sh. It's not simply a lowercase version of the same thing, as it would be in English. And it's not a simple variation of the same shape (that's an artifact of the font design... the "cap" over EVA-Sh is often disconnected from the glyph beneath it, so it's possible it is three chars in Voynichese).

EVA-s often stands alone and has certain positional characteristics that are not specifically linked to EVA-h. In contrast, EVA-S doesn't typically stand alone and it is positionally linked (as a ligature) to EVA-h.

Give me an hour to mull this over (or maybe someone else has a suggestion). Even if you don't try the scenarios I suggested (I hope you will), you DO need to fix this problem in your original analysis.

Hello Darrin,

the hypervectors you use look a lot like the PRN codes used in Global Navigation Satellite Systems. These are designed exactly for the purpose of being 'orthogonal', or uncorrelated in GNSS terminology.

I understand that in your application the letter 'a' is always represented by the same hypervector, in all languages.
This is a bit of a problem, of course, especially when dealing with the Voynich MS, but also for languages like Arabic, Hebrew and even Russian.
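To make the 'orthogonal' property concrete, here is a small sketch that assumes each symbol gets a fixed random vector of +1/-1 entries. The dimensionality, the seed and all names are illustrative choices, not details taken from Darrin's implementation.

```python
import random

DIM = 10_000  # illustrative dimensionality, not taken from the thread
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def random_hypervector(rng, dim=DIM):
    """One random +1/-1 vector."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def cosine(u, v):
    """Cosine similarity; for +1/-1 vectors both norms are sqrt(dim)."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


rng = random.Random(0)  # fixed seed so the table is reproducible
letters = {ch: random_hypervector(rng) for ch in ALPHABET}

if __name__ == "__main__":
    print(cosine(letters["a"], letters["a"]))  # identical vectors: 1.0
    print(cosine(letters["a"], letters["b"]))  # close to 0
```

The table is keyed by the character itself, which is the point raised above: whatever a transliteration chooses to call 'a' receives the vector for 'a', regardless of the script or language behind it.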

While JKP doesn't like that the Voynich character e is represented by an 'e', this is exactly as valid (or invalid) as using a 'c' to represent it.

For your Voynich experiment, I would also recommend using a file in either the Cuva or the FSG alphabet. The former is already alphabetic; the latter can probably be made alphabetic.

An FSG file can be found on the same page to which you already put a link.

A Cuva file can be generated easily using 'sed'. I enclose one that would do it.


I've been giving this some thought.

Converting to lowercase would clobber (overwrite/obliterate) 11 characters, many of which are quite important. In four instances it means overwriting common characters (lowercase) with rare characters (uppercase). Actually, it depends on how you are doing it. If you are simply adding the uppercase to the lowercase then it would be adding a few common characters that don't exist... but... that doesn't solve the problem in the case of the benched chars EVA-P and EVA-T. They would be lost, subsumed into EVA-p and EVA-t (the same with EVA-k) and the distinction between benched and unbenched might be important (they are not rare chars).
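The two cases described here (an uppercase character merging into an existing lowercase one, versus simply landing in an unused slot) can be checked mechanically. A small sketch, assuming the transliteration is a plain string; the function name `case_collisions` and the sample string are made up for illustration.

```python
def case_collisions(text: str) -> dict:
    """For each uppercase letter in `text`, report whether its lowercase
    form also occurs, i.e. whether lowercasing would merge two characters."""
    present = set(text)
    return {
        ch: ch.lower() in present
        for ch in sorted(present)
        if ch.isupper()
    }


if __name__ == "__main__":
    # Made-up sample: S collides with s, T has no lowercase t to merge with.
    print(case_collisions("Shol cThy shey"))  # {'S': True, 'T': False}
```

Run against the full transliteration file, the `True` entries would be the characters that get subsumed, and the `False` entries the ones that are only renamed.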

The EVA-u char is very rare. It doesn't have to be included in an analysis of the overall text. That makes one slot available that is not going to skew results. The same for EVA-z. It occurs only three or so times in the whole manuscript. That's two slots. EVA-O (uppercase) isn't needed either.

Would you like me to post an alternate mapping that would enable you to collapse to lowercase without these problems?

(10-08-2020, 07:10 AM)ReneZ Wrote:
...
While JKP doesn't like that the Voynich character e is represented by an 'e', this is exactly as valid (or invalid) as using a 'c' to represent it...

By itself, this might be true, but if you are comparing to Language A with a tri-pattern of acc and Language B with a tri-pattern of lee, you will get different results depending on how it is transliterated. EVA-e is a logical one to use for experimentation. It is prevalent and concatenated occurrences of "e" and "c" characters occur in varying frequencies in different languages.
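A toy illustration of the tri-pattern point, using made-up strings rather than real manuscript lines or real language samples. It counts how many three-character patterns in a sample also occur in a reference text; the helper names are hypothetical and this simple overlap count is not the comparison method used in the study.

```python
from collections import Counter


def trigram_counts(text: str) -> Counter:
    """Count every overlapping three-character pattern in `text`."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def shared_trigrams(sample: str, reference: str) -> int:
    """Number of trigram occurrences in `sample` whose pattern
    also occurs somewhere in `reference`."""
    ref = trigram_counts(reference)
    return sum(n for tri, n in trigram_counts(sample).items() if tri in ref)


if __name__ == "__main__":
    as_e = "daeey oleey"  # toy text with the glyph written as 'e'
    as_c = "daccy olccy"  # the same toy text with the glyph written as 'c'
    print(shared_trigrams(as_e, "fleet"), shared_trigrams(as_c, "fleet"))    # 1 0
    print(shared_trigrams(as_e, "accord"), shared_trigrams(as_c, "accord"))  # 0 1
```

The same underlying text matches the "lee" reference under one transliteration and the "acc" reference under the other, which is the effect the proposed scenarios are meant to measure on the real plot.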

I want to see how far it will move on the graph if a small number of these glyphs were changed. It would help us visualize the extent to which the EVA mapping to plaintext influences language affinity (how to interpret ligatures is a whole other and more complicated subject).


Posters on this page, in order: DonaldFisk, Anton, -JKP-, dvallis, -JKP-, dvallis, -JKP-, ReneZ, -JKP-, -JKP-