The Voynich Ninja

Full Version: Hypervector Analysis of the Voynich Manuscript
(09-08-2020, 12:57 PM)RobGea Wrote: Hi dvallis,
thanks for sharing your project.
If you could explain more about the methodology you used
and address the points made by Alin_J and ReneZ, that would be great.

For those interested, here's a nice little GitHub repo for hyperdimensional computing projects:
[link unavailable]

Off Topic: Cool that you included Sumerian and interesting where it appears on the plot.

Sure. I'll write up a follow-on article with all the methodology.
(09-08-2020, 11:52 AM)ReneZ Wrote: Thanks Darrin for this interesting analysis.

I read the linked page, and it leaves me with a number of questions.

What exactly do you mean with:

Quote:Languages in non-Latin characters were machine transliterated before normalization.

If one uses Unicode, there is no longer any issue with non-Latin characters. One can still apply the language rules to translate upper case to lower case (for languages that use two cases).

Hi Rene

I used a Linux package called unidecode from GitHub. See the description below. I definitely need to write the "one click down" article, as you guys have a lot of excellent questions.

Description:
Text::Unidecode provides a function, `unidecode(...)`, that takes Unicode data and tries to represent it in US-ASCII characters (i.e., the universally displayable characters between 0x00 and 0x7F). The representation is almost always an attempt at *transliteration* -- i.e., conveying, in Roman letters, the pronunciation expressed by the text in some other writing system.
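
To make the normalization step concrete, here is a minimal Python sketch. It is not the Unidecode package described above: it uses only the standard library's `unicodedata` module to fold accented Latin letters onto a-z, and it simply drops anything it cannot map. Text in a non-Latin script (Cyrillic, Greek, and so on) would come out empty here, which is exactly the case where a real transliteration tool is needed. The function name `normalize` and the 27-symbol target alphabet (a-z plus space) are assumptions for illustration.

```python
import unicodedata

# Assumed target alphabet: 26 lowercase letters plus the space character.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz ")


def normalize(text):
    """Lowercase, strip diacritics, and keep only a-z and single spaces."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text.lower())
    # Drop the combining marks, leaving the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything outside the target alphabet becomes a space.
    kept = "".join(ch if ch in ALLOWED else " " for ch in stripped)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(kept.split())


print(normalize("Déjà vu, naïve café!"))  # prints: deja vu naive cafe
```

Stripping diacritics is a cruder operation than transliteration, so this only stands in for the step on Latin-script samples.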
(09-08-2020, 01:12 PM)Anton Wrote: Also, I think that Hungarian is by no means a Slavic language.

Correct. That's why I circled any exceptions inside a group in red. Hungarian, Albanian and Greek are not Slavic.
(09-08-2020, 12:35 PM)ReneZ Wrote: Let's not distract from the topic of this thread.

The analysis is promising, but I have considerable difficulty in understanding what exactly is being done.
I checked the link to the paper of Pentti Kanerva, but this is only concerned with the hypervectors, not the text analysis.

This piece of text:

Quote:Assign random hypervectors to the twenty six letters of the alphabet and a space character. Parse any large text sample with a sliding three letter window, building a series of trigrams. Combining the trigrams yields a hyperdimensional vector signature for the text sample.

is especially difficult to follow. It does not allow one to understand what the author has done exactly.

Yeah the math is hard to follow, so I wrote the first piece in the spirit of "Here's what I found". I thought most people would not want to slog through all the math. Except you guys, LOL.

Now I'll definitely write the next piece with all the nuts and bolts of hypervector language analysis. It's largely based on an example Kanerva proposed in one of his lectures. Took a lot of code and head scratching to implement.

I knew the algorithm was right when language families began to show up with no input from me as I fed in the data.
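
For anyone trying to picture the quoted three-sentence description, here is a minimal Python sketch of trigram encoding with random hypervectors. It is only an illustration under stated assumptions, not the code behind the plot: the bipolar (+1/-1) vectors, the cyclic rotation used to mark letter position, the element-wise multiplication used to bind a trigram, the dimension of 2000, and cosine similarity for comparison are all assumed choices, and every function name is hypothetical.

```python
import math
import random
import string

ALPHABET = string.ascii_lowercase + " "  # 26 letters plus space
DIM = 2000  # assumed dimension; kept small so pure Python stays fast


def make_item_memory(dim=DIM, seed=0):
    """Assign one random bipolar (+1/-1) hypervector to each symbol."""
    rng = random.Random(seed)
    return {ch: [rng.choice((-1, 1)) for _ in range(dim)] for ch in ALPHABET}


def rotate(vec, k):
    """Cyclically shift a vector by k positions (a simple permutation)."""
    k %= len(vec)
    return vec[-k:] + vec[:-k]


def encode_trigram(tri, item_memory):
    """Bind three letters: rotate by position, then multiply element-wise."""
    a, b, c = (item_memory[ch] for ch in tri)
    return [x * y * z for x, y, z in zip(rotate(a, 2), rotate(b, 1), c)]


def text_signature(text, item_memory):
    """Sum the trigram vectors from a sliding three-letter window."""
    # Characters outside the 27-symbol alphabet are dropped (an assumption).
    text = "".join(ch for ch in text.lower() if ch in ALPHABET)
    total = [0] * len(item_memory[" "])
    for i in range(len(text) - 2):
        for j, v in enumerate(encode_trigram(text[i:i + 3], item_memory)):
            total[j] += v
    return total


def cosine(u, v):
    """Cosine similarity between two signatures."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


memory = make_item_memory()
sig_a = text_signature("the quick brown fox jumps over the lazy dog", memory)
sig_b = text_signature("the lazy dog sleeps while the quick fox runs", memory)
print(cosine(sig_a, sig_b))
```

In this sketch, two samples that share many trigrams end up with signatures pointing in similar directions, while samples with no trigrams in common come out close to orthogonal. Comparing signatures of large samples this way would be one route to the kind of clustering described.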
(09-08-2020, 03:43 PM)dvallis Wrote: Yeah the math is hard to follow

Actually, the math is not difficult to follow.
I was taught linear algebra in university in 1978 and this included everything related to multi-dimensional vectors and their operations. Vectors that are just composed of numbers are the easier part.

What is hard to follow is what exactly you have done. A more detailed explanation is needed in order to be able to validate it.
(09-08-2020, 06:39 AM)DonaldFisk Wrote: But curiously, you find Voynichese in a Caucasian language cluster, along with the North West Caucasian languages, Adyghe and Abaza. I reached a similar conclusion ([link unavailable], [link unavailable]). The closest match I found was with Abkhaz, another North West Caucasian language. The most likely explanation, though, is simply that Abkhaz has only two vowel phonemes and there's no clear vowel branch in Voynichese.

Please note that the first I heard of Caucasian languages was last night, so I am relying on what is out there on the web, not my own expertise -- but I did notice in my reading that at least Ossetic also has no articles. So "a/an" or "the" are expressed through changes internal to the word (e.g. a change in stress, which is written differently to reflect this differing "strength").

I would propose that, along with the low number of vowels, this aspect of the language would also increase the chance of a match using this kind of analysis.

Of course, it would be quite exciting if these were true parallels, but until the match between alphabet and glyphs is made, the possibility of coincidence needs to be seriously considered, as Don has obviously done.

My reading indicated that at one point Ossetic used (at least in an earlier version called Alanic) the Greek alphabet, see

[link unavailable]

which may be more contemporaneous with the carbon dating of the VM.

Finally, my reading also said there was a dialect of Ossetic called Jassic that was spoken in Hungary, due to migration there in the 13th century, that exists as a written language through one known document, dated to 1422, that is in the Hungarian National Szechenyi Library.  It is a glossary of 34 words related to products of agriculture.  That would also be an interesting document to see.  Unfortunately, it seems that this language may be no longer spoken.
Ossetic is by no means obscure; it's spoken by half a million people.

It does not have an especially low number of vowels; in fact, Russian has fewer vowels (6) than Ossetic (7).

The Alanic language is kinda proto-Ossetic, because modern Ossetians descend from the Alans. Northern Ossetia (a region of Russia) is also officially called Alania. The Zelenchuk inscription which I mentioned is in fact Alanic.

Strictly speaking, Jassic is not a dialect of Ossetic; the Jasses were an Alanic tribe which migrated to modern Romania/Hungary at the beginning of the 2nd millennium AD. So their language and the Ossetic language descend from the common origin of Alanic. Presently the Jassic language is dead; modern Jasses speak Hungarian.
(09-08-2020, 06:00 PM)Anton Wrote: Ossetic is by no means obscure; it's spoken by half a million people.

It does not have an especially low number of vowels; in fact, Russian has fewer vowels (6) than Ossetic (7).

The Alanic language is kinda proto-Ossetic, because modern Ossetians descend from the Alans. Northern Ossetia (a region of Russia) is also officially called Alania. The Zelenchuk inscription which I mentioned is in fact Alanic.

Strictly speaking, Jassic is not a dialect of Ossetic; the Jasses were an Alanic tribe which migrated to modern Romania/Hungary at the beginning of the 2nd millennium AD. So their language and the Ossetic language descend from the common origin of Alanic. Presently the Jassic language is dead; modern Jasses speak Hungarian.

Thank you for these observations - I hadn’t noticed discussion about Ossetic vowels - just about the lack of articles. There was some general discussion about at least some members of the Caucasian language family having only a couple of vowels - it is interesting that Ossetic is not one of them.

As for the beginnings of Jassic, I was just echoing the Wiki as to the 13th century origins. I acknowledge this could be incorrect - I certainly have no expertise in this area. Unfortunately the size of the written corpus is likely too small to be useful for this kind of analysis. But I’d still like to see that document - I wonder what alphabet it was written in?

I wonder if the corpus of written Alanic would be large enough to use in an analysis such as this one.

In any case, I’m interested in learning more and grateful the group has members much more able to get good information about this possible direction than myself. Of course, I am happy to lend my help if that would be useful.
(09-08-2020, 07:09 PM)MichelleL11 Wrote: But I’d still like to see that document - I wonder what alphabet it was written in?

It's the so-called "Jassic Glossary", written on the reverse of some legal paper as a memento for a 15c officer to understand Jassic speech. I was not able to find a reproduction online, but I guess the script is Latin, because it was composed by a Hungarian person, or for a Hungarian person with the help of a Jassic-speaking person.

13c is correct; 1237 is the year of the Mongol invasion. They struck the Alans, hence the Jassic migration to Hungary.

As for other things Alanic, there does not seem to be much more: the Zelenchuk inscription in the Greek alphabet (quite short), the Jassic Glossary, and the Alanic phrases in Tsetses' comments to Theogonia (15 c), which are several phrases in the Greek alphabet.

(09-08-2020, 07:09 PM)MichelleL11 Wrote: In any case, I’m interested in learning more and grateful the group has members much more able to get good information about this possible direction than myself.

I'm no expert in this; it's just likely that the Russian internet has much more information on this subject than sources in English.
Here you go.

Full details of my implementation at this link: [link unavailable]