RE: Voynich Manuscript: word vectors and t-SNE visualization of some patterns
voynichbombe > 24-01-2016, 09:59 PM
In the meantime Nick has already taken up the topic on his blog (with his tongue firmly in his cheek).
I thought I'd nevertheless give my understanding of the methodology (and its fallacies, imho) mentioned in the blog post. It should be noted that it is more a rough sketch of what could be done than a conclusive study. One will notice some learned critique in the comments.
Drawing on my meager learnings of AI/neural network training from way back, I'd describe it as follows:
First, a shallow (hence "flat") artificial neural network is trained to find vectors (distance and direction) for sets of certain words, in an attempt to uncover contextual relationships in an unknown language. This should work quite well, but it depends largely on "normalization" of the input text. For example, one would choose only those parts of the text that do not contain any ambiguities.
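To make the "vectors for words" idea a bit more tangible, here is a minimal sketch. Note that it does not train a neural network: it swaps in plain co-occurrence counts, a much simpler stand-in that rests on the same intuition (words that appear in similar contexts get similar vectors). The toy lines are made-up EVA-style strings, not quotes from any transcription, and the function names and the window size are my own assumptions.

```python
from collections import Counter, defaultdict
import math

def cooccurrence_vectors(lines, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for line in lines:
        words = line.split()
        for i, word in enumerate(words):
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up EVA-style lines, for illustration only.
toy_lines = [
    "daiin chedy qokeedy",
    "daiin shedy qokeedy",
    "ol chedy qokain",
    "ol shedy qokain",
]

vecs = cooccurrence_vectors(toy_lines)
same_context = cosine(vecs["chedy"], vecs["shedy"])
different_context = cosine(vecs["chedy"], vecs["daiin"])
print(same_context, different_context)
```

In this toy input "chedy" and "shedy" always sit between the same neighbours, so they come out as more similar to each other than "chedy" is to "daiin". A trained network would produce dense vectors instead of raw counts, but the comparison step looks much the same.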
It is already complicated here, because there are many assumptions that tune the output:
- EVA transcription
- natural language(s) which are culturally extinct
- the copying scribes no longer knew either the language as a whole or many of its details, hence "!" and "*" characters stand for unrecognized/ambiguous glyphs, and lines containing them have to be ignored.
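The line filtering described in the last point could look roughly like this. It is only a sketch: the marker characters are the two named above, while the function name and the sample lines are made up for illustration.

```python
# Markers that, per the description above, flag unrecognized or ambiguous glyphs.
UNCERTAIN_MARKERS = ("!", "*")

def drop_uncertain_lines(lines):
    """Keep only the lines that contain none of the uncertainty markers."""
    return [
        line for line in lines
        if not any(marker in line for marker in UNCERTAIN_MARKERS)
    ]

# Made-up sample lines, for illustration only.
sample = [
    "daiin chedy qokeedy",
    "ol sh*dy qokain",
    "daiin !hedy ol",
    "ol shedy qokain",
]

clean = drop_uncertain_lines(sample)
print(clean)
```

Dropping whole lines is the strict reading of "have to be ignored". It also means that every such decision shrinks an already small corpus, which is one of the ways the assumptions tune the output.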
It rests on even more assumptions at the next stage, "machine translation", which should also work quite well, provided one already knows the meaning of some of the words in the unknown language. In this case it's the "star names" the author pretends (note he uses the term deliberately) to know.
So much from my side for now. I intend to consult a friend of mine who is into autonomous robotics and everything that comes with it. I also tried to invite the author, no luck so far.
While there might be nothing but show stoppers for some of you, I think the approach is very interesting and should be investigated by more knowledgeable peers.
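I don't know exactly how the author does this step, so the following is only my guess at the general idea: one common way to "translate" between two sets of word vectors is to fit a linear map on a few word pairs whose meaning is assumed known (the anchors), then apply that map to other words and pick the nearest vector on the other side. The sketch uses 2-D toy vectors and invented labels throughout, and none of the numbers come from the manuscript or the blog post.

```python
def fit_linear_map(pairs):
    """Least-squares 2x2 matrix W such that src (as a row vector) times W ~ tgt.

    `pairs` is a list of (src, tgt) tuples of 2-D vectors.
    Solves the normal equations W = (X^T X)^-1 X^T Y by hand for the 2-D case.
    """
    sxx = sum(s[0] * s[0] for s, _ in pairs)
    sxy = sum(s[0] * s[1] for s, _ in pairs)
    syy = sum(s[1] * s[1] for s, _ in pairs)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("anchor vectors are collinear; the map is not determined")
    # b = X^T Y
    b = [[sum(s[i] * t[j] for s, t in pairs) for j in range(2)] for i in range(2)]
    return [
        [(syy * b[0][j] - sxy * b[1][j]) / det for j in range(2)],
        [(-sxy * b[0][j] + sxx * b[1][j]) / det for j in range(2)],
    ]

def apply_map(w, v):
    """Multiply the row vector v by the 2x2 matrix w."""
    return (v[0] * w[0][0] + v[1] * w[1][0], v[0] * w[0][1] + v[1] * w[1][1])

def nearest(vec, vocab):
    """Return the label in `vocab` whose vector is closest to `vec`."""
    return min(
        vocab,
        key=lambda k: (vocab[k][0] - vec[0]) ** 2 + (vocab[k][1] - vec[1]) ** 2,
    )

# Hypothetical target-language vectors (labels are placeholders).
target_vocab = {
    "label_a": (0.0, 1.0),
    "label_b": (-1.0, 0.0),
    "label_c": (-1.0, 1.0),
    "label_d": (-1.0, 2.0),
    "label_e": (2.0, 2.0),
}

# Anchor pairs: unknown-language vector -> vector of its assumed meaning.
anchors = [
    ((1.0, 0.0), target_vocab["label_a"]),
    ((0.0, 1.0), target_vocab["label_b"]),
    ((1.0, 1.0), target_vocab["label_c"]),
]

w = fit_linear_map(anchors)
unknown_word_vector = (2.0, 1.0)  # a word that is not among the anchors
guess = nearest(apply_map(w, unknown_word_vector), target_vocab)
print(guess)
```

The weak spot is plain to see: everything downstream depends on the anchors. If the assumed "star names" are wrong, the map is fitted to noise and will still happily return a nearest neighbour for every word.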