I posted earlier about the 'similarity' of Voynich glyphs with each other and I'd like to elaborate further on that. Not sure if it's useful and/or a new thing (I doubt it), but anyway...
For each glyph I calculate the frequency distribution of the preceding glyph: this gives a vector of numbers (adding up to 1) for each glyph. Then I calculate the Euclidean distance between each pair of vectors (the square root of the sum of the squares of the differences). I do the same for the distributions of the following glyph. By construction, each distance is at least zero (identical distributions) and at most SQRT(2) =~ 1.414 (each of the two glyphs always has one and the same neighbour, and those two neighbours are different).
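To make the calculation concrete, here is a minimal Python sketch of it. This is not the code I actually used; the function names and the input format (a list of words, each a list of glyph tokens) are just for illustration. It also assumes that a word boundary counts as a neighbour, written as the pseudo-glyph "_", which is one possible choice among others.

```python
from collections import Counter, defaultdict
from math import sqrt

def neighbour_distributions(words, offset):
    """Map each glyph to the relative frequencies of its neighbour.

    offset=-1 looks at the preceding glyph, offset=+1 at the following one.
    Word boundaries are counted as the pseudo-glyph "_" (an assumption).
    """
    counts = defaultdict(Counter)
    for word in words:
        padded = ["_"] + list(word) + ["_"]
        for i in range(1, len(padded) - 1):
            counts[padded[i]][padded[i + offset]] += 1
    # Normalise each row of counts so that it adds up to 1.
    return {
        glyph: {n: c / sum(ctr.values()) for n, c in ctr.items()}
        for glyph, ctr in counts.items()
    }

def euclidean(p, q):
    """Square root of the sum of squared differences of two distributions."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))

# Toy input, not real Voynich data.
words = [["o", "k", "a"], ["o", "t", "a"], ["q", "o"]]
prev = neighbour_distributions(words, -1)
print(euclidean(prev["k"], prev["t"]))  # 0.0: both are always preceded by 'o'
```

In the toy input 'k' is always preceded by 'o' and 'q' is always word-initial, so their distance_previous hits the SQRT(2) maximum.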
What does this mean in practice? If EVA 'k' and EVA 't' are close on both distances (previous and following) and, say, we find the sequence 'oka' in the text, then we'll probably also find 'ota', and moreover the ratio between the counts of 'oka' and 'ota' will probably be similar to the ratio between the counts of 'k' and 't'.
These are the most similar glyph pairs, as measured by the average of distance_previous and distance_following, considering the whole RF1a-n transcription and excluding rare glyphs (defined as all the glyphs with a frequency lower than or equal to that of EVA 'g'):
If you also want to consider the rare glyphs, add the following pairs:
For reference (excluding rare characters), the two most dissimilar glyphs are, unsurprisingly, 'q' and 'n'. Average distance = 1.39, previous = 1.39, following = 1.38. That is close to the maximum of 1.414, so their distributions are almost orthogonal.
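The ranking itself (average of the two distances, rare glyphs excluded) could be sketched as follows. Again this is only an illustration: `rank_pairs`, its parameters and the toy numbers are made up for the example, and the distributions are assumed to be already available as dictionaries mapping each glyph to its neighbour frequencies.

```python
from itertools import combinations
from math import sqrt

def euclidean(p, q):
    """Square root of the sum of squared differences of two distributions."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))

def rank_pairs(prev_dist, next_dist, glyph_counts, rare_threshold):
    """Return (average, previous, following, glyph_a, glyph_b) rows,
    most similar pair first. Glyphs with a count lower than or equal to
    rare_threshold (e.g. the count of EVA 'g') are skipped."""
    kept = sorted(g for g, c in glyph_counts.items() if c > rare_threshold)
    rows = []
    for a, b in combinations(kept, 2):
        d_prev = euclidean(prev_dist[a], prev_dist[b])
        d_next = euclidean(next_dist[a], next_dist[b])
        rows.append(((d_prev + d_next) / 2, d_prev, d_next, a, b))
    return sorted(rows)

# Toy numbers, not taken from any transcription.
prev_dist = {"k": {"o": 1.0}, "t": {"o": 1.0}, "q": {"_": 1.0}}
next_dist = {"k": {"a": 1.0}, "t": {"a": 1.0}, "q": {"o": 1.0}}
glyph_counts = {"k": 10, "t": 8, "q": 5, "g": 1}

for avg, d_prev, d_next, a, b in rank_pairs(prev_dist, next_dist, glyph_counts, 1):
    print(f"{a}-{b}: average={avg:.2f} previous={d_prev:.2f} following={d_next:.2f}")
```

The first row is the most similar pair and the last row the most dissimilar one.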
Note: the above analysis treats 'ch', 'sh', 'ckh', 'cth', 'cph' and 'cfh' as stand-alone glyphs. This is arbitrary of course (but I think there are good reasons for it). Also, there was some manual work involved in creating the results tables, so please excuse any errors or omissions.
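For anyone wanting to reproduce the glyph splitting, one simple way to treat those sequences as single glyphs is a regular expression that tries the multi-character glyphs before falling back to a single character. This is a hedged sketch, not my actual procedure: `tokenize` is a made-up name, and it assumes the input is a plain EVA word with no transcription markup in it.

```python
import re

# Longer alternatives come first, so 'ckh' is matched as a whole
# before the regex can fall back to the single character 'c'.
GLYPH_RE = re.compile(r"ckh|cth|cph|cfh|ch|sh|.")

def tokenize(word):
    """Split an EVA word into glyph tokens, keeping the benched glyphs whole."""
    return GLYPH_RE.findall(word)

print(tokenize("chedy"))    # ['ch', 'e', 'd', 'y']
print(tokenize("shckhy"))   # ['sh', 'ckh', 'y']
```

Choosing a different set of stand-alone glyphs only means changing the list of alternatives in the pattern.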
When I can, I'll try to get the same data for each section of the VMS.