19-10-2019, 02:38 PM
VMS Folio text matching.
Using cosine similarity on the Takahashi transcript with the 20 most common words removed.
The cosine algorithm and an explanation of cosine similarity came from two linked pages (links hidden to logged-out visitors).
Cosine similarity on sample text generated by text-generator.jar with default values.
I created the sample text with T. Timm's tool, then edited it to match the set of words per Voynich folio, also with the 20 most common words removed.
In the T. Timm image, the high values on the diagonal are data artefacts created when the text was split into folios.
Where a page is compared to itself the value is set to zero.
For visual reference, the folio with 3 words is page 114.
[attachment=3553][attachment=3554]
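For anyone wanting to reproduce this, here is a rough sketch of the comparison described above. It is not the code used for the images. It assumes each folio is already available as a list of words, that the 20 most common words are counted over the whole text rather than per folio, and that similarity is computed on raw word counts. The function names (top_words, cosine, similarity_matrix) are made up for this example.

```python
import math
from collections import Counter


def top_words(folios, n=20):
    """Return the n most frequent words counted across all folios (assumption)."""
    totals = Counter()
    for words in folios.values():
        totals.update(words)
    return {word for word, _ in totals.most_common(n)}


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a folio left with no words matches nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(folios, n_stop=20):
    """Compare every folio with every other folio.

    folios: dict mapping a folio name to its list of words (hypothetical layout).
    Returns the folio names and a square matrix of similarities.
    """
    stop = top_words(folios, n_stop)
    names = list(folios)
    vectors = [Counter(w for w in folios[name] if w not in stop) for name in names]
    size = len(names)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i != j:  # a page compared to itself stays at zero
                matrix[i][j] = cosine(vectors[i], vectors[j])
    return names, matrix
```

Call it as names, matrix = similarity_matrix(folios) and plot matrix as a heat map. Leaving the diagonal at zero keeps the self-matches from swamping the colour scale.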