(15-10-2025, 11:53 PM)ReneZ Wrote: I have a first set of boxes along the direction of writing, but the word boundaries are not yet very accurate. Would you be interested in using these?
Rene very graciously provided me with his data, so I reran my analysis over the weekend. Because Rene's dataset includes the direction of writing, k-means clustering now prefers two clusters rather than three: the "Scribe 4" cluster's aspect ratio has (correctly) risen now that I can correct for non-horizontal word orientations.
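To illustrate why the direction of writing matters for the aspect ratio, here is a minimal sketch in Python. It is not my actual pipeline or Rene's data format: the function names, the corner-list representation of a word box, and the angle in degrees are all assumptions made for the example. The idea is to measure a word's length along the writing direction and its height across it, instead of using an axis-aligned bounding box, which makes a slanted word look shorter and taller than it is.

```python
import math


def rotate(points, angle_deg):
    """Rotate (x, y) points about the origin by angle_deg (helper for the demo)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def oriented_extent(corners, angle_deg):
    """Return (length, height) of a word box measured along and across
    the writing direction. `corners` and `angle_deg` are hypothetical inputs."""
    t = math.radians(angle_deg)
    ux, uy = math.cos(t), math.sin(t)  # unit vector along the writing direction
    vx, vy = -uy, ux                   # unit vector perpendicular to it
    along = [x * ux + y * uy for x, y in corners]
    across = [x * vx + y * vy for x, y in corners]
    return max(along) - min(along), max(across) - min(across)


def aspect_ratio(corners, angle_deg):
    """Length-to-height ratio corrected for the writing direction."""
    length, height = oriented_extent(corners, angle_deg)
    return length / height


def axis_aligned_ratio(corners):
    """Naive width-to-height ratio that ignores the writing direction."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (max(xs) - min(xs)) / (max(ys) - min(ys))


# Made-up example: a 4 x 1 word box written at a 30 degree slant.
slanted = rotate([(0, 0), (4, 0), (4, 1), (0, 1)], 30)
corrected = aspect_ratio(slanted, 30)
naive = axis_aligned_ratio(slanted)
```

For the slanted box, the naive ratio comes out well below the corrected one, which is the same direction of change as the rise in the "Scribe 4" cluster's aspect ratio.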
You could think of this analysis as a fancy, quantitative confirmation of the common observation that the writing associated with Voynich B tends to be smaller than the writing associated with Voynich A. Here are four plots of the same data, one point per bifolio; each plot is labeled differently so you can see how each trait varies across the clusters:
As you can see, my Cluster 2 is mostly consistent with Voynich B, Zandbergen's B and C languages, and Davis's Scribes 2–5, while my Cluster 1 is mostly consistent with Voynich A, Zandbergen's A and Ae languages, and Davis's Scribe 1.
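If you want to put a number on "mostly consistent", one simple option is to cross-tabulate cluster assignments against an existing labeling and count the agreements. The sketch below is only an illustration: the function names and the dictionary inputs are hypothetical, and the bifolio identifiers and labels in the example are made-up placeholders, not real assignments.

```python
from collections import Counter


def cross_tabulate(cluster_of, label_of):
    """Count (cluster, label) pairs over the bifolios present in both mappings."""
    counts = Counter()
    for bifolio, cluster in cluster_of.items():
        if bifolio in label_of:
            counts[(cluster, label_of[bifolio])] += 1
    return counts


def agreement(counts, expected):
    """Fraction of bifolios whose label matches the one expected for their cluster.
    `expected` maps cluster id -> label, e.g. {1: "A", 2: "B"}."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    matches = sum(n for (cluster, label), n in counts.items()
                  if expected.get(cluster) == label)
    return matches / total


# Placeholder data only: four invented bifolios, not real manuscript labels.
cluster_of = {"bifolio_1": 1, "bifolio_2": 1, "bifolio_3": 2, "bifolio_4": 2}
label_of = {"bifolio_1": "A", "bifolio_2": "A", "bifolio_3": "B", "bifolio_4": "A"}

table = cross_tabulate(cluster_of, label_of)
score = agreement(table, {1: "A", 2: "B"})
```

The same two functions work for any of the labelings above (Currier language, Zandbergen's languages, Davis's scribes) by swapping in a different `label_of` and `expected` mapping.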
These plots are generated using all words with a minimum of 10 occurrences across the VMS. I can also do plots for individual common words. Here's what happens when we look solely at
daiin: