Hello Marco,
Thanks for your reply. You have much more experience with the Voynich, and I am a novice at it.
(20-10-2025, 06:29 AM)MarcoP Wrote: If we take the illustrations as indicative of a topic, the change in statistics does not appear to be due to a different topic, but to a different scribe.
Please note that I don’t claim to know whether what the model detects corresponds to a “topic”, a “dialect”, or a “writing style”. What I can confirm is that the model identifies a mixture of these components within many paragraphs. That’s why in the following plots the colors appear softer rather than purely red or blue for those mixed paragraphs.
This indicates that many words are shared between the two distributions, suggesting that the underlying components are not completely distinct. In my opinion, this could point to closely related languages or styles, or to genuine topics, in the sense that certain words are more typical of specific semantic contexts, while still co-occurring with shared vocabulary across the manuscript.
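To make the "softer colors" idea concrete, here is a minimal sketch of how a paragraph's mixture weight could be turned into a plot color. This is not the code behind my plots: the function name `blend` and the choice of red for one component and blue for the other are assumptions made only for illustration.

```python
def blend(weight_a: float) -> str:
    """Map the share of component A in a paragraph to a hex color.

    Assumption (illustrative only): 1.0 is pure red, 0.0 is pure blue,
    and a 50/50 mixture fades to white, so mixed paragraphs look paler.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")

    # How far the paragraph is from an even mixture (0 = even, 1 = pure).
    purity = abs(2.0 * weight_a - 1.0)

    # Pick the color of the dominant component.
    base = (255, 0, 0) if weight_a >= 0.5 else (0, 0, 255)

    # Interpolate each channel between white (255) and the base color.
    channels = [round(255 * (1.0 - purity) + c * purity) for c in base]
    return "#{:02x}{:02x}{:02x}".format(*channels)


print(blend(1.0), blend(0.5), blend(0.0))
```

A paragraph whose words come almost entirely from one distribution gets a saturated color, while one that draws on both gets a pale tint.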
(20-10-2025, 06:29 AM)MarcoP Wrote: Hi quimqu, as I pointed out [link], this appears to overlap with the ongoing research by Lisa Fagin Davis and Colin Layfield. We know that quires were put together in an at least partly arbitrary way; trying to understand more of the order in which the ms was created is extremely interesting.
(20-10-2025, 06:29 AM)MarcoP Wrote: If I understand correctly, Lisa and Colin are going even more in-depth, also considering the stains that affect many of the pages; they are working on bifolio-level reordering, rather than section-level. I expect that their paper will be a major step forward in our understanding of the structure of the text.
Yes, I attended Lisa's presentation. I was surprised and happy to see that, apart from the study of stains on the pages, another topic-related study was ongoing. I’m looking forward to seeing the results and checking how wrong I am. By the way, if Lisa reads this, I’m open to collaborating if they need any kind of work done.
(20-10-2025, 06:29 AM)MarcoP Wrote: We know what is happening here since Currier’s analysis half a century ago: herbal bifolios by different scribes were mixed and bound together. This is detailed in [link] (Table 1). All pages from f1 to f25 (“the first 1/3 of the herbal”) were created by scribe 1. After that point, scribe 2 pages begin to be inter-mixed with scribe 1 pages. If we take the illustrations as indicative of a topic, the change in statistics does not appear to be due to a different topic, but to a different scribe.
Right. My model shows this quite clearly. Even if there are some mixed paragraphs, it has separated the bifolia by topic quite well (sorry for the large image, but if I make it smaller, it becomes unreadable):
[attachment=11759]
Here we can see that all herbal bifolia are "language A" except for bifolia D2 (f26r, f26v, f31r, f31v), E1 (f26r, f26v, f31r, f31v), E2 (f34r, f34v, f39r, f39v), F1 (f41r, f41v, f48r, f48v), F3 (f43r, f43v, f46r, f46v; f46r itself is marked as language A), G2 (f50r, f50v, f55r, f55v), and Q2 (f94r, f94v, f95r, f95v). Then there is our beloved bifolio H1 (f57r, f66v), which is really mixed.
I have grouped all the bifolia in the MS, and here is how my model determines the "languages":
[attachment=11756]
Most bifolia seem to have a unique "language", but you can see that some are quite mixed.
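For readers who want to try the same kind of grouping, here is a rough sketch of one way to roll paragraph-level weights up to the bifolio level. It is an illustration under stated assumptions, not my actual pipeline: the function `bifolio_language`, the word-count weighting, the `mixed_band` threshold, and the sample data (with made-up bifolio labels X1 to X3) are all hypothetical.

```python
from collections import defaultdict


def bifolio_language(paragraphs, mixed_band=0.2):
    """Summarize paragraph-level mixture weights per bifolio.

    paragraphs: iterable of (bifolio_id, n_words, weight_a), where weight_a
    is the share of component "A" that a model assigned to the paragraph.
    mixed_band: hypothetical threshold; a bifolio whose average share lies
    within this distance of 0.5 is labelled "mixed".
    """
    totals = defaultdict(lambda: [0.0, 0])  # bifolio -> [weighted sum, words]
    for bifolio, n_words, weight_a in paragraphs:
        totals[bifolio][0] += weight_a * n_words
        totals[bifolio][1] += n_words

    result = {}
    for bifolio, (weighted, n_words) in totals.items():
        if n_words == 0:
            continue  # nothing to average for an empty bifolio
        share_a = weighted / n_words  # word-count-weighted mean
        if abs(share_a - 0.5) < mixed_band:
            label = "mixed"
        elif share_a > 0.5:
            label = "A"
        else:
            label = "B"
        result[bifolio] = (round(share_a, 3), label)
    return result


# Made-up example data, not taken from the manuscript.
sample = [
    ("X1", 40, 0.9), ("X1", 60, 0.8),
    ("X2", 50, 0.1),
    ("X3", 30, 0.9), ("X3", 30, 0.2),
]
for bifolio, (share, label) in bifolio_language(sample).items():
    print(bifolio, share, label)
```

Weighting by word count keeps a very short paragraph from swinging the label of a whole bifolio; a plain average over paragraphs would be an equally defensible choice.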
(20-10-2025, 06:29 AM)MarcoP Wrote: The CUVA bigram plots [link] (or before) are a good match for your topic line. See bottom of the page.
E.g. the plot for ‘ed’ shows how the zodiac section (gray) appears to gradually shift from Currier A (bottom) towards Currier B (top). It also shows that Quire13 Bio is more strongly “B” than Quire20 Star-Paragraphs. It also shows that Pharma (yellow) is comparable with Herbal A pages (both HA and Pharma are attributed to Scribe 1).
Right. This is already shown by the model (note that I use the quire notation from the EVA file):
[
attachment=11758]
You can see some quires where the language is mixed, suggesting that the bifolia may not correspond exactly to the quire. At first sight, these are quires D, E, F, G, H (?), I (?), and Q.
(20-10-2025, 06:29 AM)MarcoP Wrote: EDIT: another point that I think Rene mentioned in the past. Results based on only a few samples are more noisy and unreliable than results based on larger sets. This could play a role in the fact that Bio/Q13 paragraphs get more consistent results than the much shorter Stars/Q20 paragraphs.
Even though I agree in general terms, I don’t fully agree regarding topic modelling. Topic modelling is intended to be applied at the sentence level. The shorter Stars paragraphs are actually the perfect size, whereas the longer herbal or biological paragraphs might be too long. I suppose (though I can’t confirm it) that the longer paragraphs should contain internal sentences, but since there is no punctuation, we can’t recognize them yet.