The Voynich Ninja
[Article] New Article by Layfield and Davis




New Article by Layfield and Davis - LisaFaginDavis - 16-05-2026

I am thrilled to announce the publication of the first of two articles by myself and Colin Layfield (Computer Science, Univ. of Malta) about the application of Latent Semantic Analysis to the Voynich Manuscript! 



RE: New Article by Layfield and Davis - rikforto - 16-05-2026

Excited to dig in, and wish to specifically thank you for getting it placed where those of us without university access can see it! I certainly never begrudge the other case in the "publish or perish" world, but this is always the more welcome outcome!


RE: New Article by Layfield and Davis - LisaFaginDavis - 16-05-2026

I'm absolutely committed to publishing in open-access journals when possible!


RE: New Article by Layfield and Davis - kckluge - 17-05-2026

(16-05-2026, 04:45 PM) LisaFaginDavis Wrote: I am thrilled to announce the publication of the first of two articles by myself and Colin Layfield (Computer Science, Univ. of Malta) about the application of Latent Semantic Analysis to the Voynich Manuscript!

Maybe it's just because it's "hot off the press", but the link to the PDF goes to a "The file you have requested cannot be found." page.


RE: New Article by Layfield and Davis - Grove - 17-05-2026

I’m curious whether one wouldn’t expect a difference between the last paragraph of one page and the first paragraph of the next, while the first paragraphs of each page in the herbals should be more similar to one another, because I’d expect first paragraphs to cover similar ground about different plants.

I’m biased against the hoax theories, so I find the results promising.


RE: New Article by Layfield and Davis - LisaFaginDavis - 17-05-2026

(17-05-2026, 12:48 AM) kckluge Wrote: Maybe it's just because it's "hot off the press", but the link to the PDF goes to a "The file you have requested cannot be found." page.

You can easily read the HTML version.


RE: New Article by Layfield and Davis - quimqu - 18-05-2026

Thank you Lisa. This work is very interesting.

I think your results may align well with my own study on automatic topic detection using different models (NMF, LDA, and BERTopic), discussed in this [link]. I believe [link] may be especially relevant, as it clearly shows how the detected topics are strongly separated by section.


RE: New Article by Layfield and Davis - RenegadeHealer - 22-05-2026

(16-05-2026, 04:45 PM) LisaFaginDavis Wrote: I am thrilled to announce the publication of the first of two articles by myself and Colin Layfield (Computer Science, Univ. of Malta) about the application of Latent Semantic Analysis to the Voynich Manuscript!

From paragraph 39:

Quote:Another finding that requires explanation is that the faux-Voynich generated text showed a very similar pattern to the Voynich Manuscript results in that they have the same relative “shape,” although the magnitude of the generated Voynich scores was larger (see Figure 5 above).


I notice that the Coherence Test results for Dante’s Inferno, unprocessed and shuffled, are very much within the ranges of those for the faux-Voynich generated text, on all metrics. I suspect the difference isn’t statistically significant. The former is a known meaningful text, while the latter is a known meaningless text. Therefore, unless I’m missing something here, the statistical tests you and your colleagues ran do not reliably distinguish meaningful from meaningless text.

Quote: As the artificial Voynich text follows the “self-citation” hypothesis put forward by [citation], this might have been viewed as evidence in favour of concluding that the manuscript has no semantic content; however, this is offset somewhat by our findings regarding transitions from Language A to Language B (see Figure 8 above). Their explanation of the self-citation hypothesis states that the difference between Language A and B is a natural side effect of the self-citation/modification of the text by the scribe over time.

@Torsten or Andreas will have to clarify this. But my understanding of their notion of a smooth transition from Currier A to Currier B is more along the lines of the work of @DonaldFisk: 2D and 3D principal component analysis dot plots based on the folios’ textual tendencies, with each dot representing a folio, show that certain folios are much more closely related than others, but that together they form a single continuous caterpillar-shaped cloud. In other words, Currier A to B forms a seamless gamut, with data points more evenly spaced than clustered, after folios are arranged according to their statistical similarity.

Quote: The startling difference between A/B in our experiments, however, show this change to be very abrupt and occurs mostly in the same section (Herbal) which would seem an odd place for this to occur. Our experiments showed the opposite in the sensitivity to the transitions between A and B, suggesting there is a real difference between them. It is also worthwhile to note the discourse segmentation scores suggest that some sections contain a higher sense of internal coherence than others (which makes sense, given the illustrative content); if text were truly artificially generated using a method similar to what [citation] proposed, this would seem to be an odd behaviour as we would expect a more consistent sense of coherence going through the text, not what we are witnessing with the Voynich.

I hope I’m just way out of my element here, but I think I see a simple explanation for this.  If my memory of Timm & Schinner 2023 serves me right, you and your colleagues are comparing apples to oranges here. You’ve noted abrupt changes in folios in the order they are currently bound. And you’ve found them in exactly the places where one would expect to find them: where Herbal B bifolios were intercalated into Herbal A quires, by some later owner who couldn’t tell the difference. And, at A-section and B-section boundaries. Much investigation, by @WladimirD and others, has explored what the original binding order likely was. Or the intended binding order…

... if what became the VMS was originally even intended to be bound in the first place! Weren’t you involved in that recent study suggesting that many measurable text properties cluster most strongly within the four pages of each bifolio? There was some speculation, based on this, that the VMS was originally designed and used as a stack of pamphlets, meant to be selected from one at a time, or passed around and shared among a group of people. Think of modern-day professionally published packs of didactic materials, like educational flashcards, premade classroom lesson plans, Dungeons & Dragons quests, instructions for breakout or Jigsaw learning groups, you get the idea. These were a thing in the Middle Ages; there was some mention of a centuries-old box or satchel of Islamic religious educational materials, written on stacked and sequenced sheets of parchment, found in the ruins of a library in Timbuktu, Mali. In the olden days, book binding was laborious, expensive, and not always even practically desirable. Quite a lot of handwritten and hand-copied works on parchment were kept unbound indefinitely, whether for thrift or practicality.

I’m really going out on a slender limb here, but if Bifolio-as-a-functional-unit is robustly supported, and the possibility that the VMS was originally a set of pamphlets becomes more likely, then in theory it should be possible to recreate the original order of the pamphlets (the order in which they were created, the order in which they were stored and used, or both) by quantifying which functional bifolio units are closest to which others statistically. The original order of the bifolios would be even more strongly supported if the relationships of similarity appeared to be a linear chain, with each bifolio (except the first and last) showing peak similarity with only two others. With this chain or cline reconstructed correctly and laid end-to-end, your top-shelf work on the scribal hands, and @Koen G’s extensive work on the imagery themes, and @DonaldFisk’s smooth cloud of dots, should click right into place, and make even more sense. We might even be able to speculate on where in the sequence the missing bifolios probably went, what Currier language they were probably in, and what thematic section they likely belonged to. On the other hand, the absence of so many of them could just as easily prove problematic to a complete reconstruction of the original bifolio order.
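To make the chain idea concrete, here is a rough sketch in Python, under loud assumptions: `sim` is a hypothetical symmetric similarity matrix between bifolio units, the starting unit is simply given, and the numbers are a toy built from made-up positions on a line, not VMS data. The greedy walk is a simple heuristic chosen for illustration, not a method taken from the paper or from anyone's published work.

```python
def greedy_chain(sim, start):
    """Order units by repeatedly stepping to the most similar unvisited unit."""
    n = len(sim)
    order = [start]
    visited = {start}
    while len(order) < n:
        current = order[-1]
        nxt = max(
            (j for j in range(n) if j not in visited),
            key=lambda j: sim[current][j],
        )
        order.append(nxt)
        visited.add(nxt)
    return order


def is_linear_chain(sim, order):
    """True if every interior unit's two most similar units are its chain neighbours."""
    n = len(sim)
    for k in range(1, len(order) - 1):
        i = order[k]
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )
        if set(ranked[:2]) != {order[k - 1], order[k + 1]}:
            return False
    return True


# Toy data (hypothetical): each unit has a hidden position on a line,
# and similarity falls off with distance between positions.
hidden_positions = [2, 0, 3, 1]
toy_sim = [
    [1.0 / (1 + abs(p - q)) for q in hidden_positions]
    for p in hidden_positions
]

recovered = greedy_chain(toy_sim, start=1)  # unit 1 sits at hidden position 0
chain_ok = is_linear_chain(toy_sim, recovered)
```

On this toy input the walk recovers the hidden order and the chain check passes. With real data, missing bifolios, ties, and noisy similarities could all break the greedy walk, so it would be a starting point at best.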

That said, I ween we would not necessarily be (as I don’t think we are now) any closer to answering the question of whether the VMS text encodes any meaningful information.


RE: New Article by Layfield and Davis - Koen G - 22-05-2026

Some parts of the paper are a bit technical for me to grasp entirely, but I understand it as mostly confirming things we know:
  1.  Currier A and Currier B are different.
  2.  Sections are different from one another; for example, we have long known that circular text behaves differently, and Q13 has even lower "variation" than other sections (TTR, entropy).
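For anyone unfamiliar with the two "variation" measures mentioned in the list above, here is a minimal Python sketch. It assumes word-level tokens and unigram Shannon entropy; the post does not say which entropy variant is meant (character-level or conditional entropy are also common in Voynich work), and the function names and toy tokens are illustrative only.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Distinct word types divided by total tokens (0.0 for empty input)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def shannon_entropy(tokens):
    """Unigram Shannon entropy in bits per token (0.0 for empty input)."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )


# Toy token list (placeholder symbols, not a transcription of the manuscript)
toy_tokens = ["a", "b", "a", "c"]
toy_ttr = type_token_ratio(toy_tokens)
toy_entropy = shannon_entropy(toy_tokens)
```

Both values fall as a text becomes more repetitive. Note that TTR also depends strongly on sample length, so sections would need to be compared on equal-sized samples.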

There may be interesting findings wrt generated text, but I am unable to assess those.

I suspect that the upcoming paper will be more revealing: to what extent does the method actually support the singulion hypothesis?


RE: New Article by Layfield and Davis - bi3mw - 22-05-2026

(offtopic)

Quote:In order to compare the similarity of two documents, we calculate the cosine between their vectors (akin to columns in ); the higher the cosine, the greater the similarity.

This confirms, incidentally, that my approach to detecting similarities between lines within the VMS wasn't entirely off the mark. The section boundaries can indeed be clearly identified (especially in Quire 13).
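For readers who have not met the measure in the quoted passage, here is a minimal Python sketch of cosine similarity between two document vectors. The function name and the toy vectors are assumptions for illustration; they are not taken from the paper, and the LSA step that would produce real document vectors is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Toy vectors (hypothetical numbers, not derived from the VMS)
doc_a = [1.0, 2.0, 0.0]
doc_b = [2.0, 4.0, 0.0]  # same direction as doc_a
doc_c = [0.0, 0.0, 3.0]  # orthogonal to doc_a

sim_ab = cosine_similarity(doc_a, doc_b)
sim_ac = cosine_similarity(doc_a, doc_c)
```

Vectors pointing the same way score close to 1 regardless of their length, and orthogonal vectors score 0, which is why the measure suits comparing lines or pages of different sizes.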
