The next paper we're going to review goes back to what is, in my view, a relatively practical use of biocodicology for the Voynich -- the possibility of determining where the cows whose skins were used to make the parchment came from. That's not to say the less practical goals are not useful to talk about -- but I'm going to examine in more depth the approaches that, in my opinion, have the best likelihood of providing immediately useful data: namely, matching the DNA characteristics of the parchment's source animals to a geographic database.
Here is the full cite of the paper:
The York Gospels: a 1000-year biological palimpsest
Matthew D. Teasdale, Sarah Fiddyment, Jiří Vnouček, Valeria Mattiangeli, Camilla Speller, Annelise Binois, Martin Carver, Catherine Dand, Timothy P. Newfield, Christopher C. Webb, Daniel G. Bradley and Matthew J. Collins
R. Soc. open sci. 4: 170988 (October 2017).
The full paper is available online.
Some important aspects of this paper from the point of view of the Voynich are:
i) What is being analyzed? The York Gospels, a 1000-year-old manuscript, so significantly older than the VM based on its carbon dating.
ii) How are the samples being collected? Non-invasively, using PVC eraser crumbs. Enough for ZooMS for all folios, enough for DNA sequencing for eight.
iii) What techniques were performed? ZooMS on almost all folios, next-generation DNA sequencing from the eraser crumbs on eight folios, DNA damage analysis on those same eight folios, and SNP analysis of three folios.
I'm not going to go over all the experiments in close detail, as they are generally applications of techniques we've seen before. But here are some notable conclusions to keep in mind about DNA quality, some of which could be attributable to the eraser crumb sampling technique:
1. On average, about 19% of the DNA reads aligned with a genome from a "source species" of the parchment; however, the values for the eight folios ranged from 0.7% to 51.4%. The authors examined the 51.4% result more carefully, and when quality standards (e.g. keeping only reads that can be mapped with high confidence) were applied to the reads, this fell to 5.6%, a much greater loss than expected.
Such a reduction in "quality reads" could be due to a) selective preservation of certain types of DNA during parchment production (thus, this issue is still possible) or b) selective sampling due to the eraser crumb approach.
Background info needed
Note that DNA is present in the cell wrapped around packaging proteins called histones (the linked page has more information about how histones work, plus a cartoon).
When the DNA is being used to produce protein, the association with histones necessarily is reduced (in fact, this process is thought to be part of the regulation of protein production).
The authors of this paper hypothesize that when the parchment is produced, those DNA sequences that are only loosely associated with histones (called euchromatin) may be selectively destroyed compared to those that are tightly associated with histones (called heterochromatin). Unfortunately, it is the euchromatin sequences (e.g. those being used to produce protein) that are most likely to distinguish one species from another. Using proxy sequences, the difference in coverage for folio 125 was calculated, and the greater coverage for heterochromatin supports this hypothesis (see Supp. Figure 2).
However, as a practical note, there was enough data to make a positive identification of the species for all the folios (except those that had undergone a destructive conservation process in which the folios were covered with silk gauze), using the ZooMS results with confirmation from the DNA results for the eight folios tested (see Figure 1).
Note that because destructive sampling could not be done, the possibility that the selective bias toward repetitive (histone associated) DNA sequences was due to the sampling technique could not be eliminated.
2. But on the positive side, from a DNA end deamination point of view (a chemical sign of ancient DNA degradation), parchment appears to be a better conservation environment than bone, as less of this type of damage was seen than would be expected for DNA of the same age isolated from bone. Interestingly, the human DNA isolated (see discussion below) had less DNA damage than the "source species" DNA, lending further support to the "parchment production is damaging" hypothesis (though it could also simply be that the human DNA is more recent).
3. Of the eight folios where DNA sequencing was done, three had sufficient results to undergo SNP analysis against a modern database of cow sequences. Figures from the paper are reproduced below.
Here's the worldwide association (DNA from the three folios is shown as the triangle, circle, and square), showing a general association with European taurine cattle (Bos taurus).
Here's a close-up view of the European mappings -- colors are geography and the three letter data points are keyed to breed names -- (ANG=Angus, BLO=Blonde d'Aquitaine, BSW=Brown Swiss, CHL=Charolais, GNS=Guernsey, HOL=Holstein, JER=Jersey, LMS=Limousin, MON=Montbeliarde, NOR=Normande, NRC=Norwegian Red Cattle, PMT=Piedmontese, RGU=Red Angus, RMG=Romagnola, SIM=Simmental).
So -- none of the three hit exactly on any modern cow breed -- but that wouldn't really be expected, given the 1000 years of change and the intense amount of cow breeding that has occurred in that time. But it definitely shades toward the known geographic location of Northwest Europe.
Just a quick side note that this kind of data that doesn't overlap with anything exactly is a bit reminiscent of the various Voynich text comparisons to all the world wide languages -- but that could just be the frustrating experiences talking.
4. A couple of quick conclusions for those whose biggest interest is in human sequences: an average of 11% (range of 4.2%-20.8%) of the aligned sequences were human, which was about twice the amount of human DNA found on archived legal documents sampled in the same way. Note that applying quality standards to one set of the best human reads dropped the percentage from 19% to 16% (much less of a drop than seen above for the source species reads), which is more support for the "parchment process degrades DNA" hypothesis. Also, more human DNA was found on the pages containing the "oaths" that were periodically administered -- supporting the idea that the more often a page was referenced, the more human DNA it carries (even when that use was long in the past).
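As a rough check on the numbers quoted in points 1 and 4, the drop after quality filtering can be expressed as the fraction of aligned reads that survive. This is only back-of-the-envelope arithmetic on the percentages given in this post; the function name is my own and nothing here comes from the paper's actual pipeline.

```python
def retained_fraction(before_pct, after_pct):
    """Fraction of aligned reads that survive quality filtering."""
    return after_pct / before_pct

# Percentages as quoted in this post:
# best source-species folio (51.4% -> 5.6%) and best human read set (19% -> 16%).
source_species = retained_fraction(51.4, 5.6)
human = retained_fraction(19.0, 16.0)

print(f"source species reads retained: {source_species:.1%}")
print(f"human reads retained: {human:.1%}")
```

On these figures, roughly one in nine source-species reads survives filtering, against more than four in five human reads, which is the contrast the "parchment process degrades DNA" idea rests on.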
TLDR:
The potentially destructive processes of parchment production loom yet one more time, as it is clear that next-gen DNA sequencing alone likely isn't going to fix that issue completely. However, a non-destructive sampling technique was sometimes able to get enough DNA to do a SNP analysis -- but much more comparative data is likely needed for true geographic pinpointing. Other dated historic parchment of known geographic origin is probably the most relevant source for such comparative DNA sequences.
I believe it is now clearer what kind of additional data would be needed to get a much higher chance of meaningful results for the VM. We need DNA results from as many manuscripts as possible that date to around the VM's carbon dating and whose place of production is known. I know that this is a tall ask -- my impression of manuscripts from the early 1400s is that precise information is often lacking. But it might take less information than presently thought, if there are clear differences between the cow populations of that time period -- which is definitely possible.
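To make that kind of comparison a little more concrete, here is a minimal sketch of matching one parchment sample against reference groups of known origin. It is deliberately much simpler than the SNP analysis in the paper: it just ranks groups by distance between the sample's genotype and each group's allele frequencies. The group names, genotypes, and function names are all invented for illustration.

```python
import math

def allele_frequencies(genotypes):
    """Alternate-allele frequency per SNP for one reference group.

    Each genotype is a list of 0/1/2 alternate-allele counts, one per SNP.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    return [sum(g[i] for g in genotypes) / (2 * n) for i in range(n_snps)]

def distance(sample, freqs):
    """Root-mean-square difference between a sample and a group's frequencies.

    Missing calls in the sample (None) are skipped, since ancient DNA
    rarely covers every SNP.
    """
    pairs = [(s / 2, f) for s, f in zip(sample, freqs) if s is not None]
    if not pairs:
        raise ValueError("sample has no called SNPs")
    return math.sqrt(sum((s - f) ** 2 for s, f in pairs) / len(pairs))

def rank_references(sample, reference_panel):
    """Return (distance, group name) pairs, closest group first."""
    return sorted(
        (distance(sample, allele_frequencies(genos)), name)
        for name, genos in reference_panel.items()
    )

# Hypothetical reference panel: dated parchments of known origin, six SNPs each.
reference_panel = {
    "region_A": [[0, 0, 2, 2, 1, 0], [0, 1, 2, 2, 0, 0]],
    "region_B": [[2, 2, 0, 0, 1, 2], [2, 1, 0, 0, 2, 2]],
    "region_C": [[1, 1, 1, 1, 1, 1], [1, 0, 1, 2, 1, 1]],
}
# Hypothetical parchment sample with two SNPs not recovered.
sample = [0, None, 2, 1, None, 0]

ranking = rank_references(sample, reference_panel)
for dist, name in ranking:
    print(f"{name}: {dist:.3f}")
```

With so few SNPs the ranking means nothing, of course; the point is only that the usefulness of the result depends entirely on how many well-located reference groups are in the panel.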
A piece of hope is the observed phenomenon of "population isolates" in DNA sequences. This is where isolated populations have evolved very specific patterns that are not found elsewhere. There is a paper discussing this phenomenon in relation to human populations and the Basque, which unfortunately argues against such population isolates even existing -- but things may be different for cows. Thus, it is possible that a precise match could be found, which would be the best possible outcome (low probability, but still) -- but we will definitely not know until it is tried.
Along with more data, we need a bit of luck . . .