The Voynich Ninja
[Article] New Paper: Subtle Signs of Scribal Intent...


New Paper: Subtle Signs of Scribal Intent... - asteckley - 26-04-2024

Our recent paper, “Subtle Signs of Scribal Intent in the Voynich Manuscript” may be of interest to those of you analyzing the Voynich text for its possible underlying language and meaning.

The preprint version can be found on arXiv.

Abstract:
“This study explores the cryptic Voynich Manuscript, by looking for subtle signs of scribal intent hidden in overlooked features of the “Voynichese” script. The findings indicate that distributions of tokens within paragraphs vary significantly based on positions defined not only by elements intrinsic to the script such as paragraph and line boundaries but also by extrinsic elements, namely the hand-drawn illustrations of plants.”

The paper is a bit technical, so here is a summary of the more interesting results:
  • Certain word tokens exhibit a propensity to occur (or to be avoided) in certain positions, such as the top line of paragraphs or the beginnings or ends of lines. That is not too surprising, as it has been observed to some extent before.
  • The more surprising finding is that certain word tokens also have a propensity to occur immediately before, or immediately after, the hand-drawn plant illustrations.

The propensities were analyzed in detail to confirm their statistical significance.
A reference catalog of word tokens with propensities was compiled. Only a couple of the tables could be included in the paper due to space limitations, so below are a few more of them.
The whole catalog of tables is included in the Supplemental Online Material.

Note that the entire analysis was restricted to the portion of the manuscript believed to be written by a single scribe (Scribe 1 as identified by Lisa Fagin-Davis).
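
As a rough illustration of what testing a positional propensity for significance can look like, here is a minimal sketch using a one-sided binomial test. This is an assumption for illustration only: the paper's actual statistic is not described in this thread, and the function names and the counts below are hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def affinity_p_value(observed: int, token_total: int, position_share: float) -> float:
    """One-sided p-value for a token appearing in a position at least
    `observed` times, if each of its `token_total` occurrences landed in
    that position independently with probability `position_share`."""
    return binomial_tail(observed, token_total, position_share)


# Hypothetical counts: a token occurs 200 times overall, the position
# holds 10% of all tokens, and the token is seen there 40 times
# (20 would be expected under the null).
p_value = affinity_p_value(observed=40, token_total=200, position_share=0.10)
print(f"p-value: {p_value:.2e}")
```

An aversion (the token avoiding a position) would use the lower tail instead, i.e. P(X <= observed).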


RE: New Paper: Subtle Signs of Scribal Intent... - bi3mw - 26-04-2024

Quote:.....
This implies they must be due to some underlying causal mechanism that is related to the token’s position, although the analysis itself cannot reveal what that mechanism is.

Is it correct that, if there is an underlying mechanism, it suggests generated text rather than content-related text? In other words, shouldn't a text consisting only of meaningful content be free of the characteristics described above?


RE: New Paper: Subtle Signs of Scribal Intent... - asteckley - 26-04-2024

(26-04-2024, 11:13 PM)bi3mw Wrote: Is it correct that, if there is an underlying mechanism, it suggests the generation of text rather than content-related text ? In other words, shouldn't a text consisting only of meaningful content be free of the characteristics described above ?

I think it is possible for the text to be meaningful content and yet still have certain tokens chosen more often (or less often) by the scribe simply because he wants to work around the drawings or margin lines. The scribe might, for example, be inclined to use a shorter word with an equivalent meaning -- or an abbreviated form of a word -- when there is not enough space left to accommodate the original word. This explanation would be strengthened if we observed that word tokens with an affinity for positions just before a drawing (or the right-hand margin) were all shorter than average (and aversive ones were longer than average). But that is not apparent to any significant degree in the current data; looking at the tokens BEFORE a drawing in particular, there is no apparent correlation between the length of the token and its type of propensity (i.e. tilt).

On the other hand, say the content is meaningless (whether generated by some rules that produce statistical structure or not). Then the scribe is not even trying to use words for their meaning or to consistently follow any rules, so he may at any time just select a token of suitable length to achieve aesthetic effects. There too, though, we might expect to see a more distinct correlation with length.

I am in the process of compiling a database of the physical widths of words (and glyphs) and of the spaces between words, between words and drawings, between words and margins, etc. This may provide data for other related analyses of these ideas.
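
The length-versus-tilt comparison discussed above can be sketched as follows. This is only an illustration: the tokens are placeholder strings rather than real transliterations, the tilt labels are hypothetical, and string length is used as a stand-in for glyph count (which it only approximates in most transliteration alphabets).

```python
from statistics import mean


def mean_length_by_tilt(tilts: dict[str, str]) -> dict[str, float]:
    """Group tokens by tilt label and return the mean token length per group."""
    groups: dict[str, list[int]] = {}
    for token, tilt in tilts.items():
        groups.setdefault(tilt, []).append(len(token))
    return {tilt: mean(lengths) for tilt, lengths in groups.items()}


# Placeholder tokens with hypothetical tilt labels for the BEFORE position.
before_tilts = {
    "abc": "affinitive",
    "abcde": "affinitive",
    "abcd": "aversive",
    "abcdef": "aversive",
}
print(mean_length_by_tilt(before_tilts))
```

If space constraints drove the choice of token, one would expect the affinitive group to be clearly shorter than the aversive one; a proper check would also need a significance test on the difference.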


RE: New Paper: Subtle Signs of Scribal Intent... - kckluge - 27-04-2024

This is a really nice piece of work, congratulations on it.

I had done some similar word length distribution analysis on the Bio section ('cause also a single dialect and scribe) that I never wrote up for the forum. It looked at the N-th-but-not-last and last words on lines, although I didn't get around to doing chi^2 values. Because the length distributions are long-tailed to the right, I used the mode rather than the mean for the "average" length -- summary of the results was:

Types: mode at L=5 for N <= 8 and last word on line, at L = 4 for N = 9 & 10 (relatively few lines have more than 11 words)

Tokens: for N = 1 modes at L = 3 and L = 5; for N = 2 to 4 mode at L = 5; for N = 5 to 10 mode at L = 4; last word on line mode at L = 3
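
A minimal sketch of this kind of tabulation (modal word length for the N-th-but-not-last word and for the last word on each line) is shown below. The function name and the sample lines are hypothetical; this counts tokens, and a types version would deduplicate the words at each position first.

```python
from collections import defaultdict
from statistics import multimode


def length_modes_by_position(lines: list[list[str]]):
    """Return (modes per position N for non-last words, modes for last words).

    `lines` is a list of lines, each a list of word strings. Positions are
    1-based. multimode() returns every length tied for most frequent, so a
    bimodal position such as "L = 3 and L = 5" shows up as [3, 5].
    """
    by_position: dict[int, list[int]] = defaultdict(list)
    last_lengths: list[int] = []
    for words in lines:
        if not words:
            continue  # skip empty lines
        for n, word in enumerate(words[:-1], start=1):
            by_position[n].append(len(word))
        last_lengths.append(len(words[-1]))
    modes = {n: multimode(lengths) for n, lengths in sorted(by_position.items())}
    return modes, multimode(last_lengths)


# Placeholder lines, not real transcription data.
sample_lines = [
    ["aaa", "bbbbb", "cc"],
    ["ddd", "eeeee", "ff"],
    ["ggggg", "hhhh", "iii"],
]
position_modes, last_modes = length_modes_by_position(sample_lines)
print(position_modes, last_modes)
```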

Karl


RE: New Paper: Subtle Signs of Scribal Intent... - kckluge - 28-04-2024

By the way, quick clarification -- those experiments used the Currier alphabet rather than EVA. Also, my preliminary results not only need to be checked for statistical significance via chi^2 values (given the sample sizes, I'd be surprised if they're not), but also need to be checked by looking at different manuscript sections and different transcriptions to see if the same effect occurs.

I want to push back slightly on the use of the term "intent" as implying that the scribes are choosing shorter words to make them fit before the drawing element or for some reason achieve a given line length. If one thinks (as I do) that spaces are inserted in some algorithmic way and that the algorithm operates at the level of lines (or interrupted lines), then last words will tend to be shorter for no other reason than because they're the left-over bit of text at the end.
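
The "left-over bit" effect can be illustrated with a toy simulation. This is a sketch under strong assumptions, not a model of any proposed Voynich mechanism: each line has a fixed glyph capacity, word lengths are drawn uniformly from 3 to 7, spaces are ignored, and the final word is simply whatever fits in the remaining capacity.

```python
import random
from statistics import mean


def simulate_lines(n_lines: int, seed: int = 0) -> tuple[float, float]:
    """Return (mean length of non-last words, mean length of last words)."""
    rng = random.Random(seed)
    inner_lengths: list[int] = []
    last_lengths: list[int] = []
    for _ in range(n_lines):
        remaining = rng.randint(30, 40)  # hypothetical glyph capacity of the line
        words: list[int] = []
        while remaining > 0:
            # The last word is truncated to the space that is left.
            length = min(rng.choice([3, 4, 5, 6, 7]), remaining)
            words.append(length)
            remaining -= length
        inner_lengths.extend(words[:-1])
        last_lengths.append(words[-1])
    return mean(inner_lengths), mean(last_lengths)


inner_mean, last_mean = simulate_lines(2000)
print(f"non-last words: {inner_mean:.2f}, last words: {last_mean:.2f}")
```

In this toy model the last word comes out shorter on average without any choice being made by the writer, which is the point of the argument above.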

I think the importance of last words on lines being shorter on average needs to be emphasized, because any theory regarding how the text was generated needs to explain it. Consider something like Rugg's proposed grill method -- I'm not sure why there would be anything special about last words on a line if that was the mechanism.

If it's really the case that average word length shifts based on word position in the line (correcting for the effect of the result regarding words-before-drawing-elements from asteckley's paper), that's potentially a whole other world of hurt for everybody's hypotheses.

Karl


RE: New Paper: Subtle Signs of Scribal Intent... - asteckley - 28-04-2024

(28-04-2024, 06:04 AM)kckluge Wrote: I want to push back slightly on the use of the term "intent" as implying that the scribes are choosing shorter words to make them fit before the drawing element or for some reason achieve a given line length. If one thinks (as I do) that spaces are inserted in some algorithmic way and that the algorithm operates at the level of lines (or interrupted lines), then last words will tend to be shorter for no other reason than because they're the left-over bit of text at the end.

I understand what you are saying there. And that particular interpretation of the word intent did occur to me.

But my use of the word intent was meant at a "higher level", so to speak. At a lower level there might be (or not be) an "intent" to choose a short word. And, as you suggest, that kind of intent isn't really a factor if the feature is a result of an algorithmic process -- or what I refer to as a "simulation device". But at a higher level, "intent" refers to the intention to convey meaningful content vs the intention to make a document that only appears to be meaningful content. (If the latter is the scribe's intent, then one way to do it would be to use an algorithm/simulation-device.)


RE: New Paper: Subtle Signs of Scribal Intent... - asteckley - 29-04-2024

(28-04-2024, 06:04 AM)kckluge Wrote: I think the importance of last words on lines being shorter on average needs to be emphasized, because any theory regarding how the text was generated needs to explain it. Consider something like Rugg's proposed grill method -- I'm not sure why there would be anything special about last words on a line if that was the mechanism.

Actually, I can imagine some version of the grille method that might produce anomalies in the length or choice of the first or last word on a line. I believe Rugg's main result was in showing that a simulation-device (like one inspired by the Cardan grille) could produce various kinds of structure and do so rather easily, but that specific device wasn't required -- it could have been any number of different variants of the one he showed. So the rules of the device might have had some relation to lines of script.

But it is a greater stretch to imagine how or why such a device could produce weirdness related to the drawing intrusions.

On the other hand, for both scenarios -- that of using some simulation device, or of actual meaningful language -- it is plausible that the scribe could simply depart temporarily from any rules or precise meaning that might otherwise be dictated, and just choose a word that is shorter or longer for whatever reason. But one would think that there would be less flexibility to do that in the case of writing meaningful script (because he still has to convey some prescribed meaning).

(28-04-2024, 06:04 AM)kckluge Wrote: If it's really the case that average word length shifts based on word position in the line (correcting for the effect of the result regarding words-before-drawing-elements from asteckley's paper), that's potentially a whole other world of hurt for everybody's hypotheses.

That is true. I know that the statistical tests are robust and reliable and based on solid assumptions, and I know that the "data is the data" and the results have to be accepted for what they say. Nevertheless, even I still find them surprising and hard to accept. Why should the drawings have any such effects on the token choices?

(One can expect, by the way, that given the probabilistic nature of the statistical tests, there will be some small number of false positives that sneak through. For example, a p-value threshold of 0.01 means that a token with no real propensity still has a 1% probability of producing data extreme enough to pass the test through random sampling effects alone. So one can expect that about one such test in 100 will pass when it really shouldn't. Now I didn't discuss how any of that affects the results in the paper because there just wasn't enough room (the paper was limited to 10 pages), and adding that extra layer of complexity to the overall description of the study results didn't seem worthwhile. But suffice it to say, the small number of expected false positives does not change the overall results.)
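
The arithmetic behind the "about one test in 100" estimate can be sketched as below. The number of tests used here is hypothetical (the thread does not say how many were run), and the Bonferroni correction is shown only as one standard way to account for multiple tests, not as something the paper is stated to have used.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests passing by chance if no real effect exists
    in any of them (an upper bound when some effects are real)."""
    return n_tests * alpha


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the chance of any false positive
    across all tests at or below `alpha`."""
    return alpha / n_tests


n_tests = 500  # hypothetical number of token/position tests
alpha = 0.01
print(expected_false_positives(n_tests, alpha))
print(bonferroni_threshold(alpha, n_tests))
```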


RE: New Paper: Subtle Signs of Scribal Intent... - pfeaster - 29-04-2024

Interesting! See also Marco's study of textual patterning around image intrusions.

Thinking further about how positional patterns seem to extend throughout the whole text rather than affecting only its boundaries, I wonder whether factoring in something like plantwardness alongside rightwardness and downwardness would reveal anything worthwhile.


RE: New Paper: Subtle Signs of Scribal Intent... - asteckley - 29-04-2024

(29-04-2024, 02:54 PM)pfeaster Wrote: Interesting! See also Marco's study of textual patterning around image intrusions.

Thank you for pointing out Marco's work!

I wish I had found it prior to writing our paper so I could have included a reference to it.

(I had searched and inquired about anyone trying to analyze the feature, but had only found that of Julian Bunn.)

I should point out, as I did in the paper, that there is a distinct difference between prevalence and propensity. Looking at prevalence (i.e. counts of tokens) is a useful first-pass analysis, but it can also be misleading.
I also wanted to illustrate that difference with a comparison to some results from previous work. Unfortunately, Julian's lists of tokens found in different positions happened to use a portion of the manuscript that did not overlap with the portion we used.
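
To illustrate why prevalence alone can mislead, here is a small sketch contrasting a raw count with an observed-to-expected ratio. The ratio is an illustrative stand-in for propensity, not the statistic used in the paper, and all counts are hypothetical.

```python
def propensity_ratio(count_in_position: int, token_total: int, position_share: float) -> float:
    """Observed count in a position divided by the count expected if the
    token were spread across positions in proportion to their size."""
    expected = token_total * position_share
    return count_in_position / expected


position_share = 0.10  # hypothetical: the position holds 10% of all tokens

# A very common token: high prevalence in the position, but no propensity.
common = propensity_ratio(count_in_position=100, token_total=1000, position_share=position_share)

# A rarer token: lower prevalence, but a strong propensity for the position.
rare = propensity_ratio(count_in_position=20, token_total=40, position_share=position_share)

print(f"common token: count 100, ratio {common:.1f}")
print(f"rare token:   count 20, ratio {rare:.1f}")
```

Ranking by count would put the common token first even though it shows no preference for the position at all.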


RE: New Paper: Subtle Signs of Scribal Intent... - asteckley - 30-04-2024

I prepared a couple of presentations of the findings from this paper in a different form than shown in the paper or the supplemental online material.
They are both attached below.

One shows all of the tokens that were found to have a statistically significant propensity (affinitive or aversive) and how they compare across the five subject positions analyzed in the study. 

The other shows all of the affinitive tokens arranged in a Venn diagram. This one is interesting as it highlights the fact that there are three mutually exclusive sets of tokens: those with affinity for the TOP position, for the FIRST and AFTER positions, and for the BEFORE and LAST positions.   And it further shows that the latter two pairs of positions each share some tokens between them.

There are other interesting patterns seen in that Venn diagram as well.
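
The set relationships described for the Venn diagram can be checked mechanically once the affinitive tokens are listed per position. The sketch below uses placeholder labels in place of real tokens, arranged only to mirror the structure described above; the function name is hypothetical.

```python
from itertools import combinations


def shared_tokens(affinities: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return the non-empty intersections between every pair of positions."""
    shared = {}
    for a, b in combinations(affinities, 2):
        common = affinities[a] & affinities[b]
        if common:
            shared[(a, b)] = common
    return shared


# Placeholder labels, not real tokens from the catalog.
affinities = {
    "TOP": {"t1", "t2"},
    "FIRST": {"f1", "fa1"},
    "AFTER": {"a1", "fa1"},
    "BEFORE": {"b1", "bl1"},
    "LAST": {"l1", "bl1"},
}

overlaps = shared_tokens(affinities)
print(overlaps)

# The three groups described above: TOP, FIRST+AFTER, BEFORE+LAST.
top_group = affinities["TOP"]
line_start_group = affinities["FIRST"] | affinities["AFTER"]
line_end_group = affinities["BEFORE"] | affinities["LAST"]
```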

I am leaving the proposed interpretations and implications of all this for a future publication, but would be interested to hear any thoughts from you all on what it all means.