Very interesting.
Koen, could you create such a graph from your modified high-entropy VM text? It should contain novel bigrams; how are they distributed across spaces?
Reading this thread I think two really useful things could be done:
1. As Marco suggests, have some automated way of identifying or measuring gaps between glyphs. This would help to clarify exactly what we mean by ambiguous spaces: whether there is a continuous variation from "definitely adjacent" to "definitely a word break", or a distinct U-shaped distribution in which very few gaps are intermediate.
2. Regarding Patrick's observation, is there some way of distinguishing between ambiguity which we see (or measure), and ambiguity which the writer/reader would have seen? That is, is "ambiguity" a concept applying only for researchers, or for the script/text itself?
Although I acknowledge this does not help the situation in any way, shape, or form, I do think it is worthwhile to note that this ambiguity could also lie with the scribe and/or encipherer. That is, you would imagine that, depending on how exactly the process worked (through wax tablets or loose sheets or direct from plain text, etc.), a certain "expectation" for word structure would be built in for the scribes and/or the encipherer as well.
Finally, how much consistency for spaces (something that arguably hadn't yet been "built in" to the visual tradition) could truly be expected at this time and place?
This work was definitely useful to strongly connect a spacing certainty level to particular bigrams -- but whether this is a function of the way that alleged plaintext was transcribed or the way the cipher text was transliterated is difficult to say. I would suspect there's a bit of both involved. I am going to keep thinking if anything else concrete can be concluded -- but just wanted to add this admittedly frustrating observation into the mix.
EDIT: In the interim, I see this issue has been brought up by Emma as well, in her part 2.
Do the ambiguous spacings tend to cluster to a particular 'Scribe'?
Does this tend to be an idiosyncrasy of a particular 'Hand'?
(23-05-2022, 07:27 PM)R. Sale Wrote: Do the ambiguous spacings tend to cluster to a particular 'Scribe'?
Does this tend to be an idiosyncrasy of a particular 'Hand'?
That's a very good question. But it might be hard to say for certain without computer imaging.
(23-05-2022, 07:27 PM)R. Sale Wrote: Do the ambiguous spacings tend to cluster to a particular 'Scribe'?
Just counting commas in ZL2a (if I have allocated folios to scribes correctly):
s1 765
s2 749
s3 900
s4 262
s5 61
765 + 749 + 900 + 262 + 61 = 2737 commas
Then Scribe_3 is the winner
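For anyone who wants to repeat the count, here is a minimal sketch of how it could be done. It is not the script used above. It assumes the transliteration marks uncertain spaces with a comma, that each text line starts with a locus tag such as <f1r.1,@P0>, and that lines starting with # are comments. The function name, the folio-to-scribe mapping and the sample lines are made up for illustration. Note that the locus tag itself contains a comma, so anything in angle brackets is stripped before counting.

```python
import re
from collections import Counter

# Assumption: every tag or inline comment is enclosed in <...> without nesting.
TAG = re.compile(r"<[^>]*>")
# Assumption: locus tags look like <f1r.1,...>; foldouts such as f85r1 also match.
LOCUS = re.compile(r"^<(f\d+[rv]\d*)\.")


def count_commas_by_scribe(lines, folio_to_scribe):
    """Count commas (uncertain spaces) per scribe.

    folio_to_scribe is a hypothetical mapping such as {"f1r": "s1"};
    folios missing from the mapping are skipped.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # file-level comment line
        match = LOCUS.match(line)
        if not match:
            continue  # page header or blank line, no text locus
        scribe = folio_to_scribe.get(match.group(1))
        if scribe is None:
            continue
        text = TAG.sub("", line)  # drop the locus tag and inline comments
        counts[scribe] += text.count(",")
    return counts


# Made-up sample lines, not real transliteration data.
sample = [
    "# a comment, with a comma",
    "<f1r.1,@P0>\tabcd.efg,hij.klm",
    "<f75r.1,@P0>\tabc,def,ghi",
]
mapping = {"f1r": "s1", "f75r": "s2"}
result = count_commas_by_scribe(sample, mapping)
```

Raw counts like these depend on how much text each scribe wrote, so they are easier to compare once divided by the total number of spaces for the same scribe.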
This plot shows the % of certain (top) and uncertain (bottom) spaces for each page (r and v processed together, paragraph text only, ZL_ivtff_1r.txt). For each page, % certain + % uncertain + % adjacent (not shown) = 100%. A value of 0 corresponds to pages that are either missing or have no paragraph text.
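To make the three percentages concrete, here is a small sketch of one way they could be computed. It is an illustration under my own assumptions, not necessarily the procedure behind the plot: every character other than "." and "," is treated as one glyph, "." is a certain space, "," is an uncertain space, and the text has no leading, trailing or doubled separators. Each boundary between two consecutive glyphs then falls into exactly one of the three classes. The function name is hypothetical.

```python
def boundary_shares(text):
    """Return (% certain, % uncertain, % adjacent) for one page of text.

    Assumes '.' = certain space, ',' = uncertain space, and one
    character per glyph for everything else.
    """
    certain = text.count(".")
    uncertain = text.count(",")
    glyphs = sum(1 for ch in text if ch not in ".,")
    boundaries = glyphs - 1  # gaps between consecutive glyphs
    if boundaries <= 0:
        # Missing page or no paragraph text: report zeros.
        return (0.0, 0.0, 0.0)
    adjacent = boundaries - certain - uncertain
    return (
        100.0 * certain / boundaries,
        100.0 * uncertain / boundaries,
        100.0 * adjacent / boundaries,
    )


# Made-up example: 6 glyphs, 5 boundaries (1 certain, 1 uncertain, 3 adjacent).
shares = boundary_shares("ab.cd,ef")
```

With the made-up string "ab.cd,ef" this gives 20% certain, 20% uncertain and 60% adjacent.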
Certain spaces show a slightly decreasing trend, because Currier B words are on average slightly longer than Currier A words (longer words = fewer spaces).
Both lines are "smoother" from Q13 onwards: this could be due to the presence of more text in those pages, hence more consistent measures. Also, f26-66 have been arranged in such a way that Currier A/B pages by different scribes are mixed together.
Some consistent sections by the main scribes:
- f1-25 assigned by Lisa to S1,
- f75-84 (Quire 13) to S2,
- f103-114 (109-110 lacking) to S3.
I cannot see any significant differences in addition to the two trends mentioned above.
As always, it is possible that I messed up something: be careful.