Lines interrupted by drawings - Printable Version

RE: Lines interrupted by drawings - MarcoP - 28-09-2019

(26-09-2019, 02:13 PM)RenegadeHealer Wrote: Marco, was the raw data for this study one of the ingredients for the project you and Emma published earlier this year?

Hi RH,
the subject is quite similar to [link], but the data presented here were computed from scratch. The Zandbergen-Landini transcription was indeed used for our paper too.

(26-09-2019, 02:13 PM)RenegadeHealer Wrote: I was immediately reminded of Julian Bunn's demonstration that the different plant-delineated text columns on [link] show different ink density, suggesting that they were written separately, and thus are true columns functionally and methodologically. Your data here supports the idea that text before and after a plant drawing constitutes two separate lines, not one interrupted line. As someone who has looked at a lot of old manuscripts, how precedented was this style of column composition in medieval manuscripts of the time?

I was not familiar with that page, thank you for mentioning it! Bunn credits the idea to Jim Reeds. Personally, I don't find it very convincing, though everything is possible of course. I believe the matter should ideally be assessed by a palaeographer. As far as other manuscripts go, layouts similar to the VMS herbal are not frequent. Some examples were discussed by Koen [link]. I am not 100% sure about the Greek manuscripts, but the Latin and Italian manuscripts he mentions all have lines continuing across illustrations. The exception is Munich Cim. 79, which, however, is entirely arranged in two accurately defined columns.
Something comparable to what Reeds proposes happens in the bilingual page I discussed [link] (two distinct columns at the sides of the image, a single column at the bottom). I believe this case is quite exceptional. Moreover, the text, however long, should be regarded as a later "marginal" addition (this is also the case for some of Koen's Greek manuscripts).

(25-09-2019, 08:06 PM)Koen G Wrote: When a word encounters an image and the text wants to continue on the other side, there are a few options:

1. the word is split
2. the final complete word before the image is squeezed in (or the scribe takes into account his spacing to make it fit well)
3. the final word before an image is truncated

There are probably more options, but some of those are less likely. For example that an image splits text into various columns like in a newspaper.

You say that words before an image are shorter, while words after an image are the same. This rules out (1), because in that case words after the image would also be shorter.

Hi Koen,
I don't think it can be taken for granted that the normal length of words after the image break excludes the possibility that words are sometimes (or even often) split by the image. I computed some statistics on a modern English text with hyphenation ([link]).

I vaguely remember running a similar experiment years ago, with a different text: possibly I posted it here, but I am far from sure. In this text, about 1/6 of the lines end with a hyphen. The result is that the first word of a line is longer on average.
Average lengths:
all 4.224
first 4.632
last 4.194

Filename: L2.jpg Size: 33.17 KB 28-09-2019, 08:08 AM

I believe this counter-intuitive result is due to two reasons:
1. short words can be easily forced into the end of a line, so they tend to appear there instead of at the beginning of the next line;
2. (in English) short words tend to alternate with long words: if short words frequently appear at the end of a line, the first word of the next line is more likely to be long. (This is rather speculative: I think it should be possible to compute numbers confirming or rejecting the idea, but I did not try.)

The histogram of individual word-lengths shows that the first word of a line has both fewer short types and more long types than average. The last word of a line has more short types. Something similar might happen with Voynichese words split across an image.
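A minimal sketch of how the all / first / last averages could be reproduced on any plain-text file with hard line breaks. This is not the script used for the figures above: the function name `line_position_lengths`, the letters-only tokenisation, and the choice to count the fragment before a line-final hyphen as a separate (short) word are all assumptions made for illustration.

```python
import re
from statistics import mean

def line_position_lengths(text):
    """Mean word length for all words, line-first words and line-last words."""
    all_w, first_w, last_w = [], [], []
    for line in text.splitlines():
        # Assumption: keep letters only, so a trailing hyphen is dropped and
        # the fragment before it is counted as the last word of the line.
        words = [w for w in (re.sub(r"[^A-Za-z]", "", t) for t in line.split()) if w]
        if not words:
            continue  # skip blank lines
        all_w.extend(words)
        first_w.append(words[0])
        last_w.append(words[-1])
    return {
        "all": mean(len(w) for w in all_w),
        "first": mean(len(w) for w in first_w),
        "last": mean(len(w) for w in last_w),
    }

# Toy three-line sample (not the text used in the post).
sample = "the cat sat on a remark-\nable mat and slept\nextraordinarily well in it"
stats = line_position_lengths(sample)
print(stats)
```

On a real hyphenated text, the same three lists could also be turned into per-length histograms to compare short and long types at each line position.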

(26-09-2019, 07:31 PM)Emma May Smith Wrote: The occurrences of [s] alone before a mid-break are interesting in relation to line start patterns. We see an increase at the line start of words beginning [s] which is often followed by [a] or [o]. It would be good to learn if these lone [s] matched up with [o] or [a] on the other side.

If so (and I do not know if it is) it would make a stronger link to line start patterns.

Hi Emma,
since there are only 24 occurrences of stand-alone s before an image-break, the statistics are not very reliable. Here the blue bar corresponds to the first character of the word after a stand-alone s followed by an image-break (s <-> X-); the orange bar is the first character for all words after an image break (<-> X-); the yellow bar is the first character in any word (X-); the green bar is the first line-initial character (NewLine X-); the brown bar is the first character after a line-initial s (NewLine sX-).

Filename: _histo1.jpg Size: 48.85 KB 28-09-2019, 08:38 AM

It seems that o- and a- are frequent after s+imageBreak, with ch-, d-, s- also being more frequent in this case than on average (blue vs yellow bars). Due to the limited size of the data-set, the actual occurrences of s<->a that make up 8% of the cases are only 2.

The graph is computed on the corpus of pages with at least one image-break (results on the whole manuscript will be slightly different).
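To give a rough feel for how uncertain a figure like 2 out of 24 is, one option is a Wilson score interval for a binomial proportion. This is only an illustration and was not part of the analysis above; the function name `wilson_interval` and the 95% level (z = 1.96) are choices made here.

```python
from math import sqrt

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for k successes out of n trials."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# 2 occurrences of s<->a among the 24 stand-alone s before an image-break.
low, high = wilson_interval(2, 24)
print(f"{low:.3f} .. {high:.3f}")
```

With so few cases the interval is wide, which is consistent with the caution expressed above about these statistics.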

(26-09-2019, 07:31 PM)Emma May Smith Wrote: Likewise, it would be interesting to know the second glyph in words immediately after mid-breaks which start with [y]. Both [yk] and [yt] are line start patterns.

Filename: histo2.jpg Size: 50.73 KB 28-09-2019, 08:08 AM

yt- and yk- are very frequent after an image break. On the other hand, ych- is common at line start, but after <-> 'ch' appears with no additional prefix. A major difference from line-start patterns is the almost total absence of gallows immediately after <-> (see also the previous graph). Another huge difference is the behaviour of qo-: maybe these image breaks can help us understand more of this elusive prefix?

These results appear somewhat mixed to me: there could be an overlap with line-start patterns, but there are also differences.

RE: Lines interrupted by drawings - Helmut Winkler - 28-09-2019

@ Marco

I was not familiar with that page, thank you for mentioning it! Bunn credits the idea to Jim Reeds. Personally, I don't find it very convincing, though everything is possible of course

I think you are right there

I think in the Yale ms. someone misread h[erba] reuma as greuma and inserted another herba, there is without doubt a u, someone added two i-strokes

RE: Lines interrupted by drawings - RobGea - 30-09-2019

Is this an image-break? The ligature appears to be split. ([link] [ olcthr ])
[link]

Bonus: text on top of image.

RE: Lines interrupted by drawings - MarcoP - 30-09-2019

Thank you, RobGea! As always with the VMS, there are cases that are particularly difficult to classify. The word at the top is an ordinary image-break, but for the detail that the word on the right appears isolated, with no other words above or below.
The one at the bottom is rendered as a single word in ZL, but I agree that there could be a split.

In general, the page is particularly rich in text/image overlaps. Thank you for pointing it out!

RE: Lines interrupted by drawings - bi3mw - 30-09-2019

Thanks @RobGea, I thought it was never written on parts of the illustrations. Your example proves that this assumption was wrong. I'm really happy about that.

RE: Lines interrupted by drawings - -JKP- - 01-10-2019

[link]

RE: Lines interrupted by drawings - MarcoP - 05-10-2019

Following Emma's observations, I have computed a couple of slightly more accurate bigram diagrams. The main differences are:
* Benches were changed into C and S and benched-gallows into K T P F;
* I computed bigram prefixes and suffixes considering the three disjoint sets of image-breaks, line-breaks and "ordinary" space-breaks;
* Here line-breaks do not include what appear to be paragraph-breaks (which appear to be yet another set because of Grove words).

I still only considered data from pages that contain at least one line interrupted by illustrations.
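The preprocessing described in the list above can be sketched roughly as follows. This is not the code behind the diagrams: the exact glyph mapping (ckh, cth, cph, cfh to K, T, P, F and ch, sh to C, S), the `(left_word, break_type, right_word)` input format, and the names `fold` and `affix_percentages` are assumptions, and the four word pairs are toy data rather than transcription output.

```python
from collections import Counter

# Assumed mapping: benched gallows are folded first, then plain benches.
FOLD = [("ckh", "K"), ("cth", "T"), ("cph", "P"), ("cfh", "F"),
        ("ch", "C"), ("sh", "S")]

def fold(word):
    """Replace multi-character EVA benches with single symbols."""
    for eva, symbol in FOLD:
        word = word.replace(eva, symbol)
    return word

def affix_percentages(pairs, n=2):
    """pairs: iterable of (left_word, break_type, right_word).
    Returns (suffix %, prefix %) tables keyed by break type."""
    suff, pref = {}, {}
    for left, kind, right in pairs:
        suff.setdefault(kind, Counter())[fold(left)[-n:]] += 1
        pref.setdefault(kind, Counter())[fold(right)[:n]] += 1

    def to_pct(table):
        # Normalise each break type separately so the sets stay comparable.
        return {kind: {a: 100 * c / sum(cnt.values()) for a, c in cnt.items()}
                for kind, cnt in table.items()}

    return to_pct(suff), to_pct(pref)

# Toy data only; break types are "space", "line" and "image".
pairs = [("qokchy", "space", "chedy"), ("daiin", "image", "ytchor"),
         ("sheam", "line", "shey"), ("chol", "space", "qokeedy")]
suffixes, prefixes = affix_percentages(pairs)
print(prefixes)
```

Folding benches into single symbols means that a two-character prefix such as che- shows up as Ce-, which matches the labels used in the graphs.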

The following diagram illustrates suffixes before the various breaks. Those before an image-break are comparable with suffixes before a line-break: the red and green bars are similar, while the blue bar often stands out. As already discussed, the only major difference is -am, which has a spike at line-end but not before an image-break.

Filename: _SUFF2.jpg Size: 37.54 KB 05-10-2019, 04:35 PM

Prefixes after the various breaks are much harder to compare. For instance, che- and she- (Ce- and Se- in the graph) behave differently in the three cases; yt- and yk- are another, totally different, example.
PS: in terms of Emma's [link], one can observe that words after an image break typically favour a weak initial (e.g. cho- ok- yk- yt-), symmetrically "disliking" a strong initial (qo- and the almost total absence of initial gallows). However, there are exceptions to this hypothetical pattern: in particular da-.

Filename: _PREF2.jpg Size: 38.24 KB 05-10-2019, 04:35 PM

RE: Lines interrupted by drawings - Emma May Smith - 05-10-2019

Marco, while you're right that words starting [che, she] or [yk, yt] behave differently after mid-breaks, they still have the same preference. Both [che, she] are less common after mid-breaks and after line breaks, while both [yk, yt] are more common. They're simply not responding to the same degree, though they are showing the same preference. Naturally, it depends on the statistics whether the smaller changes are robust enough to draw evidence from.

I, too, find the [qo] statistics interesting here. Note that [qo] is much lower after mid-breaks while [ot] is much higher. I know we've discussed in the past the relation between [t] and [k], especially in the presence of [q].

Interestingly, [sa] is more common after both line breaks and mid-breaks, yet words starting [a] are only missing after line breaks and not mid-breaks. I had thought that [sa] was replacing [a] at the start of lines, yet clearly must rethink this idea.

RE: Lines interrupted by drawings - MarcoP - 06-10-2019

(05-10-2019, 06:53 PM)Emma May Smith Wrote: Marco, while you're right that words starting [che, she] or [yk, yt] behave differently after mid-breaks, they still have the same preference. Both [che, she] are less common after mid-breaks and after line breaks, while both [yk, yt] are more common. They're simply not responding to the same degree, though they are showing the same preference. Naturally, it depends on the statistics whether the smaller changes are robust enough to draw evidence from.

Hi Emma, thank you for your comment!
I agree with your observations. In general, the direction of change with respect to space-breaks is the same for image and line-breaks. Among prefixes, qo- and cho- may be the most notable exceptions.

(05-10-2019, 06:53 PM)Emma May Smith Wrote: I, too, find the [qo] statistics interesting here. Note that [qo] is much lower after mid-breaks while [ot] is much higher. I know we've discussed in the past the relation between [t] and [k], especially in the presence of [q].

I do hope that these comparisons can help us understand more about qo-. As you say, ot- is much higher for image-breaks than for space-breaks; ok- is also higher: the two differences add up to about 4%, while the qo- difference is 11%. It seems possible that yt- and yk- (8% difference in total) are also involved in what is happening here.

(05-10-2019, 06:53 PM)Emma May Smith Wrote: Interestingly, [sa] is more common after both line breaks and mid-breaks, yet words starting [a] are only missing after line breaks and not mid-breaks. I had thought that [sa] was replacing [a] at the start of lines, yet clearly must rethink this idea.

If I understand correctly, your idea seems to work for line-breaks, but not for image breaks. I find it difficult to decide whether one should conclude that the idea is not correct, or that line-breaks and image-breaks are significantly different phenomena. Even if directions of change are largely consistent, the prefix histogram seems to me to suggest that the second option is worth considering.

I collected some more data, this time focussing on last-first combinations. Also in this case, I only considered pages containing at least one image-break.

Filename: _LAST_FIRST_PERC2.jpg Size: 33.86 KB 06-10-2019, 12:22 PM

The graph illustrates last-first character combinations across the different kinds of break. For instance, -y.o- is based on a word ending in -y followed by a word starting with o- (e.g. qopchy.otshol). The combinations included are those that are expected to be more frequent across image-breaks: since the set of image-breaks is limited, measuring combinations that are expected to be small is difficult. The total number of couples in each set is:

space-break 11050
line-break 1868
image-break 751

The plotted measure is the difference between the actual number of occurrences for a combination and the expected count (based on the frequency of the suffix and prefix), normalized by the total number of couples in each set. When the suffix of a word and the prefix of the following word are perfectly independent, the measure is 0. From the graph, it is clear that space-breaks produce word couples that differ from the expected much more than the other two cases.
If one adds up the absolute values of the normalized differences, the resulting totals for these combinations are:

space-break 18.0%
line-break 7.1%
image-break 7.6%
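As a rough illustration of the measure described above (observed count minus the count expected under independence, divided by the number of couples in the set), here is a small sketch. The function names and the toy pairs are hypothetical, and this is not the script that produced the graph or the totals.

```python
from collections import Counter

def independence_deviation(pairs):
    """pairs: list of (last_char_of_left_word, first_char_of_right_word)
    for one break type. Returns {(s, p): (observed - expected) / N}."""
    n = len(pairs)
    observed = Counter(pairs)
    suffix = Counter(s for s, _ in pairs)
    prefix = Counter(p for _, p in pairs)
    return {
        # Expected count under independence: count(s) * count(p) / N.
        (s, p): (observed[(s, p)] - suffix[s] * prefix[p] / n) / n
        for s in suffix for p in prefix
    }

def total_abs_deviation(dev, combos=None):
    """Sum of absolute deviations, optionally restricted to chosen combinations."""
    keys = dev if combos is None else combos
    return sum(abs(dev.get(k, 0.0)) for k in keys)

# Toy sets: one fully dependent, one perfectly independent.
dependent = [("y", "o")] * 2 + [("n", "q")] * 2
independent = [("y", "o"), ("y", "q"), ("n", "o"), ("n", "q")]
print(total_abs_deviation(independence_deviation(dependent)))
print(total_abs_deviation(independence_deviation(independent)))
```

Passing `combos` restricts the total to a chosen list of suffix/prefix combinations, in the spirit of the totals above, which only cover the plotted combinations.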

Considering that line-breaks and image-breaks are about one order of magnitude fewer than space-breaks, there could be a greater influence of accidental / random deviation in the smaller sets: yet these sets appear to deviate much less than space-separated couples.
In other words, when word A is separated from word B by a space, there is some kind of "interference" between the two words which results in predictable suffix.prefix combinations (the phenomenon we discussed in [link]). This kind of "interference" does not happen, or is much weaker, across line-breaks and image-breaks.

In the paper, we also discuss how last-first dependence is stronger in Currier B than in Currier A (as already noted by Currier himself). I am now thinking of comparing image-breaks with Currier A, in order to see if any strong parallel can be found: the relative rarity of EVA:q is another trait in common between the two sets.

RE: Lines interrupted by drawings - MarcoP - 24-10-2019

I have tried separately processing data for Currier A and B. Again, I only considered pages that include at least one image-break. The pages are almost equally split between the two languages, so both sets are large enough to be meaningful. I cannot see anything particularly illuminating in the new graphs.

Filename: suff.jpg Size: 58.81 KB 24-10-2019, 05:20 PM

As already observed, with the exception of -am, suffixes before an image break behave similarly to those before a line-break. In B, -dy is also different, being considerably more frequent before an image-break than a line-break: but nothing as dramatic as what happens with prefixes.

Filename: pref.jpg Size: 61.62 KB 24-10-2019, 05:20 PM

The drop in q- after an image-break remains the most striking feature, and it is stronger in Currier A. In Currier B, several other prefixes behave differently after an image-break or a line-break: e.g. ok-, yk-, she-, sa-, so-.

Tentatively, one could say that an image break affects the suffix of the preceding word like line-breaks do. But the prefix on the other side of the break behaves differently. This seems to suggest that the phenomenon is not entirely the same in the two cases, but it is not independent either. For instance, it seems that one can exclude that words are typically split into two halves in one of the two cases but not in the other: this would not explain why suffixes before the two breaks are so similar.