Distinct patterns at the very start of lines in the VMS
quimqu > 4 hours ago
I’ve been looking at the text with simple first-order Markov models (just transitions between consecutive characters), and one thing that stands out is the behavior of the very first bigram in a line, meaning the transition from the first to the second character of the first word.
That initial step depends much more on the line type (paragraph-initial, paragraph-internal, paragraph-final, or lines outside paragraphs) than any later transition does. Once you move further into the word, the distributions of character transitions look much more alike across line types.
So it seems that line beginnings follow their own set of positional rules, much stricter than the transitions inside words. The line body is comparatively homogeneous, while the margins (start and end) show stronger constraints.
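For anyone who wants to try the same kind of comparison, here is a rough sketch of one way to do it, not necessarily how I computed it. It counts the bigram at a given position of the first word of each line, then measures how far apart two line types are with the Jensen-Shannon divergence. Everything in it is an assumption made for illustration: the function names `bigram_counts` and `js_divergence`, the `.` word separator, the line-type labels, and above all the sample lines, which are made-up placeholders and not real transcription data. It also uses raw bigram frequencies rather than conditional transition probabilities, which is a simplification.

```python
from collections import Counter
from math import log2


def bigram_counts(lines, pos, sep="."):
    """Count the bigram starting at index `pos` of each line's first word."""
    counts = Counter()
    for line in lines:
        first_word = line.strip().split(sep)[0]
        bigram = first_word[pos:pos + 2]
        if len(bigram) == 2:  # skip words too short to have this bigram
            counts[bigram] += 1
    return counts


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so in [0, 1]) between two Counters."""
    total_p, total_q = sum(p.values()), sum(q.values())
    if total_p == 0 or total_q == 0:
        raise ValueError("both distributions need at least one observation")
    js = 0.0
    for key in set(p) | set(q):
        # Counter returns 0 for missing keys, so absent bigrams get probability 0.
        a, b = p[key] / total_p, q[key] / total_q
        m = (a + b) / 2
        if a > 0:
            js += 0.5 * a * log2(a / m)
        if b > 0:
            js += 0.5 * b * log2(b / m)
    return js


# Made-up placeholder lines, NOT real transcription data.
lines_by_type = {
    "paragraph_initial": ["pchedy.qokeedy.dal", "tshol.chedy.daiin"],
    "paragraph_internal": ["daiin.chol.shey", "qokain.ol.chedy"],
}

# Compare the two line types at the first bigram (pos 0) and at later positions.
for pos in (0, 1, 2):
    a = bigram_counts(lines_by_type["paragraph_initial"], pos)
    b = bigram_counts(lines_by_type["paragraph_internal"], pos)
    print(pos, round(js_divergence(a, b), 3))
```

With a real transcription, the pattern described above would show up as a clearly higher divergence at position 0 than at later positions. Small samples inflate the divergence on their own, so it is worth comparing against a baseline where the line-type labels are shuffled.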
(Checking data)