zachary.kaelan > 28-01-2026, 11:27 PM
Jorge_Stolfi > 29-01-2026, 07:22 AM
(28-01-2026, 11:27 PM)zachary.kaelan Wrote: common thing in medieval times was to make lines very cleanly justified by any means necessary,
Quote:I hypothesized that this would mess with statistics around drawings and line breaks. ... Words at line end and before text intrusions are shorter on average, by around 0.3 glyphs and 0.5 glyphs, respectively.
Quote:The letter "e" is significantly less common in words around breaks ... The letter "s" is about 3 times as common in words at the start of lines and both before and after drawings, and 63% more common at the end of lines ...
Quote:There are also a lot of short words that seem to almost exclusively appear in these positions, like "sy", "oly", "oldy", "dy", "oky", "ldy", "ary", "lol", etc. My guess is that these words are either abbreviations or nonsense to pad for length.
Quote:This has been [link], but words at the start of lines often have a "d" or "y" for padding if the word starts with "ch" or "sh", creating a lot of words that almost exclusively appear at line start.
Quote:The word "sho" being a strange exception, and also a word that appears primarily at the start of lines.
zachary.kaelan > 29-01-2026, 10:19 PM
(29-01-2026, 07:22 AM)Jorge_Stolfi Wrote: Thus, whenever possible, compute and study the frequency of words rather than glyphs.
(29-01-2026, 07:22 AM)Jorge_Stolfi Wrote: Many lines end in am, and m seems less common at other places along the line.
(29-01-2026, 07:22 AM)Jorge_Stolfi Wrote: The trivial line-breaking algorithm is: if there is enough space on the current line for the next word, write it and continue. If not, break the line there and start a new line at the left rail.
People don't seem to realize, but this banal algorithm results in the first word of each line being longer than average, with the last 1-3 words of each line being shorter than average.
![[Image: 30fJpNI.png]](https://i.imgur.com/30fJpNI.png)
![[Image: T3raHD7.png]](https://i.imgur.com/T3raHD7.png)
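The line-breaking rule quoted above is easy to simulate. What follows is a minimal sketch, not anyone's actual analysis: the function names, the uniform 1 to 9 word-length distribution, and the 40-character line width are all assumptions chosen for illustration. It wraps a stream of random "words" with the greedy rule and reports the mean word length at line start, at line end, and overall.

```python
import random
from statistics import mean

def greedy_wrap(words, width):
    """Greedy breaking: keep adding words while they fit, else start a new line."""
    lines, current, used = [], [], 0
    for w in words:
        extra = len(w) + (1 if current else 0)  # +1 for the separating space
        if current and used + extra > width:
            lines.append(current)
            current, used, extra = [], 0, len(w)
        current.append(w)
        used += extra
    if current:
        lines.append(current)
    return lines

def position_means(lines):
    """Mean word length at line start, at line end, and overall."""
    full = lines[:-1]  # the final line ends with the text, not at the margin
    first = mean(len(line[0]) for line in full)
    last = mean(len(line[-1]) for line in full)
    overall = mean(len(w) for line in full for w in line)
    return first, last, overall

def simulate(n_words=20000, width=40, seed=0):
    # Hypothetical input: word lengths drawn uniformly from 1..9.
    rng = random.Random(seed)
    words = ["x" * rng.randint(1, 9) for _ in range(n_words)]
    return position_means(greedy_wrap(words, width))

if __name__ == "__main__":
    first, last, overall = simulate()
    print(f"first={first:.2f} last={last:.2f} overall={overall:.2f}")
```

The line-start bias arises because a longer word is more likely to be the one that fails to fit and gets pushed to the next line. How large the effect is at line end depends on the line width and on how positions are counted, so the printed numbers are something to inspect rather than a result to quote.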
(29-01-2026, 07:22 AM)Jorge_Stolfi Wrote: These anomalies could be due to the letter e being used mostly in medium-length words like cheedy and used less in both longer and shorter words.
(29-01-2026, 07:22 AM)Jorge_Stolfi Wrote: It makes no sense to have padding at the start of a line. It may be simply that many long words start with y or d.
Jorge_Stolfi > 29-01-2026, 11:30 PM
(29-01-2026, 10:19 PM)zachary.kaelan Wrote: Despite the word "aiin" occurring 529 times in the text, it never occurs at the start of a line, and neither does "ain".
Quote:The words "daiin" and "dain" occur about twice as often as expected at the start of lines.
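One way to make "twice as often as expected" concrete is sketched below. The post does not say how the expectation was computed, so this assumes the simplest baseline: a word's overall share of all tokens multiplied by the number of lines. The function name and the toy lines are invented for illustration and are not taken from any transcription.

```python
from collections import Counter

def line_start_counts(lines, word):
    """Return (observed, expected) counts of `word` as the first word of a line.

    Assumed baseline: expected = (share of all tokens) * (number of lines).
    """
    lines = [line for line in lines if line]  # ignore empty lines
    tokens = [w for line in lines for w in line]
    if not tokens:
        return 0, 0.0
    totals = Counter(tokens)
    firsts = Counter(line[0] for line in lines)
    expected = totals[word] / len(tokens) * len(lines)
    return firsts[word], expected

if __name__ == "__main__":
    # Toy data only, not real line contents.
    toy = [
        ["daiin", "aiin"],
        ["daiin", "chedy"],
        ["shol", "aiin"],
        ["qokey", "daiin"],
    ]
    for w in ("daiin", "aiin"):
        observed, expected = line_start_counts(toy, w)
        print(w, observed, expected)
```

Under this baseline a ratio of observed to expected near 2 would match the "daiin" claim, and an observed count of 0 against a clearly positive expectation would match the "aiin" claim.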
Quote: on some folios like [link] there's definitely a bunch of what objectively looks like filler:
Quote:Maybe it's genuinely a stupid way to pad for length.
Quote:Maybe it's shorthand to mark the start of sentences.
Quote:Maybe it's something arbitrarily decided on for reasons we'll never know.
dashstofsk > 30-01-2026, 10:32 AM
(28-01-2026, 11:27 PM)zachary.kaelan Wrote: The letter "s" is about 3 times as common in words at the start of lines and both before and after drawings, and 63% more common at the end of lines.
dashstofsk > 30-01-2026, 11:11 AM
Jorge_Stolfi > 30-01-2026, 11:14 AM
(30-01-2026, 10:32 AM)dashstofsk Wrote: Also, look at the spline transfer plots of the line positions of words starting with s. In both A and B, words peak at the start of lines, but in A they rise in frequency towards the end.
dashstofsk > 30-01-2026, 11:40 AM
(28-01-2026, 11:27 PM)zachary.kaelan Wrote: "d" is also usable in place of "s" for most of the words where it seems to be used as padding.
dashstofsk > 30-01-2026, 11:56 AM
Jorge_Stolfi > 30-01-2026, 12:49 PM
(30-01-2026, 11:11 AM)dashstofsk Wrote: But also here is a curiosity about words starting with s. The parts of the words that follow seem themselves to be genuine words that appear frequently. This seems to suggest that initial s is something of a nonsense character, that the writer will often start a line by putting down s and then continuing with another word.
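The observation about initial s can be checked mechanically. Here is a rough sketch, assuming a flat list of word tokens. The function name, the `min_count` threshold, and the toy tokens are all hypothetical. For each word starting with the prefix, it strips the prefix and keeps the word only if the remainder is itself attested at least `min_count` times.

```python
from collections import Counter

def prefixed_with_attested_remainder(tokens, prefix="s", min_count=2):
    """Map each prefix-initial word to (its count, count of the remainder).

    Only words whose remainder occurs at least `min_count` times are kept.
    """
    counts = Counter(tokens)
    found = {}
    for word, n in counts.items():
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        # Counter returns 0 for unseen keys without inserting them.
        if rest and counts[rest] >= min_count:
            found[word] = (n, counts[rest])
    return found

if __name__ == "__main__":
    # Toy tokens only, not real counts.
    toy = ["saiin", "aiin", "aiin", "sol", "ol", "ol", "ol", "shedy", "chedy", "s"]
    print(prefixed_with_attested_remainder(toy))
```

Passing `prefix="d"` runs the same check for the suggestion that "d" behaves similarly. A fair comparison would also need a control, such as how often stripping some other initial glyph leaves an attested word, since short frequent words can match by chance.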