quimqu > Yesterday, 11:05 AM
quimqu > Yesterday, 04:18 PM
Jorge_Stolfi > Yesterday, 04:42 PM
(Yesterday, 11:05 AM) quimqu Wrote: I tested both types of artificial line breaking.
Quote:For natural text, I now compute the token-level statistic on the running text itself, without using the artificial lines in the calculation. The artificial lines are used only afterwards, to assign each token to a positional bin such as first, second, middle, penultimate, last. In other words, the line cut no longer changes the token’s measured value. It only changes the group where that value is displayed.
For the Voynich, I do the analogous thing. I reconstruct each real paragraph as one flat token sequence, compute the token-level statistic on that flat paragraph, and only then project each token back to its original line position and line type.
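A minimal Python sketch of the two-step procedure described in the quote, as I read it: the statistic is computed on the flat token sequence, and the line breaks are consulted only afterwards to pick a positional bin. The function names, the tie-breaking for very short lines, and the stand-in statistic (token length) are assumptions for illustration, not quimqu's actual code.

```python
from collections import defaultdict


def position_bin(index, line_length):
    """Map a token's index within its line to a coarse positional bin.

    Assumption: on short lines "first" and "last" take priority over
    "second" and "penultimate".
    """
    if index == 0:
        return "first"
    if index == line_length - 1:
        return "last"
    if index == 1:
        return "second"
    if index == line_length - 2:
        return "penultimate"
    return "middle"


def bin_values_by_position(lines, statistic):
    """Compute a per-token statistic on the flat text, then bin by line position.

    lines:     list of lines, each a list of tokens (one paragraph)
    statistic: function taking the flat token list and returning one value
               per token; it never sees the line breaks
    """
    flat = [tok for line in lines for tok in line]
    values = statistic(flat)  # measured on the running text only
    bins = defaultdict(list)
    k = 0
    for line in lines:
        for i in range(len(line)):
            # The line cut decides only the group, never the value.
            bins[position_bin(i, len(line))].append(values[k])
            k += 1
    return dict(bins)


def token_lengths(flat_tokens):
    """Stand-in statistic (hypothetical): length of each token in characters."""
    return [len(tok) for tok in flat_tokens]
```

With this layout, re-breaking the same tokens into different lines moves values between bins but leaves every individual value unchanged, which is the property the quote describes.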
Quote:I also have the plots divided by section, if anyone finds them interesting.
quimqu > Yesterday, 04:54 PM
(Yesterday, 04:42 PM) Jorge_Stolfi Wrote: To fix that problem, I would discard the first line of the original paragraph, then re-justify the remaining tokens with a new page width (in characters, of course).
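A rough Python sketch of that suggestion, assuming plain greedy line filling with a single space between tokens and the width counted in characters. The function name `rejustify` and the greedy rule are my assumptions; the quoted post does not specify how the re-justification should be done.

```python
def rejustify(lines, width, drop_first_line=True):
    """Re-break a paragraph into new lines of at most `width` characters.

    lines: list of lines, each a list of tokens (the original paragraph)
    width: new page width in characters, counting one space between tokens
    A token longer than `width` is placed on a line of its own.
    """
    kept = lines[1:] if drop_first_line else lines
    tokens = [tok for line in kept for tok in line]

    new_lines, current, used = [], [], 0
    for tok in tokens:
        # Characters needed if this token is appended to the current line.
        needed = len(tok) if not current else used + 1 + len(tok)
        if current and needed > width:
            new_lines.append(current)      # close the full line
            current, used = [tok], len(tok)
        else:
            current.append(tok)
            used = needed
    if current:
        new_lines.append(current)          # tail line, possibly short
    return new_lines


# Made-up sample tokens, not a transcription of any page.
paragraph = [
    ["daiin", "chol"],
    ["shol", "qokeedy", "dar"],
    ["okaiin", "chedy"],
]
print(rejustify(paragraph, width=12))
```

Because the new breaks depend only on token lengths and the chosen width, repeating this with several widths gives artificial lines whose positions are unrelated to the original ones.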
Jorge_Stolfi > Yesterday, 06:29 PM
(Yesterday, 04:54 PM) quimqu Wrote: some final lines have a single word at their rightmost end. Is there any consensus about that use?
quimqu > Yesterday, 07:06 PM
(Yesterday, 06:29 PM) Jorge_Stolfi Wrote: (Yesterday, 04:54 PM) quimqu Wrote: some final lines have a single word at their rightmost end. Is there any consensus about that use?
In what section?
I know of f1r. But page f1r should not be considered Herbal. Like f66r, f85r1, and other isolated pages without figures, it is best placed in an "unknown" section of its own.
I see a right-justified tail line on [link not shown] and f42v. Are there any others?
In the herbal section, I see centered tail lines on f9r, f18r, f22v, f24r, f27r, f31r, f40v, [link not shown] (2x). I don't know what those are. My best guess is just the Scribe trying to make the text look pretty. Since the previous line is always full, they are probably just tail lines. I don't think it will make much difference for the statistics if you include those lines or discard them.
There are four centered or right-justified lines in the Starred Parags section. Three seem to be section titles, and one seems to be a case of the Scribe skipping part of a line and then trying to insert it above the previous line.
All the best, --stolfi
ReneZ > Yesterday, 07:06 PM
(Yesterday, 04:54 PM) quimqu Wrote: By the way: some final lines have a single word at their rightmost end. Is there any consensus about that use?
quimqu > Yesterday, 07:31 PM
(Yesterday, 07:06 PM) ReneZ Wrote: (Yesterday, 04:54 PM) quimqu Wrote: By the way: some final lines have a single word at their rightmost end. Is there any consensus about that use?
Each time someone mentions "consensus" I wonder how that would matter?
Anyway, these cases are clearly identified in all files using the IVTFF format.
nablator > Yesterday, 09:39 PM
(Yesterday, 07:31 PM) quimqu Wrote: I meant whether it is known in other manuscripts.
quimqu > Today, 08:21 AM