bi3mw > 15-11-2025, 01:22 AM
(15-11-2025, 12:32 AM)Bernd Wrote: On a more serious note - I wonder how a meter could explain downwardness beyond a few lines.

tavie > 15-11-2025, 02:39 AM
(14-11-2025, 02:24 PM)Jorge_Stolfi Wrote: I saw @tavie's presentation at the last Voynich day, but that was a lot of detail, and included speculation about the head lines...
(14-11-2025, 03:21 PM)Jorge_Stolfi Wrote: Let me also remind people of the recent finding that the line-breaking algorithm used by scribes (not just on the VMS, but in any language and any epoch, even today) has the side effect of making the first word of each line longer than average, and the last few words shorter than average. This phenomenon alone can have a significant effect on word frequencies at the start of the line...
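To make the line-breaking effect described in the quote above concrete, here is a minimal simulation sketch. It is not the method used in any of the cited work. It assumes a toy model: word lengths drawn uniformly from 1 to 10 characters, a fixed line width of 40, and a greedy rule that starts a new line whenever the next word does not fit. The names `wrap_greedy` and `position_means` are made up for this illustration.

```python
import random

def wrap_greedy(lengths, width):
    """Greedy wrap: keep a word on the current line if it fits, else start a new line."""
    lines, current, used = [], [], 0
    for n in lengths:
        needed = n if not current else used + 1 + n  # +1 for the separating space
        if current and needed > width:
            lines.append(current)
            current, used = [n], n
        else:
            current.append(n)
            used = needed
    if current:
        lines.append(current)
    return lines

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def position_means(n_words=20000, width=40, seed=0):
    """Mean word length overall, at line start, at line end, and for non-initial words."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    lengths = [rng.randint(1, 10) for _ in range(n_words)]  # assumed toy distribution
    lines = wrap_greedy(lengths, width)
    return {
        "overall": mean(lengths),
        "first": mean(ln[0] for ln in lines),
        "last": mean(ln[-1] for ln in lines),
        "rest": mean(n for ln in lines for n in ln[1:]),
    }

if __name__ == "__main__":
    print(position_means())
```

In this toy model the line-initial mean comes out above the overall mean, because a long word is more likely to be the one that fails to fit. The sketch only reports the line-final mean; how strongly the end of the line is affected depends on the assumed width and length distribution, so it would need to be checked against real text.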
R. Sale > 15-11-2025, 05:04 AM
Jorge_Stolfi > 15-11-2025, 06:15 AM
(15-11-2025, 02:39 AM)tavie Wrote: I thought this was a theory proposed by Elmar Vogt and Ger Hungerink in 2012...but I didn't think it was proven as an explanation.
Quote:But I had the impression only printed works were counted (is that wrong?), and it would be more interesting to see assessments of manuscripts, especially where longer words could in theory squeeze in at the end of the line by being abbreviated.
Quote:It would also be interesting to separate out paragraph start words (we see some really long ones, as well as short ones like "pol") from the line start pack since they are almost certainly not being wrapped round.
Quote:Another point I'd make is that while the word wrap concept of longer words not fitting at line end is a potential explanation for the word length discrepancies, I don't see it as being a comprehensive potential explanation for the glyph discrepancies. It might have an impact, but I don't see the evidence for it being the primary explanatory factor for why word types at line start are different.
Quote:We would expect the word types or glyph clusters "missing" at line end to be excessively popular at line start. In many cases, we do not see this. Something else is going on.
pfeaster > 18-11-2025, 11:34 AM
(15-11-2025, 12:21 AM)Jorge_Stolfi Wrote: Wow, Patrick's paper is quite a big meal to digest.
(15-11-2025, 12:21 AM)Jorge_Stolfi Wrote:
1. Remove the head lines of parags. Among other things, that line is likely to have special contents (like plant names and aliases), which could well imply different word frequencies and positional patterns, and hence the same for characters and digraphs.
(15-11-2025, 12:21 AM)Jorge_Stolfi Wrote: One problem to watch for here is that parag breaks are sometimes not obvious. In the Stars section, in particular, I suspect that there is a run of 5-6 parags that were joined by the Scribe (a newbie?) into a single parag, before he returned to the normal format.
(15-11-2025, 12:21 AM)Jorge_Stolfi Wrote: 2. Limit the analysis to just one of the sections with substantial running text -- Herbal-A, Herbal-B, Bio, and Stars. If the anomalies are real, they should be noticeable, and probably even stronger, in one of those sections. If they turn out to be absent or different in other sections, that by itself would be important information.
(15-11-2025, 12:21 AM)Jorge_Stolfi Wrote: 3. Try to identify the words that are responsible for the anomalies. Maybe I have misread the tables in the paper, but among the Sh/Ch word pairs, some seem to have greater positional bias than others.
(15-11-2025, 12:21 AM)Jorge_Stolfi Wrote: Could it be, for example, that the most leftward member of the pair often occurs after a long word, while the other member more often occurs after a short one? That could perhaps explain the positional anomaly as a consequence of the line-breaking word-length bias.
Or maybe the two members of the pair can get fused or split at different rates in the transcription, so that some of the Sheols are actually Sheoldy while most Cheols are indeed Cheols. I can't think how this possible confounding factor could be addressed. Although this may be one case
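The question quoted above, whether one member of a Sh/Ch pair tends to follow longer words than the other, could be checked along these lines. This is only a sketch under stated assumptions: each line of text is a string of whitespace-separated words, word length is counted in transcription characters rather than glyphs, and the function name `mean_predecessor_length` and the sample lines are invented for illustration, not taken from any transcription.

```python
def mean_predecessor_length(lines, word):
    """Mean length of the token immediately before `word` on the same line.

    Returns None if `word` never occurs with a predecessor.
    """
    lengths = []
    for line in lines:
        words = line.split()
        # Pair every token with the one that follows it on the same line.
        for prev, cur in zip(words, words[1:]):
            if cur == word:
                lengths.append(len(prev))
    return sum(lengths) / len(lengths) if lengths else None

if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = ["qokeedy sheol ar cheol", "chedy sheol or cheol daiin"]
    for w in ("sheol", "cheol"):
        print(w, mean_predecessor_length(sample, w))
```

Line-initial occurrences have no predecessor on the same line and are skipped, so this measures only the mid-line context of each word.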
Kaybo > 24-11-2025, 08:41 AM
(18-11-2025, 11:34 AM)pfeaster Wrote: (15-11-2025, 12:21 AM)Jorge_Stolfi Wrote: Wow, Patrick's paper is quite a big meal to digest.
I'd recommend the [link] over the earlier/longer blog post.
dashstofsk > 24-11-2025, 02:39 PM
(15-11-2025, 06:15 AM)Jorge_Stolfi Wrote:
Quote:Another point I'd make is that while the word wrap concept of longer words not fitting at line end is a potential explanation for the word length discrepancies, I don't see it as being a comprehensive potential explanation for the glyph discrepancies. It might have an impact, but I don't see the evidence for it being the primary explanatory factor for why word types at line start are different.
If the tokens at line start are longer than average, longer word types must have higher frequencies in that position than elsewhere, and the opposite must be true for shorter word types. For instance, the frequency of qokeedy should be higher at line start than at mid-line, while the opposite should be true for ar. But indeed it would be important to verify and quantify these differences for the VMS.
Quote:We would expect the word types or glyph clusters "missing" at line end to be excessively popular at line start. In many cases, we do not see this. Something else is going on.
Indeed, it is not certain that these differences in word type frequencies will cause differences in character frequencies, but it is certainly possible. And even if the line breaking length bias turns out to be insufficient to explain the line-start anomalies, we would have to subtract its effects in order to understand the real anomalies and infer their causes.
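One way to quantify the kind of difference mentioned above (qokeedy versus ar at line start and mid-line) is to compare relative frequencies by position. The sketch below assumes each line is a string of whitespace-separated words and treats "mid-line" as every token that is neither first nor last. The function names and the sample lines are hypothetical, and the numbers they produce say nothing about the actual VMS.

```python
from collections import Counter

def positional_frequencies(lines):
    """Relative frequencies of word types at line start and at mid-line."""
    start, middle = Counter(), Counter()
    for line in lines:
        words = line.split()
        if not words:
            continue
        start[words[0]] += 1
        middle.update(words[1:-1])  # exclude first and last token

    def normalise(counter):
        total = sum(counter.values())
        return {w: c / total for w, c in counter.items()} if total else {}

    return normalise(start), normalise(middle)

def start_to_middle_ratio(lines, word):
    """Line-start frequency divided by mid-line frequency; None if absent mid-line."""
    start, middle = positional_frequencies(lines)
    if middle.get(word, 0) == 0:
        return None
    return start.get(word, 0) / middle[word]

if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "qokeedy ar chedy ar daiin",
        "qokeedy chol ar qokeedy dal",
        "shedy ar ar chol daiin",
    ]
    for w in ("qokeedy", "ar"):
        print(w, start_to_middle_ratio(sample, w))
```

A ratio above 1 means the word type is relatively more common at line start than mid-line. For real data the head lines of paragraphs would probably need to be excluded first, as suggested earlier in the thread, and the ratios compared against what a pure line-breaking model predicts.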