asteckley > 26-04-2024, 09:48 PM
bi3mw > 26-04-2024, 11:13 PM
Quote:.....
This implies they must be due to some underlying causal mechanism that is related to the token’s position, although the analysis itself cannot reveal what that mechanism is.
asteckley > 26-04-2024, 11:44 PM
(26-04-2024, 11:13 PM)bi3mw Wrote: Is it correct that, if there is an underlying mechanism, it suggests the generation of text rather than content-related text? In other words, shouldn't a text consisting only of meaningful content be free of the characteristics described above?
kckluge > 27-04-2024, 08:39 AM
kckluge > 28-04-2024, 06:04 AM
(27-04-2024, 08:39 AM)kckluge Wrote: This is a really nice piece of work, congratulations on it.
I had done some similar word length distribution analysis on the Bio section ('cause also a single dialect and scribe) that I never wrote up for the forum. It looked at the N-th-but-not-last and last words on lines, although I didn't get around to doing chi^2 values. Because the length distributions are long-tailed to the right, I used the mode rather than the mean for the "average" length -- summary of the results was:
Types: mode at L=5 for N <= 8 and last word on line, at L = 4 for N = 9 & 10 (relatively few lines have more than 11 words)
Tokens: for N = 1 modes at L = 3 and L = 5; for N = 2 to 4 mode at L = 5; for N = 5 to 10 mode at L = 4; last word on line mode at L = 3
Karl
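For anyone wanting to reproduce this kind of tally, here is a minimal sketch of the mode-by-position count described above. It is not Karl's actual script: the function name `length_modes_by_position`, the `by_type` flag, and the three toy lines are all assumptions made for illustration, and a real run would need a proper transliteration of the Bio section as input.

```python
from collections import Counter, defaultdict


def length_modes_by_position(lines, max_n=10, by_type=False):
    """Return {position: [modal word lengths]} for each position in the line.

    Positions 1..max_n cover N-th-but-not-last words; the final word of
    every line is counted only under the key "last". With by_type=True each
    distinct word is counted once per position (types rather than tokens).
    """
    counts = defaultdict(Counter)
    seen = defaultdict(set)

    def tally(pos, word):
        if by_type:
            if word in seen[pos]:
                return
            seen[pos].add(word)
        counts[pos][len(word)] += 1

    for line in lines:
        words = line.split()
        if not words:
            continue
        for n, word in enumerate(words[:-1], start=1):
            if n <= max_n:
                tally(n, word)
        tally("last", words[-1])

    modes = {}
    for pos, counter in counts.items():
        top = max(counter.values())
        # Keep every length tied for the top count, so bimodal cases show up.
        modes[pos] = sorted(length for length, c in counter.items() if c == top)
    return modes


# Toy input only (hypothetical lines, not real manuscript data).
toy_lines = [
    "qokeedy shedy ol",
    "daiin chedy qokain ar",
    "shedy qokedy dal",
]
modes = length_modes_by_position(toy_lines)
```

Returning a list of lengths per position, rather than a single value, is deliberate: it lets a two-mode result like the one reported for N = 1 appear as a tie instead of being hidden.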
asteckley > 28-04-2024, 04:58 PM
(28-04-2024, 06:04 AM)kckluge Wrote: I want to push back slightly on the use of the term "intent" as implying that the scribes are choosing shorter words to make them fit before the drawing element or for some reason achieve a given line length. If one thinks (as I do) that spaces are inserted in some algorithmic way and that the algorithm operates at the level of lines (or interrupted lines), then last words will tend to be shorter for no other reason than because they're the left-over bit of text at the end.
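The "left-over bit" argument can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the quoted post does not specify any algorithm, so the uniform choice of word lengths 3 to 7, the fixed 40-glyph line, and the names `split_line` and `mean_lengths` are all invented here for illustration.

```python
import random


def split_line(total_glyphs, rng, lengths=(3, 4, 5, 6, 7)):
    """Chop a line of total_glyphs glyphs into words.

    Each word length is drawn at random, but the final word is truncated to
    whatever glyphs remain, so it is simply the remainder of the line.
    """
    words = []
    remaining = total_glyphs
    while remaining > 0:
        n = min(rng.choice(lengths), remaining)
        words.append(n)
        remaining -= n
    return words


def mean_lengths(num_lines=2000, total_glyphs=40, seed=0):
    """Return (mean length of non-final words, mean length of final words)."""
    rng = random.Random(seed)
    inner, last = [], []
    for _ in range(num_lines):
        words = split_line(total_glyphs, rng)
        inner.extend(words[:-1])
        last.append(words[-1])
    return sum(inner) / len(inner), sum(last) / len(last)


inner_mean, last_mean = mean_lengths()
```

In this toy model nobody is choosing short words at line end; the final word comes out shorter on average purely because it is cut off by the end of the line. Whether anything like this operated in the manuscript is exactly the open question.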
asteckley > 29-04-2024, 05:27 AM
(28-04-2024, 06:04 AM)kckluge Wrote: I think the importance of last words on lines being shorter on average needs to be emphasized, because any theory regarding how the text was generated needs to explain it. Consider something like Rugg's proposed grille method -- I'm not sure why there would be anything special about last words on a line if that was the mechanism.
(28-04-2024, 06:04 AM)kckluge Wrote: If it's really the case that average word length shifts based on word position in the line (correcting for the effect of the result regarding words-before-drawing-elements from asteckley's paper), that's potentially a whole other world of hurt for everybody's hypotheses.
That is true. I know the statistical tests are robust and reliable and based on solid assumptions, and I know that the "data is the data" and the results have to be accepted for what they say. Nevertheless, even I still find them surprising and hard to accept. Why should the drawings have any such effects on the token choices?
pfeaster > 29-04-2024, 02:54 PM
asteckley > 29-04-2024, 03:34 PM
(29-04-2024, 02:54 PM)pfeaster Wrote: Interesting! See also Marco's study of textual patterning around image intrusions (link not shown).
asteckley > 30-04-2024, 02:21 AM