bi3mw > 11-12-2025, 01:58 AM
(11-12-2025, 01:35 AM)ReneZ Wrote: (11-12-2025, 01:14 AM)bi3mw Wrote: The top 20 chunk pairs
Looking at the list, I guess that this applies to start-end chunks that also have something in between, correct?
Jorge_Stolfi > 11-12-2025, 05:05 AM
nablator > 11-12-2025, 11:33 AM
(11-12-2025, 01:14 AM)bi3mw Wrote: The top 20 chunk pairs line by line
dashstofsk > 11-12-2025, 02:52 PM
(11-12-2025, 01:14 AM)bi3mw Wrote: The top 20 chunk pairs line by line
bi3mw > 11-12-2025, 06:01 PM
(11-12-2025, 05:05 AM)Jorge_Stolfi Wrote: What is the result of the same analysis applied to English or Latin text (formatted as filled parags)?
All the best, --stolfi
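One way to prepare such a control text is sketched below: re-break plain English or Latin prose into filled lines of a fixed width, then split each line into words. The function name `filled_lines` and the default width are assumptions for illustration, not something specified in the thread.

```python
import textwrap

def filled_lines(text, width=60):
    """Re-break running prose into filled lines and return each line as a list of words.

    `width` is a hypothetical line width in characters; choose it to resemble
    the line lengths of the text being compared against.
    """
    lines = textwrap.wrap(
        text,
        width=width,
        break_long_words=False,   # keep words whole
        break_on_hyphens=False,   # do not split hyphenated words
    )
    return [line.split() for line in lines]

if __name__ == "__main__":
    for words in filled_lines("one two three four five six", width=10):
        print(words)
```

The resulting word lists can be fed to the same line-start/line-end counting as the transliteration, so that any difference comes from the text rather than from the counting procedure.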
(11-12-2025, 11:33 AM)nablator Wrote: If you want to do statistics on lines you should have only lines in your data, not labels, circular texts, etc.: the "P" lines of the IVTFF transliterations.
Actually "Pt" lines are on the same physical line as the previous paragraph line.
In some cases free-floating words look more like labels than lines to me:
<f68r1.5,@Pb> yky
<f68r1.6,+Pb> dary
<f68r1.7,+Pb> chkchykoly
All single free-floating words should probably be removed:
<f84v.42,@Pb> okar
<f84v.43,+Pb> ydairol
<f84v.44,+Pb> ychckhy
<f84v.45,+Pb> dshedy
<f84v.46,@Pb> okchdy
<f84v.47,+Pb> solchey
<f84v.48,+Pb> dairoldy
<f84v.49,+Pb> darchy
<f84v.50,+Pb> yskhy
<f84v.51,+Pb> ochedy
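The filtering discussed above can be sketched roughly as follows: keep only loci whose type starts with "P" and drop loci that contain a single word. The regular expression, the function name `paragraph_lines` and the `min_words` threshold are assumptions for illustration. The sketch only handles locus headers of the form shown in the examples above, and it ignores inline comments and other markup that a real transliteration file may contain.

```python
import re

# Assumed header shape, e.g. "<f84v.42,@Pb>": page, locus number,
# one position character, then a two-character locus type.
LOCUS_RE = re.compile(
    r"^<(?P<page>[^.>]+)\.(?P<num>\d+),(?P<pos>.)(?P<type>[A-Za-z][A-Za-z0-9])>\s*(?P<text>.*)$"
)

def paragraph_lines(lines, min_words=2):
    """Return (page, locus number, words) for "P" loci with at least `min_words` words."""
    kept = []
    for line in lines:
        m = LOCUS_RE.match(line.strip())
        if m is None:
            continue  # page headers, comments, blank lines
        if not m.group("type").startswith("P"):
            continue  # labels, circular text, etc.
        # Assumption: "." and "," both separate words in the transliteration.
        words = [w for w in re.split(r"[.,\s]+", m.group("text")) if w]
        if len(words) < min_words:
            continue  # single free-floating word
        kept.append((m.group("page"), int(m.group("num")), words))
    return kept

if __name__ == "__main__":
    sample = [
        "<f68r1.5,@Pb> yky",        # from the examples above: one word, dropped
        "<f0x.1,@P0> aaa.bbb.ccc",  # made-up placeholder locus, kept
        "<f0x.2,@L0> ddd.eee",      # made-up label locus, dropped
    ]
    print(paragraph_lines(sample))
```

Whether "Pt" loci should be merged with the preceding paragraph line, as noted above, is left open here; the sketch treats every locus independently.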
(11-12-2025, 02:52 PM)dashstofsk Wrote: This isn't the way to do it. You need to display the ratio of observed occurrences to the number that would be expected if the start and end 'chunks' were distributed randomly. If it then appears that there is general parity, that would indicate that there is nothing significant.
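A minimal sketch of that observed/expected ratio, assuming each line has already been split into a list of chunks and that "distributed randomly" means start and end chunks are independent of each other. Under that assumption the expected count for a pair is (lines starting with the first chunk) × (lines ending with the second chunk) / (total lines). The function name `observed_expected` is hypothetical.

```python
from collections import Counter

def observed_expected(lines):
    """Map each (start, end) chunk pair to (observed, expected, ratio).

    `lines` is a list of chunk lists; lines with fewer than two chunks are skipped.
    Expected counts assume start and end chunks are independent.
    """
    pairs = [(chunks[0], chunks[-1]) for chunks in lines if len(chunks) >= 2]
    n = len(pairs)
    pair_counts = Counter(pairs)
    start_counts = Counter(start for start, _ in pairs)
    end_counts = Counter(end for _, end in pairs)

    result = {}
    for (start, end), observed in pair_counts.items():
        expected = start_counts[start] * end_counts[end] / n
        result[(start, end)] = (observed, expected, observed / expected)
    return result

if __name__ == "__main__":
    toy = [["a", "x", "b"], ["a", "y", "b"], ["c", "z", "d"], ["c", "z", "b"]]
    for pair, (obs, exp, ratio) in sorted(observed_expected(toy).items()):
        print(pair, obs, round(exp, 2), round(ratio, 2))
```

Ratios close to 1 across the board would correspond to the "general parity" case. With small counts the ratio is noisy, so a pair seen only once or twice should not be read as significant on the ratio alone.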
Jorge_Stolfi > 11-12-2025, 07:16 PM
(11-12-2025, 06:01 PM)bi3mw Wrote: I'll see where I can get a sufficiently large corpus [of English]
a.zip (Size: 117.96 KB / Downloads: 2)
R. Sale > 11-12-2025, 07:17 PM
Jorge_Stolfi > 11-12-2025, 07:41 PM
(11-12-2025, 07:17 PM)R. Sale Wrote: Seems to me that we already know that 'normal' text does not treat lines as functional units.
tavie > 11-12-2025, 07:56 PM
(11-12-2025, 07:41 PM)Jorge_Stolfi Wrote: The question that remains is how much of the "LAAFU" phenomenon on the VMS can be explained by this effect.
nablator > 11-12-2025, 08:50 PM