quimqu > 18-04-2026, 11:32 PM
(18-04-2026, 01:36 AM)Jorge_Stolfi Wrote: The Scribe had a handful of abbreviations that he could use to help avoid bad breaks or rail overflow. Changing iin to m was one.
Jorge_Stolfi > 19-04-2026, 01:03 AM
(18-04-2026, 11:32 PM)quimqu Wrote: So overall, expanding m to iin does not make the structure more visible in these models, and tends to weaken it.
quimqu > 19-04-2026, 06:29 PM
(19-04-2026, 01:03 AM)Jorge_Stolfi Wrote: The test I was thinking of was: in each paragraph,
- Replace m by iin everywhere.
- Join lines into a single token stream.
- Feed that stream to the SLA with different rail width W.
- Measure the anomalies around the new line breaks.
- Compare with the anomalies seen in the original text.
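The five steps above could be sketched as follows. Everything here is a hypothetical stand-in, not the actual SLA/TLA code: `expand_m` and the greedy `rebreak` are illustrative helpers, the tokens are toy EVA-like words, and the anomaly measurement is left out.

```python
# Sketch of the proposed test: expand m -> iin, join the paragraph's
# lines into one token stream, and re-break it at rail width W.
# Hypothetical helpers; the real SLA anomaly scorer is not shown.

def expand_m(tokens):
    """Replace the abbreviation 'm' by 'iin' inside every token."""
    return [t.replace("m", "iin") for t in tokens]

def rebreak(tokens, W):
    """Greedily re-break a token stream into lines of at most W characters
    (counting one space between tokens)."""
    lines, line, width = [], [], 0
    for t in tokens:
        extra = len(t) + (1 if line else 0)
        if line and width + extra > W:
            lines.append(line)
            line, width = [], 0
            extra = len(t)
        line.append(t)
        width += extra
    if line:
        lines.append(line)
    return lines

# Usage: join lines, expand, re-break; the new breaks are then the places
# where one would measure anomalies and compare with the original breaks.
paragraph = [["daiin", "chedy", "qokeedy"], ["shedy", "qokaiin", "chol"]]
stream = expand_m([t for line in paragraph for t in line])
new_lines = rebreak(stream, W=20)
```

The greedy re-breaker is only one way to simulate the rail; a scribe-like model could also look ahead or hyphenate, which is exactly where the abbreviation trick in the quoted post would come in.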
Jorge_Stolfi > Yesterday, 01:18 AM
(19-04-2026, 06:29 PM)quimqu Wrote: What I get is that the real line breaks are still much stronger than the new ones in all cases, which is fine.
Quote:But the important part is that the SLA does not come out stronger than the plain TLA. On average it is actually slightly worse.
quimqu > Yesterday, 07:41 AM
(Yesterday, 01:18 AM)Jorge_Stolfi Wrote: I suppose you are using a neural network to measure the anomalies, so you cannot tell what exactly it is measuring?
nablator > Yesterday, 10:19 AM
(Yesterday, 07:41 AM)quimqu Wrote: It’s a simple score.
for _, r in fam_pos.iterrows():
    fam = r["family"]
    # Relative frequency of the family at line end, line start, and interior.
    p_end = r["end"] / tot_end if tot_end else 0.0
    p_start = r["start"] / tot_start if tot_start else 0.0
    p_int = r["interior"] / tot_int if tot_int else 0.0
    # Positive value = enriched at line edge relative to interior.
    A_end_score[fam] = np.log2(p_end / p_int) if p_end > 0 and p_int > 0 else np.nan
    A_start_score[fam] = np.log2(p_start / p_int) if p_start > 0 and p_int > 0 else np.nan
Jorge_Stolfi > Yesterday, 11:50 AM
(Yesterday, 07:41 AM)quimqu Wrote: I suppose you are using a neural network to measure the anomalies, so you cannot tell what exactly it is measuring?
I’m not using a neural net. It’s a simple score. Roughly, I learn what typical line endings look like and what typical line beginnings look like, at the character level. Then for any cut I score how “end-like” the left side is and how “start-like” the right side is, compared to the overall background.
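The kind of score described there could look roughly like this. This is a guess at the idea, not the poster's actual code: the character-level profiles, the add-eps smoothing, and the toy two-line corpus are all my own illustrative choices.

```python
import math
from collections import Counter

# Hypothetical sketch of a character-level cut score: learn which characters
# occur at real line ends (and line starts) versus everywhere in the text,
# then score a candidate cut by log-ratios against that background.

def char_profiles(lines):
    """Profiles from a list of lines (each a list of word tokens)."""
    end = Counter(line[-1][-1] for line in lines if line)        # line-final chars
    start = Counter(line[0][0] for line in lines if line)        # line-initial chars
    bg_end = Counter(w[-1] for line in lines for w in line)      # all word-final chars
    bg_start = Counter(w[0] for line in lines for w in line)     # all word-initial chars
    return end, start, bg_end, bg_start

def cut_score(left_word, right_word, profiles, eps=0.5):
    """How end-like the left side is plus how start-like the right side is,
    relative to the overall background (smoothed with eps)."""
    end, start, bg_end, bg_start = profiles
    p_end = (end[left_word[-1]] + eps) / (sum(end.values()) + eps)
    q_end = (bg_end[left_word[-1]] + eps) / (sum(bg_end.values()) + eps)
    p_start = (start[right_word[0]] + eps) / (sum(start.values()) + eps)
    q_start = (bg_start[right_word[0]] + eps) / (sum(bg_start.values()) + eps)
    return math.log2(p_end / q_end) + math.log2(p_start / q_start)

# Usage on a toy corpus: the cut at the real line break should score higher
# than a cut in the middle of a line.
profiles = char_profiles([["qok", "dam"], ["qok", "dam"]])
real = cut_score("dam", "qok", profiles)   # cut at an actual break
fake = cut_score("qok", "dam", profiles)   # cut inside a line
```

A word-level (rather than character-level) version would be the natural next step, but it needs far more data per type, which is presumably why the description above stays at the character level.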
quimqu > Yesterday, 02:50 PM
(Yesterday, 11:50 AM)Jorge_Stolfi Wrote: Sorry, I was confused by your use of the word "learn". What do you mean by that?
pfeaster > Today, 12:29 AM
(18-04-2026, 01:36 AM)Jorge_Stolfi Wrote: Is there significant evidence against claims (1)-(7)? Again, the mere observation of statistical anomalies around line breaks is not a valid argument, unless it can be shown that such anomalies cannot be simply side effects of the SLA.
quimqu > 21 minutes ago