23-04-2026, 09:32 AM
@ dashstofsk
Yes, those are exactly the statistical effects I mentioned earlier as a potential problem - thanks for pointing that out. I'm always glad when someone flags these, because errors of this kind can cause a lot of unnecessary work.
Here's how I'd propose we calculate it: we divide the pages into three groups by "p" density - "p"-poor pages, pages with a moderate "p" density, and pages where "p" occurs frequently. Then, within each group, we compute the effect separately for P-starters and non-P-starters, and check whether the P-starter effect is homogeneous across the three groups. If it varies systematically with p-density, we've identified a confound rather than a genuine phenomenon; if it holds up consistently, the effect is robust. Would this approach be acceptable to you?
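To make the proposal concrete, here is a rough Python sketch of the stratified comparison. It is only an illustration under some assumptions of mine: the input layout (`page_density` as a page-to-density mapping, `observations` as `(page, is_p_starter, value)` tuples), the function names, the split into equal-sized terciles, and the choice of "difference in means" as the effect measure are all hypothetical and can be swapped for whatever we agree on.

```python
from collections import defaultdict
from statistics import mean


def assign_strata(page_density):
    """Label each page "p-poor", "p-moderate" or "p-rich".

    Assumption: the three groups are terciles, i.e. pages are ranked by
    "p" density and cut into three roughly equal-sized groups.
    """
    ranked = sorted(page_density, key=page_density.get)
    n = len(ranked)
    labels = {}
    for rank, page in enumerate(ranked):
        if rank < n / 3:
            labels[page] = "p-poor"
        elif rank < 2 * n / 3:
            labels[page] = "p-moderate"
        else:
            labels[page] = "p-rich"
    return labels


def stratified_effect(page_density, observations):
    """Return the P-starter effect within each density group.

    observations: iterable of (page, is_p_starter, value) tuples, where
    `value` is whatever quantity is being measured (hypothetical here).
    Assumption: the effect is mean(P-starters) - mean(non-P-starters).
    A group with no P-starters or no non-P-starters gets None.
    """
    labels = assign_strata(page_density)
    groups = defaultdict(lambda: {True: [], False: []})
    for page, is_p_starter, value in observations:
        groups[labels[page]][bool(is_p_starter)].append(value)

    effects = {}
    for name, by_starter in groups.items():
        if by_starter[True] and by_starter[False]:
            effects[name] = mean(by_starter[True]) - mean(by_starter[False])
        else:
            effects[name] = None
    return effects
```

If the three returned values are close to each other, that would point to a robust effect; if they rise or fall steadily from "p-poor" to "p-rich", that would point to the p-density confound. A proper homogeneity test would still be needed on top of this, since the sketch only computes the per-group differences.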