22-08-2025, 09:28 AM
(22-08-2025, 08:42 AM)oshfdk Wrote: (22-08-2025, 07:53 AM)quimqu Wrote: The null hypothesis is:
H₀: topics are independent of language/hand.
I'm not sure this null hypothesis is valid, at least not when it's used for the languages. Since the languages were initially defined using properties of the text (relative abundance of various words and symbol combinations), and topic modeling uses the same properties, the null hypothesis describes a situation that is impossible a priori, so its p-value might not make any sense. I'm not a scientist though, and my experience with p-values is nearly zero.
For hands this is more interesting, but as far as I know, some correlation between hands and languages does exist? This is not really my area.
You’re right that the null hypothesis is problematic for languages, because both languages and topics are defined from the same textual features (word frequencies, symbol patterns, etc.). That means independence is impossible by construction, so I agree that the p-value wouldn’t really have a valid interpretation.
For the scribal hands, it’s different: hands are identified from handwriting features, not from lexical distributions (if I am not mistaken). So here the null hypothesis (that topics are independent of hands) makes sense, even if some correlation between hands and languages is already known.
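To make the hands case concrete, here is a minimal sketch of how H₀ (topics independent of hands) could be checked with a permutation test on the chi-square statistic. This is only an illustration, not necessarily the exact test used earlier in the thread: the function names and the toy labels (one dominant topic and one hand per page) are assumptions made up for the example, not real manuscript data.

```python
import random
from collections import Counter


def chi_square_stat(topics, hands):
    """Pearson chi-square statistic for the topic x hand contingency table."""
    n = len(topics)
    topic_counts = Counter(topics)
    hand_counts = Counter(hands)
    observed = Counter(zip(topics, hands))
    stat = 0.0
    # Sorted keys keep the summation order fixed across permutations.
    for t in sorted(topic_counts):
        for h in sorted(hand_counts):
            expected = topic_counts[t] * hand_counts[h] / n
            stat += (observed[(t, h)] - expected) ** 2 / expected
    return stat


def permutation_p_value(topics, hands, n_perm=2000, seed=0):
    """Estimate the p-value of H0 by shuffling the hand labels across pages."""
    rng = random.Random(seed)
    observed_stat = chi_square_stat(topics, hands)
    shuffled = list(hands)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any topic/hand link, keeps the marginals
        if chi_square_stat(topics, shuffled) >= observed_stat:
            hits += 1
    # The +1 correction keeps the estimate away from an exact zero.
    return observed_stat, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 40 pages, each with a dominant topic and a hand.
topics = [0] * 18 + [1] * 2 + [0] * 3 + [1] * 17
hands = ["A"] * 20 + ["B"] * 20

stat, p = permutation_p_value(topics, hands)
print(f"chi-square = {stat:.2f}, permutation p-value = {p:.4f}")
```

The same code would run on the language labels too, but as discussed above the resulting p-value would not mean much there, because the language labels and the topics are derived from the same textual features. For hands, a small p-value would still need to be read with the known hand/language correlation in mind.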