quimqu > 3 hours ago
(5 hours ago) nablator Wrote: (Yesterday, 12:55 PM) quimqu Wrote: So I started wondering if the opposite approach might actually be more informative.
Or you could use a balanced approach where rare terms are upweighted to reflect their relative importance.
Quote: In addition to creating the A matrix as described, which uses straight TF values, two weighting schemes are also employed to modify the values contained in A. The two schemes applied are Term Frequency-Inverse Document Frequency (TF-IDF) and Log-Entropy (LE).
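For anyone who wants to try the two weightings from the quote, here is a minimal sketch in plain Python. It assumes A is a term-by-folio count matrix stored as a list of rows (one row per term). The quote does not give the exact formulas, so the variants below (IDF as log(N/df), and log-entropy as log(1 + tf) times one plus the normalized entropy term) are common textbook choices and an assumption on my part, not necessarily what the cited work used. The function names are made up for illustration.

```python
import math

def tf_idf(A):
    """TF-IDF weighting of a term-by-document count matrix (list of rows)."""
    n_docs = len(A[0])
    weighted = []
    for row in A:
        df = sum(1 for tf in row if tf > 0)         # documents containing the term
        idf = math.log(n_docs / df) if df else 0.0  # rarer terms get a larger weight
        weighted.append([tf * idf for tf in row])
    return weighted

def log_entropy(A):
    """Log-entropy weighting; assumes at least two documents (columns)."""
    n_docs = len(A[0])
    weighted = []
    for row in A:
        gf = sum(row)  # global frequency of the term
        entropy_sum = 0.0
        for tf in row:
            if tf > 0:
                p = tf / gf
                entropy_sum += p * math.log(p)
        # Global weight: 1 for a term confined to one document,
        # 0 for a term spread evenly over all documents.
        g = 1.0 + entropy_sum / math.log(n_docs)
        weighted.append([math.log(1 + tf) * g for tf in row])
    return weighted
```

Under both schemes a token that appears on a single folio keeps a high weight, while a token spread evenly over every folio is weighted down to zero, which is the "balanced approach" idea: rare terms are upweighted rather than discarded.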
quimqu > 3 hours ago
(4 hours ago) Dunsel Wrote: If I strip the gallows from those hapax tokens, that decline flattens out.
Dunsel > 3 hours ago
(3 hours ago) quimqu Wrote: My own objective was a bit different anyway. I was trying to see whether low-frequency tokens create bridges between folios and sections. That is why I did not use hapax legomena initially: a single occurrence cannot really create network structure or repeated connections between pages.
quimqu > 3 hours ago
(3 hours ago) Dunsel Wrote: (3 hours ago) quimqu Wrote: My own objective was a bit different anyway. I was trying to see whether low-frequency tokens create bridges between folios and sections. That is why I did not use hapax legomena initially: a single occurrence cannot really create network structure or repeated connections between pages.
But if one scribe creates a hapax in their work and that word then gets used or copied by another scribe, that creates a network between scribes. And keep in mind that there are more than just a few folios in Herbal that are not Currier A (Scribe 1). If you're testing all of Herbal, then you're mixing two regimes and will get a false network connecting to other Scribe 2 pages in other sections. The same goes for Scribe 1 and Scribe 3 in Pharma.
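One way to check the point about mixing regimes is to keep the scribe label on every folio and count rare-token links within a scribe separately from links across scribes. The sketch below is only an illustration under assumptions: `folios` (folio name to token list), `scribe_of` (folio name to scribe label), the function name, and the frequency window of 2 to 5 occurrences are all hypothetical, not anything taken from the actual analysis in this thread.

```python
from collections import Counter, defaultdict
from itertools import combinations

def rare_token_links(folios, scribe_of, min_count=2, max_count=5):
    """Count folio pairs that share rare tokens, split by same/different scribe.

    folios:    dict mapping folio name -> list of tokens (hypothetical input)
    scribe_of: dict mapping folio name -> scribe label (hypothetical input)
    """
    # Corpus-wide frequency of every token.
    totals = Counter(tok for toks in folios.values() for tok in toks)

    # For each rare token, the set of folios it appears on.
    pages_with = defaultdict(set)
    for folio, toks in folios.items():
        for tok in toks:
            if min_count <= totals[tok] <= max_count:
                pages_with[tok].add(folio)

    within, across = Counter(), Counter()
    for pages in pages_with.values():
        for a, b in combinations(sorted(pages), 2):
            if scribe_of[a] == scribe_of[b]:
                within[(a, b)] += 1   # link inside one scribe's pages
            else:
                across[(a, b)] += 1   # link bridging two scribes
    return within, across
```

Comparing the `within` and `across` counts section by section would show whether an apparent bridge between sections is really a bridge between pages by the same scribe. Setting `min_count=1` changes nothing for the links, since a true hapax sits on one folio only and can never form a pair.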
Dunsel > 2 hours ago
(3 hours ago) quimqu Wrote: I did a quick graph of how the rare tokens are distributed by writing hand (according to the EVA transliteration). There are plenty of connections, meaning that the use of the rare tokens cuts across scribes.