quimqu > 12-11-2025, 11:18 PM
bi3mw > 12-11-2025, 11:30 PM
Kaybo > 13-11-2025, 12:55 AM
quimqu > 13-11-2025, 11:44 AM
(12-11-2025, 11:30 PM)bi3mw Wrote: @quimqu: Although it's outside the scope of your current research, can you say anything about the (average) lengths of the individual word parts?
bi3mw > 13-11-2025, 12:01 PM
(13-11-2025, 11:44 AM)quimqu Wrote: Hello, I think this is what you are asking for, isn't it?
quimqu > 13-11-2025, 01:47 PM
(13-11-2025, 12:01 PM)bi3mw Wrote: Yes, that's exactly what I meant. Thank you. What I don't quite understand is the uneven distribution (counts) of the individual word segments. Is the number of stems really that low?
quimqu > 13-11-2025, 02:05 PM
(13-11-2025, 12:55 AM)Kaybo Wrote: Some suffixes are predominant in some parts of the manuscript. For example, "edy" is missing in the first 25 folios, then it is used very often in folio 26. After that its use changes from folio to folio, but mostly there is either heavy use or no use of this suffix.
This has nothing to do with the line as a function, but I want to mention it here because this imbalance also has an impact on all statistical analysis.
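A claim like this can be checked by counting, per folio, how many word tokens end in the suffix. The following is only a minimal sketch: the function name, the folio labels and the toy words are invented for illustration and are not taken from any real transcription.

```python
def suffix_use_by_folio(words_by_folio, suffix):
    """Return {folio: (hits, total, share)} for tokens ending in `suffix`.

    `words_by_folio` maps a folio label to its list of word tokens.
    `share` is hits / total, or 0.0 for a folio without words.
    """
    result = {}
    for folio, words in words_by_folio.items():
        total = len(words)
        hits = sum(1 for w in words if w.endswith(suffix))
        share = hits / total if total else 0.0
        result[folio] = (hits, total, share)
    return result


# Toy data, invented for illustration only (not real counts).
toy = {
    "f1r": ["daiin", "chol", "shol", "dain"],
    "f26r": ["chedy", "shedy", "qokedy", "daiin"],
}
use = suffix_use_by_folio(toy, "edy")
for folio, (hits, total, share) in use.items():
    print(folio, hits, total, round(share, 2))
```

Returning the raw counts next to the share makes it easier to see when a high or low share comes from a folio with very few words.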
bi3mw > 13-11-2025, 02:33 PM
(13-11-2025, 01:47 PM)quimqu Wrote: The number of stems is very small because the classification rules strongly favor prefix and suffix behavior. A segment is only marked as a stem when it appears mostly in the middle of words and also shows enough contextual variety on both sides.
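To make the rule described in this quote concrete, here is a rough sketch of a positional classifier. It is not quimqu's actual code: the function name, the `min_share` and `min_variety` thresholds, and the order of the checks are all assumptions, since the thread does not give the real values.

```python
def classify_segment(segment, words, min_share=0.6, min_variety=2):
    """Label `segment` as 'prefix', 'suffix', 'stem' or 'unclear'.

    Hypothetical rule: a segment is a stem only if most of its
    occurrences are word-internal AND it has at least `min_variety`
    distinct contexts on each side. Thresholds are assumptions.
    """
    start = middle = end = 0
    left_ctx, right_ctx = set(), set()
    for w in words:
        i = w.find(segment)
        while i != -1:
            j = i + len(segment)
            if i == 0 and j < len(w):
                start += 1                 # word-initial occurrence
            elif i > 0 and j == len(w):
                end += 1                   # word-final occurrence
            elif i > 0 and j < len(w):
                middle += 1                # word-internal occurrence
                left_ctx.add(w[:i])
                right_ctx.add(w[j:])
            # occurrences that span the whole word are ignored
            i = w.find(segment, i + 1)
    total = start + middle + end
    if total == 0:
        return "unclear"
    if start / total >= min_share:
        return "prefix"
    if end / total >= min_share:
        return "suffix"
    if (middle / total >= min_share
            and len(left_ctx) >= min_variety
            and len(right_ctx) >= min_variety):
        return "stem"
    return "unclear"


# Toy strings, invented for illustration only.
print(classify_segment("dy", ["qokedy", "chedy", "shedy"]))
print(classify_segment("ke", ["qokedy", "okeey", "chkeol"]))
```

Because the prefix and suffix checks come first and the stem check has two extra conditions, a rule of this shape will naturally produce few stems, which matches the behavior described in the quote.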
(13-11-2025, 02:05 PM)quimqu Wrote: Hello, thank you. This is right, but the algorithm does not take "edy" as a suffix:
Kaybo > 13-11-2025, 11:40 PM
(13-11-2025, 02:05 PM)quimqu Wrote: (13-11-2025, 12:55 AM)Kaybo Wrote: Some suffixes are predominant in some parts of the manuscript. For example, "edy" is missing in the first 25 folios, then it is used very often in folio 26. After that its use changes from folio to folio, but mostly there is either heavy use or no use of this suffix.
This has nothing to do with the line as a function, but I want to mention it here because this imbalance also has an impact on all statistical analysis.
Hello, thank you. This is right, but the algorithm does not take "edy" as a suffix:
=== SUFFIX_BASES (pass 2) ===
['dy', 'in', 'ey', 'hy', 'ry', 'aly', 'ees', 'om', 'oly', 'es', 'an', 'ly', 'oy', 'ed', 'ho', 'as', 'im', 'eor', 'eol', 'py']
This is the distribution of prefixes and suffixes per folio and side (note the peak at one page, which is only due to that page having just 3 words):
bi3mw > 13-11-2025, 11:48 PM
(13-11-2025, 11:40 PM)Kaybo Wrote: In how many different words do we find "edy"?
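One way to answer this is to separate word types (different words) from tokens (total occurrences). This is a small sketch only: the helper name and the toy word list are invented for illustration, and it assumes "edy" is meant as a word ending.

```python
from collections import Counter


def words_with_suffix(words, suffix):
    """Return a Counter mapping each word type ending in `suffix`
    to its number of occurrences (tokens)."""
    return Counter(w for w in words if w.endswith(suffix))


# Toy word list, invented for illustration only.
toy_words = ["chedy", "shedy", "chedy", "qokedy", "daiin", "chedy"]
edy_counts = words_with_suffix(toy_words, "edy")
n_types = len(edy_counts)             # different words ending in "edy"
n_tokens = sum(edy_counts.values())   # total occurrences
print(n_types, n_tokens, edy_counts.most_common())
```

If occurrences anywhere inside the word should count as well, the `endswith` test can be swapped for `suffix in w`.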