Mauro > 21-06-2025, 12:45 PM
(21-06-2025, 11:23 AM)Rafal Wrote: Yes, precise defining of "weird" words may be difficult.
But personally I am not very fond of looking for perfect definitions because it may put you into total block.
dashstofsk > 21-06-2025, 12:54 PM
nablator > 21-06-2025, 04:47 PM
Mark Knowles > 21-06-2025, 08:23 PM
Mark Knowles > 21-06-2025, 08:57 PM
Mauro > 21-06-2025, 10:07 PM
(21-06-2025, 08:23 PM)Mark Knowles Wrote: As I have stated elsewhere I am inclined to the view that the "weird" or abnormal words most likely constitute the true text of the manuscript.
Mark Knowles > 22-06-2025, 05:34 AM
(21-06-2025, 10:07 PM)Mauro Wrote: Considering only the 'normal' (=~ more frequent) words, instead, largely avoids this problem, but there is no obvious break in the frequency data to suggest where the cutoff should be set, resulting at a minimum in a high degree of arbitrariness (and of course, if @Mark Knowles's hypothesis above is the right one, studying the normal words would be a waste of time).
Studying the normal words would only be useful insofar as it helps to better identify which are normal "filler" words and which are real words. I would guess that around 20% of words are real words whilst the remaining 80% are just filler. This is just a guess; however, I would be surprised if fewer than 10% or more than 30% are real words.
Aga Tentakulus > 22-06-2025, 08:22 AM
dashstofsk > 22-06-2025, 11:27 AM
(22-06-2025, 05:34 AM)Mark Knowles Wrote: around 20% of words are real words
Mark Knowles > 22-06-2025, 12:51 PM
(22-06-2025, 11:27 AM)dashstofsk Wrote: (22-06-2025, 05:34 AM)Mark Knowles Wrote: around 20% of words are real words
You are piling up the improbabilities and suggesting that the manuscript is some horrendous mixture of valid text, fabrication, encypherment. This is too complicated. Consider it from the perspective of the authors. In addition to the task of having to write a thing of ~225 pages and ~36000 words using strange letters would they really have wanted to make things even more difficult by jumping between fabrication and encypherment? No, they would have wanted simplicity. Also afterwards they themselves would have found the manuscript difficult to read.
But did I read correctly that you think that 80% of the text might be bogus filler words? Why would they have wanted to waste so much parchment on meaningless words? But also look at the HerbalA1 pages. 95 pages of 8086 words gives an average of 85 words to a page. And then only 20% of those are real words, making 17. That is hardly enough to say anything of value about the plants the text is supposed to chronicle. All this pother just to hide the meaning of those 17 words?
Somehow it just doesn’t seem very clever of the authors.
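For anyone who wants to check the arithmetic in the quoted post, here is a minimal sketch. The page and word counts are the ones quoted above for HerbalA1, and the 10%, 20% and 30% shares are the guessed range from earlier in the thread. The function and constant names are made up for illustration.

```python
# Figures as quoted in the thread for the HerbalA1 pages.
HERBAL_A1_PAGES = 95
HERBAL_A1_WORDS = 8086


def words_per_page(words: int, pages: int) -> float:
    """Average number of words on a page."""
    return words / pages


def real_words_per_page(words: int, pages: int, share: float) -> float:
    """Average number of 'real' words on a page if only `share` of words are real."""
    return words_per_page(words, pages) * share


if __name__ == "__main__":
    avg = words_per_page(HERBAL_A1_WORDS, HERBAL_A1_PAGES)
    print(f"average words per page: {avg:.1f}")
    # The shares below are guesses from the thread, not measured values.
    for share in (0.10, 0.20, 0.30):
        real = real_words_per_page(HERBAL_A1_WORDS, HERBAL_A1_PAGES, share)
        print(f"real-word share {share:.0%}: about {real:.1f} real words per page")
```

With the 20% guess this gives roughly 17 real words per page, matching the figure in the post; the 10% and 30% bounds give roughly 8.5 and 25.5.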