MichelleL11 > 04-05-2022, 04:19 PM
(04-05-2022, 01:32 PM)Koen G Wrote:
(04-05-2022, 12:22 PM)Bernd Wrote: Koen, do any of the Latin texts you analyzed contain scribal abbreviations, e.g. symbols for -us, -um, -bus, et?
One thing I can try is to take a normalized Latin text and introduce abbreviation symbols by replacing certain letter groups with numerals.
1 = con, com, cun, cum
2 = tur, ur
3 = us, os
4 = ris, tis, cis
Doing this removes some information from the text, because when we now see "4", we must guess from context whether it represents ris, cis or tis. One might therefore expect at least one of the entropy statistics to drop. However, all of them increase.
h0: 4.64 -> 4.86
h1: 4.01 -> 4.15
h2: 3.31 -> 3.38
It was to be expected that h1 would increase, since we introduce several new, frequent symbols.
h2 increases as well, probably in part because the non-abbreviated parts of the Latin text still behave as normal. Moreover, abbreviation condenses the text, which is also likely to increase h2.
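For anyone who wants to try this kind of experiment, here is a minimal Python sketch. It is not the script behind the numbers above, and it rests on several assumptions: substitution happens in a single left-to-right pass with longer letter groups tried first (so "tur" wins over "ur"), every character including the space counts as a symbol, h0 is log2 of the alphabet size, h1 is the single-character entropy, and h2 is the conditional entropy of a character given the preceding one. The function names are made up for illustration.

```python
import math
import re
from collections import Counter

# Abbreviation table from the post: numeral -> letter groups it replaces.
ABBREVIATIONS = {
    "1": ["con", "com", "cun", "cum"],
    "2": ["tur", "ur"],
    "3": ["us", "os"],
    "4": ["ris", "tis", "cis"],
}

# Reverse lookup: letter group -> numeral.
_LOOKUP = {g: sym for sym, groups in ABBREVIATIONS.items() for g in groups}
# Assumption: longer groups are tried first at each position.
_PATTERN = re.compile("|".join(sorted(_LOOKUP, key=len, reverse=True)))


def abbreviate(text):
    """Replace each matching letter group with its numeral in one pass."""
    return _PATTERN.sub(lambda m: _LOOKUP[m.group(0)], text)


def entropy(counts):
    """Shannon entropy in bits of a Counter of observed events."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def entropy_stats(text):
    """Return (h0, h1, h2) for a string, under the definitions assumed above."""
    h0 = math.log2(len(set(text)))
    h1 = entropy(Counter(text))
    # Conditional entropy H(next | previous) = H(pair) - H(previous).
    pairs = Counter(zip(text, text[1:]))
    h2 = entropy(pairs) - entropy(Counter(text[:-1]))
    return h0, h1, h2


if __name__ == "__main__":
    sample = "dominus dicitur consul"  # toy input, far too short for real statistics
    print(abbreviate(sample))          # domin3 dici2 1sul
    print(entropy_stats(sample))
    print(entropy_stats(abbreviate(sample)))
```

To compare before and after on a real corpus, read the normalized text into one string and call `entropy_stats` on it and on `abbreviate` of it. The toy sample here only shows the mechanics.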
Koen G > 04-05-2022, 04:44 PM
ReneZ > 04-05-2022, 06:00 PM
Koen G > 04-05-2022, 06:26 PM
(04-05-2022, 06:00 PM)ReneZ Wrote: It is really just a simple matter of information content divided by text length.
Bernd > 05-05-2022, 11:27 AM
(04-05-2022, 01:32 PM)Koen G Wrote: Generally speaking, abbreviation symbols will take a text's entropy statistics further away from the VM, so this is not something I am concerned about.
Thank you, that's to be expected.
cvetkakocj@rogers.com > 05-05-2022, 05:17 PM
(04-05-2022, 06:00 PM)ReneZ Wrote: I'm not sure how to put it differently:
Abbreviations increase entropy.
Verbosity decreases entropy.
It is really just a simple matter of information content divided by text length.
ReneZ > 05-05-2022, 06:11 PM
(04-05-2022, 06:26 PM)Koen G Wrote:
(04-05-2022, 06:00 PM)ReneZ Wrote: It is really just a simple matter of information content divided by text length.
In practice, abbreviation in manuscripts often destroys information though (the same symbol replacing various strings).
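A small Python sketch of that information loss, using the numeral substitutions posted earlier in the thread. The word list and the function names are illustrative assumptions, not taken from any analyzed text, and trying longer letter groups first is likewise an assumption.

```python
import re
from collections import defaultdict

# Letter group -> numeral, as listed earlier in the thread.
GROUP_TO_SYMBOL = {
    "con": "1", "com": "1", "cun": "1", "cum": "1",
    "tur": "2", "ur": "2",
    "us": "3", "os": "3",
    "ris": "4", "tis": "4", "cis": "4",
}
# Assumption: longer groups are tried first at each position.
PATTERN = re.compile("|".join(sorted(GROUP_TO_SYMBOL, key=len, reverse=True)))


def abbreviate(word):
    """Replace each matching letter group with its numeral in one pass."""
    return PATTERN.sub(lambda m: GROUP_TO_SYMBOL[m.group(0)], word)


def collisions(words):
    """Map each abbreviated form to the distinct words that collapse onto it."""
    by_form = defaultdict(set)
    for word in words:
        by_form[abbreviate(word)].add(word)
    return {form: sorted(ws) for form, ws in by_form.items() if len(ws) > 1}


if __name__ == "__main__":
    words = ["dominus", "dominos", "artis", "arcis", "amor"]
    # Prints {'domin3': ['dominos', 'dominus'], 'ar4': ['arcis', 'artis']}
    print(collisions(words))
```

Every form in the result can only be expanded again by guessing from context, which is the sense in which the substitution destroys information.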
Searcher > 07-05-2022, 02:38 PM
Koen G > 20-05-2022, 10:45 AM
(20-05-2022, 09:34 AM)Searcher Wrote: But what about applying the transformations first, and then removing the spaces?
Bernd > 20-05-2022, 02:22 PM