RobGea > 03-07-2022, 03:30 PM
Type, total_words, type_vocab, unshared_vocab, unshared_vocab as % of type_vocab, Rank
Ha, 8054, 2516, 1460, 58.028%, R1
Hb, 3522, 1353, 474, 35.033%, R8
S, 10851, 3072, 1662, 54.101%, R2
B, 6376, 1471, 618, 42.012%, R4
P, 2555, 1132, 472, 41.696%, R5
A, 876, 611, 238, 38.952%, R7
Z, 1291, 767, 343, 44.719%, R3
T, 3108, 1279, 448, 35.027%, R9
C, 2213, 1101, 436, 39.600%, R6
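The percentage and rank columns in the table above can be recomputed from the three count columns. The following is a minimal sketch, not the script used for the post: the names `rows`, `unshared_share`, `shares`, `ranking` and `ranks` are made up for illustration, and the last printed digit may differ from the table, which appears to truncate rather than round.

```python
# Counts copied from the table above: (total_words, type_vocab, unshared_vocab).
rows = {
    "Ha": (8054, 2516, 1460),
    "Hb": (3522, 1353, 474),
    "S": (10851, 3072, 1662),
    "B": (6376, 1471, 618),
    "P": (2555, 1132, 472),
    "A": (876, 611, 238),
    "Z": (1291, 767, 343),
    "T": (3108, 1279, 448),
    "C": (2213, 1101, 436),
}


def unshared_share(type_vocab, unshared_vocab):
    """Unshared vocabulary as a percentage of the section's own vocabulary."""
    return 100.0 * unshared_vocab / type_vocab


# Percentage per section, then sections ordered from highest to lowest share.
shares = {name: unshared_share(tv, uv) for name, (_, tv, uv) in rows.items()}
ranking = sorted(shares, key=shares.get, reverse=True)
ranks = {name: position + 1 for position, name in enumerate(ranking)}

for name in ranking:
    print(f"{name}: {shares[name]:.3f}% R{ranks[name]}")
```

Ranking on `unshared_vocab / type_vocab` reproduces the R1 to R9 order given in the table.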
Ruby Novacna > 03-07-2022, 05:46 PM
(03-07-2022, 03:30 PM)RobGea Wrote: -Stars at R2
R3 ?
(03-07-2022, 03:30 PM)RobGea Wrote: The Herbal Type was further split into 2 types
Shouldn't the text pages also be split between A and B?
Torsten > 03-07-2022, 07:10 PM
(03-07-2022, 03:30 PM)RobGea Wrote: Observations:
-HerbalA has the most unshared words, as expected because it is CurrierA.
-Pharma is also CurrierA so its position at R5 is unexpected.
-Stars at R2 is 10% higher than the next rank, an anomaly with no obvious explanation.
Type, total_words, type_vocab, unshared_vocab, type_vocab as % of total_words, unshared_vocab as % of total_words
Herbal (A), 8054, 2516, 1460, 31.2%, 18.2%
Pharma (A), 2555, 1132, 472, 44.3%, 18.5%
Astro, 876, 611, 238, 69.7%, 27.2%
Zodiac, 1291, 767, 343, 59.4%, 26.6%
Cosmo, 2213, 1101, 436, 49.8%, 19.7%
Text, 3108, 1279, 448, 41.2%, 14.4%
Herbal (B), 3522, 1353, 474, 38.4%, 13.5%
Stars (B), 10851, 3072, 1662, 28.3%, 15.3%
Bio (B), 6376, 1471, 618, 23.1%, 9.7%
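Both percentage columns in this table divide by total_words rather than by type_vocab. A short sketch of that arithmetic follows, assuming only the counts listed above; the names `sections`, `ttr` and `unshared_per_token` are illustrative and not from the post, and a recomputed value can differ from the posted one in the last digit.

```python
# Counts copied from the table above: (total_words, type_vocab, unshared_vocab).
sections = {
    "Herbal (A)": (8054, 2516, 1460),
    "Pharma (A)": (2555, 1132, 472),
    "Astro": (876, 611, 238),
    "Zodiac": (1291, 767, 343),
    "Cosmo": (2213, 1101, 436),
    "Text": (3108, 1279, 448),
    "Herbal (B)": (3522, 1353, 474),
    "Stars (B)": (10851, 3072, 1662),
    "Bio (B)": (6376, 1471, 618),
}

ttr = {}                 # type_vocab as % of total_words
unshared_per_token = {}  # unshared_vocab as % of total_words
for name, (total, vocab, unshared) in sections.items():
    ttr[name] = 100.0 * vocab / total
    unshared_per_token[name] = 100.0 * unshared / total

# Print sections from shortest to longest so the effect of length is visible.
for name in sorted(sections, key=lambda n: sections[n][0]):
    total = sections[name][0]
    print(f"{name}: {total} words, {ttr[name]:.1f}%, {unshared_per_token[name]:.1f}%")
```

Sorting by section length makes it easy to check how much of the variation in either percentage tracks the amount of text in a section.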
RobGea > 03-07-2022, 08:52 PM
ReneZ > 04-07-2022, 03:07 PM
R. Sale > 04-07-2022, 10:26 PM
Torsten > 05-07-2022, 12:09 AM
(04-07-2022, 10:26 PM)R. Sale Wrote: Can you list the unique vords of Herbal A and determine a frequency of use for those that were used multiple times within that section? In other words, is there a specific set of vords that uniquely define the "topics" of Herbal A.
1262 vords only occur once
107 x two times
28 x three times
12 x four times
8 x five times
1 x six times
1 x eight times
2 x nine times ('dsho', 'cthom')
3 x 13 times ('qotchol', 'cthaiin', 'choiin')
1 x 14 times ('qotchor')
1 x 15 times ('ctho')
581 vords only occur once
38 x two times
6 x three times
4 x four times
1 x five times
1 x six times
2 x seven times ('qoly', 'rshedy')
1 x ten times ('qolchedy')
1496 vords only occur once
120 x two times
27 x three times
12 x four times
4 x five times
2 x 6 times ('chedam', 'oteal')
2 x 8 times ('lkeeey', 'oteeey')
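Lists like the ones above group a section's unshared vords by how often each one occurs. A minimal sketch of that grouping is given below, assuming the section is available as a list of tokens and its unshared vocabulary as a set. The function name `frequency_profile` and the toy input are hypothetical and do not reproduce the real counts.

```python
from collections import Counter


def frequency_profile(tokens, unshared):
    """Map occurrence count -> unshared words used that many times in the section."""
    counts = Counter(token for token in tokens if token in unshared)
    profile = {}
    for word, n in counts.items():
        profile.setdefault(n, []).append(word)
    # Order by occurrence count, lowest first, as in the lists above.
    return dict(sorted(profile.items()))


# Toy input for illustration only, not real transliteration data.
tokens = "ctho daiin ctho chol dsho ctho dsho".split()
unshared = {"ctho", "dsho"}

profile = frequency_profile(tokens, unshared)
for n, words in profile.items():
    print(f"{len(words)} x {n} times {words}")
```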
Ruby Novacna > 05-07-2022, 08:12 AM
(03-07-2022, 03:30 PM)RobGea Wrote: Vocabulary size by Illustration Type
As I don't usually do statistical calculations, I find it hard to follow: what precise point should this calculation of unshared words clarify? I couldn't find an explanation before the presentation of the results.
Torsten > 05-07-2022, 12:16 PM
(04-07-2022, 10:26 PM)R. Sale Wrote: Can the herbals be combined and then combined with pharma to look for unique common terms not found in other parts of the VMs? If the whole botany and Pharma bit were all about leaves, then that would be a shared term probably not used in cosmic and zodiac parts.
438 x occurs only once
19 x two times
2 x three times ('olchor','shockhey')
1700 x occurs only once
154 x two times
47 x three times
16 x four times
11 x five times
2 x six times (unique for Herbal A + Pharma: 'ctheody')
1 x seven times (unique for Herbal A + Pharma: 'dom')
1 x eight times
2 x nine times
3 x 13 times
1 x 14 times
1 x 15 times
408 x occurs only once
10 x two times
1 x four times ('chekedy')
1670 x occurs only once
146 x two times
35 x three times
17 x four times
9 x five times
1 x six times
2 x eight times (unique for Herbal A + B: 'tchody')
2 x nine times
3 x 13 times
1 x 14 times
1 x 15 times
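Combining two sections before looking for unshared vords, as asked in the quoted question, comes down to merging their token lists and repeating the set difference against all remaining sections. The sketch below assumes each section is a list of tokens. The function name `unshared_vocab`, its `merge` parameter and the toy data are hypothetical and are not taken from the script behind the lists above.

```python
def unshared_vocab(sections, merge=()):
    """Return, per section, the words that occur in no other section.

    sections: dict mapping section name -> list of tokens.
    merge: section names to pool into one combined section first.
    """
    pooled = dict(sections)
    if merge:
        combined = []
        for name in merge:
            combined.extend(pooled.pop(name))
        pooled["+".join(merge)] = combined

    vocab = {name: set(tokens) for name, tokens in pooled.items()}
    result = {}
    for name, words in vocab.items():
        others = set().union(*(v for n, v in vocab.items() if n != name))
        result[name] = words - others
    return result


# Toy data for illustration only, not real transliteration data.
toy = {
    "HerbalA": ["dom", "chol", "daiin"],
    "Pharma": ["dom", "okol", "daiin"],
    "Bio": ["qol", "chol", "daiin"],
}

separate = unshared_vocab(toy)
combined = unshared_vocab(toy, merge=("HerbalA", "Pharma"))

# Words that become unshared only once the two sections are pooled.
newly_unique = combined["HerbalA+Pharma"] - separate["HerbalA"] - separate["Pharma"]
print(sorted(newly_unique))
```

In the toy data, 'dom' occurs in both pooled sections and nowhere else, so it only shows up as unshared after merging. That is the kind of word the "unique for Herbal A + Pharma" annotations point at.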