Mauro > Yesterday, 04:40 PM
Kaybo > Yesterday, 08:30 PM
(Yesterday, 04:40 PM) Mauro Wrote: I imagine the pain St. Thomas Aquinas had to endure when he wrote his Summa Theologica with such a limited vocabulary...
Mauro > Yesterday, 10:08 PM
(Yesterday, 08:30 PM) Kaybo Wrote: (Yesterday, 04:40 PM) Mauro Wrote: I imagine the pain St. Thomas Aquinas had to endure when he wrote his Summa Theologica with such a limited vocabulary...
Just to clarify for myself: the Voynich transcript has a lot of different words. More than are usually used in Latin? I mean, the whole text has 8000 different words according to the forum.
Jorge_Stolfi > 10 hours ago
(Yesterday, 10:08 PM) Mauro Wrote: The VMS (*) has 38411 words (**) in total, and 8424 unique words (***), that is to say 1 word type every ~4.56 word tokens. This is just a slightly higher percentage than for instance Caesar's De Bello Gallico (1 word type every ~4.67 word tokens).
Mauro > 7 hours ago
(10 hours ago) Jorge_Stolfi Wrote: If a language follows Zipf's law, the token/lexeme ratio in a sample cannot be a constant. As the number N of tokens (word occurrences) in a sample increases, the number M of lexemes (distinct words) grows like K*sqrt(N). More precisely, it grows like K*N**b, where b is typically between 0.4 and 0.6. This formula is known as [link not visible].
So, when comparing the VMS lexicon size to that of other languages, it is important to use samples with the same number of tokens.
Assuming the exponent b is 0.5 for both languages, the interesting language parameter (independent of sample size) is K = M/sqrt(N), not M/N.
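
To make the point above concrete, here is a rough Python sketch of the two steps: counting types over equal-sized samples, and computing K = M/sqrt(N). The relation M = K*N**b is commonly called Heaps' law (or Herdan's law). The function names `type_count` and `heaps_k` and the toy word lists are made up for illustration; only the VMS figures (38411 tokens, 8424 types) come from the thread, and b = 0.5 is the assumption stated above, not a fitted value.

```python
def type_count(tokens, n):
    """Number of distinct word types among the first n tokens."""
    return len(set(tokens[:n]))


def heaps_k(m, n, b=0.5):
    """K = M / N**b; with b = 0.5 this is M / sqrt(N)."""
    return m / n ** b


# Figures quoted in the thread for the VMS.
vms_tokens, vms_types = 38411, 8424
vms_ratio = vms_tokens / vms_types        # tokens per type, about 4.56
vms_k = heaps_k(vms_types, vms_tokens)    # about 42.98, assuming b = 0.5

# Toy word lists (hypothetical data): truncate both samples to the
# same number of tokens before comparing their type counts.
toy_a = "a b a c a b d a e a".split()
toy_b = "x y z x w v".split()
n = min(len(toy_a), len(toy_b))
types_a = type_count(toy_a, n)
types_b = type_count(toy_b, n)

print(f"VMS tokens/type = {vms_ratio:.2f}, K = {vms_k:.2f}")
print(f"toy samples at N = {n}: {types_a} vs {types_b} types")
```

To compare with De Bello Gallico in the same way, one would need its actual token and type counts (the thread only gives the ratio), or simply truncate both texts to the same N and compare `type_count` directly.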
Rafal > 4 hours ago