Repetition of words

RE: Repetition of words

Mauro > Yesterday, 04:40 PM

I imagine the pain St. Thomas Aquinas had to endure when he wrote his Summa Theologica with such a limited vocabulary...
RE: Repetition of words

Kaybo > Yesterday, 08:30 PM

(Yesterday, 04:40 PM)Mauro Wrote: I imagine the pain St. Thomas Aquinas had to endure when he wrote his Summa Theologica with such a limited vocabulary...

Just to clarify for myself: the Voynich transcript has a lot of different words. More than are usually used in Latin? I mean, the whole text has 8000 different words according to the forum.
RE: Repetition of words

Mauro > Yesterday, 10:08 PM

(Yesterday, 08:30 PM)Kaybo Wrote: Just to clarify for myself: the Voynich transcript has a lot of different words. More than are usually used in Latin? I mean, the whole text has 8000 different words according to the forum.

The VMS (*) has 38411 words (**) in total and 8424 unique words (***), that is to say 1 word type every ~4.56 word tokens. That is only a slightly higher proportion of unique words than in, for instance, Caesar's De Bello Gallico (1 word type every ~4.67 word tokens). Consider also that De Bello Gallico is a much longer text (~51000 word tokens in total), and in longer texts the proportion of unique word types is expected to decrease. If anything, I'd say De Bello Gallico is slightly more varied in 'vocabulary' than the VMS.

(*) Rf1a-n transcription, words with question marks removed
(**) word tokens
(***) word types
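
For anyone who wants to check the arithmetic in this post, here is a minimal Python sketch. The VMS token and type counts are the ones quoted above. The function name tokens_per_type is just an illustrative choice. The De Bello Gallico type count is not given in the thread, so it is only back-computed from the approximate figures (~51000 tokens, ~4.67 tokens per type) and should be treated as an estimate.

```python
def tokens_per_type(n_tokens: int, n_types: int) -> float:
    """Average number of word tokens per distinct word type."""
    if n_types <= 0:
        raise ValueError("n_types must be positive")
    return n_tokens / n_types


# Counts quoted in the post (Rf1a-n transcription, words with '?' removed).
VMS_TOKENS = 38411
VMS_TYPES = 8424

vms_ratio = tokens_per_type(VMS_TOKENS, VMS_TYPES)
print(f"VMS: 1 word type every ~{vms_ratio:.2f} word tokens")

# De Bello Gallico: only approximate figures appear in the post, so the
# type count below is an estimate derived from them, not a measured value.
DBG_TOKENS_APPROX = 51000
DBG_RATIO_APPROX = 4.67
dbg_types_estimate = round(DBG_TOKENS_APPROX / DBG_RATIO_APPROX)
print(f"De Bello Gallico: roughly {dbg_types_estimate} word types (estimate)")
```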
RE: Repetition of words

Jorge_Stolfi > 10 hours ago

(Yesterday, 10:08 PM)Mauro Wrote: The VMS (*) has 38411 words (**) in total and 8424 unique words (***), that is to say 1 word type every ~4.56 word tokens. That is only a slightly higher proportion of unique words than in, for instance, Caesar's De Bello Gallico (1 word type every ~4.67 word tokens).

If a language follows Zipf's law, the token/lexeme ratio in a sample cannot be constant. As the number N of tokens (word occurrences) in a sample increases, the number M of lexemes (distinct words) grows like K*sqrt(N), or more precisely like K*N**b, where b is typically between 0.4 and 0.6. This formula is known as Heaps' law.

So, when comparing the VMS lexicon size to that of other languages, it is important to use samples with the same number of tokens.

Assuming the exponent b is 0.5 for both languages, the interesting language parameter (independent of sample size) is K = M/sqrt(N), not M/N.

All the best, --stolfi
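
A quick sketch of the comparison suggested in this post, in Python. It assumes b = 0.5 for both texts, as the post does. The function names heaps_k and types_in_prefix are illustrative, not from any library. The De Bello Gallico type count is only an estimate back-computed from the approximate figures given earlier in the thread (~51000 tokens, ~4.67 tokens per type), so the second value is rough.

```python
def heaps_k(n_tokens: int, n_types: int, b: float = 0.5) -> float:
    """Solve M = K * N**b for K, given N tokens and M distinct words."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return n_types / n_tokens ** b


def types_in_prefix(tokens: list[str], n: int) -> int:
    """Number of distinct words among the first n tokens of a text."""
    return len(set(tokens[:n]))


# VMS counts as quoted in the thread.
k_vms = heaps_k(38411, 8424)

# De Bello Gallico: the type count is an estimate (assumption), derived
# from ~51000 tokens and ~4.67 tokens per type.
dbg_types_estimate = round(51000 / 4.67)
k_dbg = heaps_k(51000, dbg_types_estimate)

print(f"K (VMS)              ~ {k_vms:.1f}")
print(f"K (De Bello Gallico) ~ {k_dbg:.1f}")
```

With these figures K comes out at roughly 43 for the VMS and roughly 48 for De Bello Gallico. That depends on b really being 0.5 for both texts, so the safer check is the one described in the post: count types in samples of equal length, for example types_in_prefix(vms_tokens, 38000) against types_in_prefix(dbg_tokens, 38000), where both token lists are hypothetical inputs you would have to load yourself.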
RE: Repetition of words

Mauro > 7 hours ago

(10 hours ago)Jorge_Stolfi Wrote: If a language follows Zipf's law, the token/lexeme ratio in a sample cannot be constant. As the number N of tokens (word occurrences) in a sample increases, the number M of lexemes (distinct words) grows like K*sqrt(N), or more precisely like K*N**b, where b is typically between 0.4 and 0.6. This formula is known as Heaps' law.

So, when comparing the VMS lexicon size to that of other languages, it is important to use samples with the same number of tokens.

Assuming the exponent b is 0.5 for both languages, the interesting language parameter (independent of sample size) is K = M/sqrt(N), not M/N.

Thank you, I didn't know about Heaps' law.
RE: Repetition of words

Rafal > 4 hours ago

Actually, there is a big article on Wikipedia about repeated words in different languages.

I would say that repeated words are quite common, but not so much in European languages. They are often used to mark the plural.