Davidsch > 18-02-2017, 10:44 PM
ReneZ > 19-02-2017, 08:26 AM
Torsten > 19-02-2017, 10:10 AM
(19-02-2017, 08:26 AM)ReneZ Wrote: Many thanks for the effort!
I don't think that the conclusion can be drawn just yet.
The figure is defined by edit distance, and the edit distance of course depends fully on orthography.
For this Chinese text, upper case and lower case have been kept separate, and as a result the very frequent
word 'wo3' ends up in two spots at almost opposite points in the figure.
The intonation can be represented in different ways. Here, it is done with diacritics over the vowel, which is of course a valid option, but I wonder (from a closer look at the figure) if it was consistent.
Also, this graph considers composite words (two syllables) while the structure in Mandarin occurs at the level
of syllables.
The problem with these types of experiments is exactly the large number of permutations that one could try.
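
To make the orthography point concrete, here is a minimal sketch (not the code used for the figure, which is not shown in this thread) of how a plain Levenshtein distance reacts to case and to the choice of tone notation. The helper names `levenshtein` and `normalise` are made up for this illustration, and the assumption is that the graph links words by ordinary character-level edit distance.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalise(word: str, fold_case: bool) -> str:
    """Hypothetical pre-processing step: NFC-normalise, optionally lowercase."""
    word = unicodedata.normalize("NFC", word)
    return word.lower() if fold_case else word


# Same syllable, written with different case or a different tone notation.
for a, b in [("Wǒ", "wǒ"), ("wǒ", "wo3")]:
    for fold in (False, True):
        d = levenshtein(normalise(a, fold), normalise(b, fold))
        print(f"{a!r} vs {b!r}, fold_case={fold}: distance {d}")
```

With case kept separate, the capitalised and lowercase forms count as two distinct nodes one edit apart; folding case merges them. Mixing diacritic and digit tone marks adds further distance between spellings of the same syllable, which is why consistency matters.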
Sam G > 19-02-2017, 10:47 AM
Torsten > 19-02-2017, 10:51 AM
(19-02-2017, 10:47 AM)Sam G Wrote: Mandarin Chinese would only work if you split the words into separate syllables, as I stated earlier when I pointed out the similarity.
Vietnamese would probably be a better example, since it's already written this way yet also has a restricted phonotactic structure. Conveniently it is also written using a Roman alphabet-based orthography, so it probably wouldn't be hard to download a book-length Vietnamese text and generate a graph showing the word network encompassing all the words.
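
A rough sketch of how such a word network could be generated from a syllable-separated text, assuming words are split on whitespace and two words are linked when their edit distance is 1. The function name `build_network` and the toy token string are invented for illustration; this is not real Vietnamese text and not the tool used for the earlier graphs.

```python
import unicodedata
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_network(text: str, max_dist: int = 1) -> dict[str, set[str]]:
    """Map each distinct word to the set of words within max_dist edits."""
    # NFC-normalise and lowercase so that one spelling gives one node.
    words = {unicodedata.normalize("NFC", w).lower() for w in text.split()}
    network: dict[str, set[str]] = {w: set() for w in words}
    # Pairwise comparison is quadratic; fine for a sketch, slow for a full book.
    for a, b in combinations(sorted(words), 2):
        if levenshtein(a, b) <= max_dist:
            network[a].add(b)
            network[b].add(a)
    return network


# Toy input only: a handful of made-up syllable tokens.
sample = "ba bà bá ma mà ba"
for word, neighbours in sorted(build_network(sample).items()):
    print(word, "->", sorted(neighbours))
```

For a book-length text the vocabulary would need a faster neighbour search than the all-pairs loop, but the resulting adjacency map is what a graph-drawing tool would take as input.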
Sam G > 19-02-2017, 11:25 AM
Sam G > 19-02-2017, 11:30 AM
Davidsch > 20-02-2017, 11:28 AM
Torsten > 20-02-2017, 08:15 PM
(19-02-2017, 11:30 AM)Sam G Wrote: Actually, that last one is a translation of an English novel and has a lot of foreign names in it. This book might be better:
[link]
Here's a zip file of the book:
[link]
Also, there are some Vietnamese books here, but they're in ePub and Mobi format:
[link]
Torsten > 20-02-2017, 08:53 PM
(20-02-2017, 11:28 AM)Davidsch Wrote: Although nice to see, the method of showing many statistical tables with numbers and nice artificial network pictures does not show that the "auto copy theory" is based on anything scientific: it is still a picture and nothing more.
I could easily show big pictures to the contrary as well.
Interesting to see that lines using a word up to eight times exist in the VMS. Maybe I should use it to illustrate this fact.