Emma May Smith > 20-09-2016, 11:55 AM
(20-09-2016, 08:34 AM)MarcoP Wrote: In general, I think the problem of defining word structure (similarly to Philip Neal's regex) is contiguous but not identical to the identification of "character classes". For instance, in Latin the phonetically similar "n" and "m" have very different positional statistics. The two letters have roughly the same number of occurrences, but "n" appears as the last letter in about 1% of the words that contain it, while "m" appears as the last letter in about 50% of the words with at least an "m".
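The positional statistic in the quote (how often a letter is the last letter of the words that contain it) could be computed along these lines. This is only a minimal sketch: the function name final_letter_share is hypothetical, every word token is assumed to count once, and the toy sample is invented for illustration, so it will not reproduce the 1% and 50% figures quoted above.

```python
def final_letter_share(words, letter):
    """Fraction of the words containing `letter` that also end with it."""
    containing = [w for w in words if letter in w]
    if not containing:
        return 0.0  # letter never occurs, so there is nothing to measure
    ending = sum(w.endswith(letter) for w in containing)
    return ending / len(containing)


# Toy sample, invented for illustration only (assumption: one count per token).
sample = "in principio erat verbum et verbum erat apud deum non enim".split()

for letter in "nm":
    print(letter, final_letter_share(sample, letter))
```

Counting word types instead of tokens would only require passing set(sample) rather than the full list.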
Sam G > 20-09-2016, 08:26 PM
ThomasCoon > 21-09-2016, 02:11 AM
-JKP- > 21-09-2016, 02:57 AM
MarcoP > 04-10-2016, 08:36 AM
ThomasCoon > 04-10-2016, 02:55 PM
Sam G > 04-10-2016, 03:10 PM
(04-10-2016, 08:36 AM)MarcoP Wrote: For each character, I have computed a set of 10 frequencies (in the range 0..1, corresponding to a percentage of the total number of occurrences of the character):
- occurrences before a
- occurrences after a
- occurrences before d
- occurrences after d
- occurrences before k
- occurrences after k
- occurrences before l
- occurrences after l
- occurrences before o
- occurrences after o
Interesting... can I ask why you chose these five letters?
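For readers who want to try the ten frequencies from MarcoP's list themselves, here is a rough sketch of one way to compute them. It is not MarcoP's actual code: the name neighbour_profile is hypothetical, and it assumes that adjacency is counted only inside words (never across a word break) and that each frequency is divided by the total number of occurrences of the character.

```python
from collections import Counter

REFERENCE = "adklo"  # the five reference letters from the list


def neighbour_profile(words, reference=REFERENCE):
    """Map each character to its ten 'before x' / 'after x' shares (0..1)."""
    totals = Counter()  # occurrences of each character
    pairs = Counter()   # occurrences of each adjacent pair (left, right)
    for word in words:
        totals.update(word)
        pairs.update(zip(word, word[1:]))  # assumption: pairs within a word only

    profile = {}
    for char, total in totals.items():
        features = {}
        for ref in reference:
            features[f"before {ref}"] = pairs[(char, ref)] / total
            features[f"after {ref}"] = pairs[(ref, char)] / total
        profile[char] = features
    return profile


# Toy input, invented for illustration; a real run would use a full transcription.
toy = ["daiin", "okal", "ol"]
for name, value in neighbour_profile(toy)["l"].items():
    print(name, value)
```

Each character ends up with a 10-value vector, so characters can then be compared by how similar their vectors are.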
MarcoP > 04-10-2016, 03:26 PM
(04-10-2016, 02:55 PM)ThomasCoon Wrote: Thanks, Marco! I wish I was more literate in statistics, so I could understand more - could you tell me what you found in a more plaintext format?
(04-10-2016, 03:10 PM)Sam G Wrote: Interesting... can I ask why you chose these five letters?
(04-10-2016, 03:10 PM)Sam G Wrote: As far as the results, I found it surprising that g is more common after l than it is after o.
Davidsch > 04-10-2016, 03:40 PM
MarcoP > 04-10-2016, 04:57 PM
(04-10-2016, 03:40 PM)Davidsch Wrote: Nice work Marco, please take into account that there are more positions of the characters possible.
Your view shows only those positions that you chose,
which means that the distribution is distorted, because the frequencies are not relative to all other letters and positions.