Mauro > 29-11-2024, 10:08 AM
(29-11-2024, 08:45 AM)ReneZ Wrote: Interesting, thank you!
At first I was a bit confused as the 'first missing word' did not match mine, but I based the checks on the reference transliteration (RF-1a).
My present thinking (as reflected in the music paper) leans more toward a looped grammar.
I called the result of each loop a 'word chunk'.
However, I do not yet have a good result.
I wondered, after first seeing M. Zattera's work, whether the efficiency figure penalises the results too much.
After all, a perfect word generation rule should not be expected to exist.
But I also don't have a better suggestion.
oshfdk > 29-11-2024, 11:31 AM
(27-11-2024, 04:19 PM)Mauro Wrote: However, I have recently developed a word grammar or, better, a family of grammars, which I would like to share, together with a comparison with the grammars proposed by ThomasCoon, Zattera and Stolfi.
nablator > 29-11-2024, 01:24 PM
(29-11-2024, 10:08 AM)Mauro Wrote: (voynichese.com transcription, words with 'rare' characters ('g', 'x', 'v', 'z' and 'c', 'h' appearing alone) excluded, 7700 total words remaining)
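A minimal sketch of the exclusion rule quoted above, not Mauro's actual procedure. It assumes that 'c' or 'h' "appearing alone" means a 'c' or 'h' left over once the usual EVA combinations (ch, sh, cth, ckh, cph, cfh) are removed. The function name `has_rare_character` and the sample words are made up for illustration.

```python
import re

# Assumption: these are the combinations in which 'c' and 'h' normally occur.
BENCH_GROUPS = re.compile(r"c[tkpf]h|[cs]h")

def has_rare_character(word):
    """Return True if the word should be excluded under the quoted rule."""
    # 'g', 'x', 'v', 'z' are treated as rare wherever they appear.
    if any(ch in word for ch in "gxvz"):
        return True
    # Remove the regular combinations; any 'c' or 'h' still present stands alone.
    leftover = BENCH_GROUPS.sub("", word)
    return "c" in leftover or "h" in leftover

sample = ["daiin", "chedy", "shedy", "ckhy", "dacy", "xar"]
kept = [w for w in sample if not has_rare_character(w)]
```

With these sample words, "dacy" and "xar" are dropped and the other four are kept.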
Mauro > 29-11-2024, 01:41 PM
(29-11-2024, 11:31 AM)oshfdk Wrote: (27-11-2024, 04:19 PM)Mauro Wrote: However, I have recently developed a word grammar or, better, a family of grammars, which I would like to share, together with a comparison with the grammars proposed by ThomasCoon, Zattera and Stolfi.
Looks great! I'd like to better understand the implications of your results wrt statistical properties of Voynichese. Could you publish the wordlist that was used as the basis of the grammar/efficiency computations? Sorry for my ignorance if there is already some "standard" list of words for this task. I assume I could just take the EVA file and split by periods, but then there are many variables to consider: like what to do with ambiguous readings, ligatures, half spaces, weirdos, etc.
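To make the "many variables" concrete, here is a rough sketch of splitting one line of EVA text into words. It is not the procedure used for the grammar computations. The conventions are assumptions: '.' marks a word space, ',' marks an uncertain (half) space, and `[a:o]` marks an ambiguous reading, resolved here by keeping the first alternative. The function name `split_words` and the sample line are illustrative only.

```python
import re

def split_words(line, split_on_comma=True):
    """Split one line of EVA text into a list of words.

    Assumed conventions: '.' = word space, ',' = uncertain space,
    '[a:o]' = ambiguous reading (the first alternative is kept).
    """
    # Resolve ambiguous readings by keeping the first alternative.
    line = re.sub(r"\[([^:\]]*):[^\]]*\]", r"\1", line)
    if split_on_comma:
        # Treat uncertain spaces as real word breaks.
        words = re.split(r"[.,]", line)
    else:
        # Treat uncertain spaces as no break: join the two halves.
        words = [w.replace(",", "") for w in line.split(".")]
    return [w for w in words if w]

line = "fachys.ykal,ar.ot[a:o]iin"
strict = split_words(line, split_on_comma=True)
loose = split_words(line, split_on_comma=False)
```

The same line yields four words or three depending on how the uncertain space is treated, so the resulting word list depends on choices like this one.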
oshfdk > 29-11-2024, 02:38 PM
(29-11-2024, 01:41 PM)Mauro Wrote: Sure, I can publish the word lists (and the raw outputs of the various grammars). Right now I have them in Excel files; would it be okay if I upload them to Google? (I ask because Excel files are often viewed with suspicion, since they can contain dangerous macros, but there are no macros in these files.)
Mauro > 29-11-2024, 03:53 PM
(29-11-2024, 02:38 PM)oshfdk Wrote: (29-11-2024, 01:41 PM)Mauro Wrote: Sure, I can publish the word lists (and the raw outputs of the various grammars). Right now I have them in Excel files; would it be okay if I upload them to Google? (I ask because Excel files are often viewed with suspicion, since they can contain dangerous macros, but there are no macros in these files.)
Excel will do, thanks!
I have a question: is my understanding correct that when you quote/compute the coverage you are referring to word type coverage and not word token coverage? I wonder what word token coverage would look like. E.g., from my point of view, a grammar that successfully covers 95% of the text tokens by fully incorporating 60% of the most frequent words has more merit than a model covering 95% of words but missing some of the frequent ones. After all, rare words are much more likely to be scribal errors, ambiguous writing, etc.
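The distinction between type coverage and token coverage can be sketched as follows. This is a toy illustration, not the computation behind any of the posted figures. The `accepts` predicate stands in for "the grammar generates this word" and is hypothetical, as are the sample tokens.

```python
from collections import Counter

def coverage(tokens, accepts):
    """Return (type_coverage, token_coverage) for a list of word tokens.

    `accepts` is a hypothetical predicate: True if the grammar generates the word.
    """
    counts = Counter(tokens)
    covered = [w for w in counts if accepts(w)]
    # Type coverage: fraction of distinct words the grammar accepts.
    type_cov = len(covered) / len(counts)
    # Token coverage: fraction of running text made up of accepted words.
    token_cov = sum(counts[w] for w in covered) / sum(counts.values())
    return type_cov, token_cov

# Toy text: two frequent words the grammar accepts, one rare word it misses.
tokens = ["daiin"] * 6 + ["chedy"] * 3 + ["xar"]
type_cov, token_cov = coverage(tokens, lambda w: w in {"daiin", "chedy"})
```

In this toy case the grammar covers 2 of 3 word types but 9 of 10 tokens. That is the situation described above, where missing a rare word costs little in token terms.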
Mauro > 29-11-2024, 04:15 PM
oshfdk > 29-11-2024, 06:45 PM
Mauro > 29-11-2024, 07:36 PM
(29-11-2024, 06:45 PM)oshfdk Wrote: (29-11-2024, 04:15 PM)Mauro Wrote:
Thank you! I just needed the wordlist to understand what's covered by this grammar, your Excel file should be perfect for that.
I'm not sure I understand the implications of the qualitative analysis you added in one of the previous posts. I'm not very familiar with grammars, I have some basic understanding of how they work in, say, describing programming languages, but I never had to deal with a grammar as a data analysis tool. What kinds of conclusions can be made from the fact that your grammars achieve good scores? Is it possible to somehow identify functional elements or gain some understanding of actual character boundaries (e.g., whether ch, iin, qo should be treated as single entities)?
nablator > 29-11-2024, 07:50 PM
(29-11-2024, 01:41 PM)Mauro Wrote: And surely I can try with the RF1a-n transcription. Just, can you point me to a link? Ideally it should be a single .txt file without any metadata or added remarks (that would save a lot of asinine work).