Labyrinthinesecurity > 07-05-2026, 10:36 AM
oshfdk > 07-05-2026, 10:58 AM
(07-05-2026, 10:36 AM)Labyrinthinesecurity Wrote: The improved grammar (F1=0.242) outperforms Zattera's grammar (F1=0.214) when both are scored on our corpus. However, Zattera reports F1=0.270 on his own filtered corpus (~5,105 types), and our grammar was F1-trained on the test corpus while his was not.
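To make the caveat about differing corpora concrete, here is a minimal sketch. It assumes F1 is computed over word types, with precision as the share of grammar-generated types found in the corpus and recall as the share of corpus types the grammar generates; the thread does not spell out the exact definition. The function name `f1_over_types` and all word lists are hypothetical toy data, not anyone's actual grammar or corpus.

```python
def f1_over_types(generated, corpus_types):
    """F1 between the set of types a grammar generates and the corpus types.

    Assumed definition (not confirmed in the thread):
      precision = |generated & corpus| / |generated|
      recall    = |generated & corpus| / |corpus|
    """
    generated = set(generated)
    corpus_types = set(corpus_types)
    hits = len(generated & corpus_types)
    if hits == 0:
        return 0.0
    precision = hits / len(generated)
    recall = hits / len(corpus_types)
    return 2 * precision * recall / (precision + recall)


# Toy data, made up for illustration only.
grammar_output = {"daiin", "chedy", "ol", "qol", "dol", "chol"}
full_corpus = {"daiin", "chedy", "qokeedy", "shedy", "ol"}
filtered_corpus = {"daiin", "chedy", "ol"}  # same corpus with rare types removed

f1_full = f1_over_types(grammar_output, full_corpus)
f1_filtered = f1_over_types(grammar_output, filtered_corpus)
print(f"F1 on full corpus:     {f1_full:.3f}")
print(f"F1 on filtered corpus: {f1_filtered:.3f}")
```

In this toy case the same grammar output scores higher on the filtered corpus simply because recall rises, which is why F1 values reported on different corpora are hard to compare directly.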
Labyrinthinesecurity > 07-05-2026, 11:19 AM
(07-05-2026, 10:58 AM)oshfdk Wrote: (07-05-2026, 10:36 AM)Labyrinthinesecurity Wrote: The improved grammar (F1=0.242) outperforms Zattera's grammar (F1=0.214) when both are scored on our corpus. However, Zattera reports F1=0.270 on his own filtered corpus (~5,105 types), and our grammar was F1-trained on the test corpus while his was not.
What are the [link]? I don't think F1 is informative when comparing two different grammar mechanics. Whatever "switchable templates" are, they sound like they can capture some information too, leading to visibly better numbers for no actual improvement in the grammar.
Labyrinthinesecurity > 07-05-2026, 10:55 PM
(07-05-2026, 10:58 AM)oshfdk Wrote: (07-05-2026, 10:36 AM)Labyrinthinesecurity Wrote: The improved grammar (F1=0.242) outperforms Zattera's grammar (F1=0.214) when both are scored on our corpus. However, Zattera reports F1=0.270 on his own filtered corpus (~5,105 types), and our grammar was F1-trained on the test corpus while his was not.
What are the [link]? I don't think F1 is informative when comparing two different grammar mechanics. Whatever "switchable templates" are, they sound like they can capture some information too, leading to visibly better numbers for no actual improvement in the grammar.
oshfdk > 07-05-2026, 11:05 PM
(07-05-2026, 10:55 PM)Labyrinthinesecurity Wrote: Currently my best Zattera model, Zat+, beats Loop-Lay in terms of Nbits, but for a coverage of about 80% versus 100% for Loop-Lay. However, given the fat-tailed distribution of Voynich words, with potential scribal errors, 100% coverage does not make much sense. I think we should settle for a much lower coverage and optimize Nbits from there.
Jorge_Stolfi > 08-05-2026, 09:22 AM
(07-05-2026, 11:05 PM)oshfdk Wrote: I really have no opinion about the best coverage. Personally, I think the grammar should reproduce all words (maybe with the exception of those with really rare characters like v), otherwise it's not a grammar of Voynichese, but a grammar of some subset of Voynichese, and all conclusions made based on this grammar should carry this footnote with them.
Quote: I think the main idea of loop grammars was to handle all or almost all words while keeping the grammar size manageable.
oshfdk > 08-05-2026, 10:10 AM
(08-05-2026, 09:22 AM)Jorge_Stolfi Wrote: But what is the purpose of the grammar?
Mauro > 09-05-2026, 09:11 PM
(07-05-2026, 10:55 PM)Labyrinthinesecurity Wrote: (07-05-2026, 10:58 AM)oshfdk Wrote: (07-05-2026, 10:36 AM)Labyrinthinesecurity Wrote: The improved grammar (F1=0.242) outperforms Zattera's grammar (F1=0.214) when both are scored on our corpus. However, Zattera reports F1=0.270 on his own filtered corpus (~5,105 types), and our grammar was F1-trained on the test corpus while his was not.
What are the [link]? I don't think F1 is informative when comparing two different grammar mechanics. Whatever "switchable templates" are, they sound like they can capture some information too, leading to visibly better numbers for no actual improvement in the grammar.
Currently my best Zattera model, Zat+, beats Loop-Lay in terms of Nbits, but for a coverage of about 80% versus 100% for Loop-Lay. However, given the fat-tailed distribution of Voynich words, with potential scribal errors, 100% coverage does not make much sense. I think we should settle for a much lower coverage and optimize Nbits from there.
Labyrinthinesecurity > 11-05-2026, 03:20 PM
(09-05-2026, 09:11 PM)Mauro Wrote: (07-05-2026, 10:55 PM)Labyrinthinesecurity Wrote: (07-05-2026, 10:58 AM)oshfdk Wrote: (07-05-2026, 10:36 AM)Labyrinthinesecurity Wrote: The improved grammar (F1=0.242) outperforms Zattera's grammar (F1=0.214) when both are scored on our corpus. However, Zattera reports F1=0.270 on his own filtered corpus (~5,105 types), and our grammar was F1-trained on the test corpus while his was not.
What are the [link]? I don't think F1 is informative when comparing two different grammar mechanics. Whatever "switchable templates" are, they sound like they can capture some information too, leading to visibly better numbers for no actual improvement in the grammar.
Currently my best Zattera model, Zat+, beats Loop-Lay in terms of Nbits, but for a coverage of about 80% versus 100% for Loop-Lay. However, given the fat-tailed distribution of Voynich words, with potential scribal errors, 100% coverage does not make much sense. I think we should settle for a much lower coverage and optimize Nbits from there.
Which target should be set for coverage is not clear. I experimented with different coverage targets, but with little result. I think 80% is a bit low, but this is just a personal opinion. On the other hand, going above 95% probably captures more noise than data.
Can you publish your improved grammar here? I'd be very interested to see it.
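On the coverage targets discussed above, a small sketch may help show why the number depends on how it is counted. The thread does not say whether coverage is measured over word types or over tokens, so this computes both. The helper `coverage`, the accepted-word set, and the token list are all hypothetical toy data with a deliberately long tail of one-off words.

```python
from collections import Counter


def coverage(tokens, accepts):
    """Return (type_coverage, token_coverage) for a word-acceptance predicate.

    `accepts` stands in for "the grammar generates this word"; here it is
    just a set-membership test, which is an assumption for illustration.
    """
    counts = Counter(tokens)
    covered = [word for word in counts if accepts(word)]
    type_cov = len(covered) / len(counts)
    token_cov = sum(counts[word] for word in covered) / len(tokens)
    return type_cov, token_cov


# Toy corpus: three frequent words plus four words that occur once each.
tokens = ["daiin"] * 6 + ["chedy"] * 3 + ["ol"] * 3 + ["qokeedy", "xar", "oteos", "dchy"]
accepted_words = {"daiin", "chedy", "ol"}

type_cov, token_cov = coverage(tokens, lambda w: w in accepted_words)
print(f"type coverage:  {type_cov:.1%}")
print(f"token coverage: {token_cov:.1%}")
```

With a fat tail like this, a grammar that ignores the one-off words still covers most tokens while covering under half of the types, so a target such as 80% or 95% only means something once the counting unit is fixed.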
Mauro > 11-05-2026, 06:12 PM