28-04-2024, 07:35 AM
* Contrarian view #1: Don't bother sweating the "weirdos".
Having built scripts to convert both the L-Z EVA-based transcription and the v101 transcription to the Currier alphabet, my recollection is that the fraction of glyphs for which there is not an unambiguous Currier equivalent (at least with regard to the running text in the initial herbal quires and the bio section) is roughly half a percent. I'll double check, but I'm pretty sure that's right (which is not to say that the transcriptions agree with each other at that level). That means "basic EVA"/Currier/just the vanilla ASCII bits of v101 captures roughly 199 out of every 200 glyphs. That should be good enough to read the text (if there is a text to read) -- and if it's not, then I would argue that there's no point in worrying about it.
To be clear, this is a pragmatic claim, not a theoretical one. If the question is "is it possible that reading the text requires capturing every nuance of every 'weirdo' in the text?", then I have to agree that yes, abstractly it is possible. The text could be generated in some way that carries state such that unless we capture all the weirdos we'll fail in trying to read it. I don't think I've ever seen anyone make a compelling case that the bulk statistics of the text make that likely, but it's possible.
Pragmatically, if that's the case then I think that without some additional side channel of information -- finding a "bilingual" document enabling a known plaintext attack, for instance -- we might as well throw in the towel. Which makes investing large amounts of effort in encoding "weirdos" (as opposed to just marking them with something like the Currier alphabet's '*' "here be a dragon" character) an unproductive use of time. Which means "basic EVA"/Currier/just the vanilla ASCII bits of v101 should be good enough.
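For what it's worth, the convert-with-a-fallback approach I'm describing can be sketched in a few lines. The mapping below is a tiny illustrative subset, not the real conversion table (the 'o' → 'O' pair in particular is assumed for illustration), and anything outside it gets the Currier '*' treatment:

```python
# Sketch: map a "basic EVA" glyph stream to Currier, marking anything
# without an unambiguous equivalent with Currier's '*' character, and
# report what fraction of glyphs had to be marked.
EVA_TO_CURRIER = {
    "d": "8",   # EVA 'd' ~ Currier '8'
    "y": "9",   # EVA 'y' ~ Currier '9'
    "q": "4",   # EVA 'q' ~ Currier '4'
    "o": "O",   # assumed correspondence, for illustration only
}

def to_currier(eva_glyphs):
    out = []
    unmapped = 0
    for g in eva_glyphs:
        c = EVA_TO_CURRIER.get(g)
        if c is None:
            out.append("*")   # "here be a dragon"
            unmapped += 1
        else:
            out.append(c)
    frac = unmapped / len(eva_glyphs) if eva_glyphs else 0.0
    return "".join(out), frac
```

Run over the running text of the herbal and bio sections with the full table, this kind of tally is where the "roughly half a percent" figure comes from.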
That's not the same thing as saying that there isn't room for argument over whether "basic EVA" (for instance) is capturing the right equivalence classes of groups of ink strokes. I've seen people claim that whether an 'a' is closed at the top or not matters, for example -- but that's a different issue.
* Contrarian view #2: For the sake of all that's good and bright and beautiful in the universe, can we please, please, please stop using EVA?
While I have never loathed EVA with the blazing white-hot passionate hatred that Glen Claston did (and anyone who thinks I'm exaggerating can go read his Voynich mailing list remarks on the subject), I just don't see the argument for "why EVA?". Granting the premise that there is value in an "analytic" transcription that is neutral about how to read the ligatured gallows or word-final i*<x> sequences, I fail to see why EVA is that transcription -- and in particular, I see no reason to prefer it to Frogguy:
1) I have never understood the virtue of prioritizing making the transcription pronounceable over visual resemblance to the script. I mean, sure, a 'd' kind of looks like an '8' with the upper loop squished, and a 'y' kind of looks like a '9' without a closed top loop, and a 'q' kind of looks like a '4' written by someone who hates corners, but...why? According to the page describing EVA, it's to help make common words easy to recognize and remember. I suppose this is one of those "your mileage may vary" things.
2) In fact, the pronounceability of EVA has had the unfortunate effect that a non-trivial number of naive newcomers to MS 408 think there is actual significance to the phonetic values in the EVA transcription scheme. I realize that the people behind EVA didn't intend that, and are explicit in various places in making clear it isn't the case, but if someone just grabs a transcription file without "reading the manual", that doesn't help.
3) The clear advantage of Frogguy is that the learning curve is truly minimal. The gallows, for example, are 'lp', 'qp', 'lj', and 'qj' -- and anyone who has seen the actual text should immediately grok which is which...
4) As Rene says in the EVA description, "It is very important to point out that Eva is not attempting to identify semantic units in the text. It simply represents in an electronic form the shapes that are seen in the MS. It is left to a later step by analysts to decide which combinations should be seen as units." If you're going to have to transform the transcription to do meaningful analysis anyway, why not start from something that maximizes the fluency of transcription, with a lower learning curve (and probably a lower transcription error rate)?
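That "later step" of deciding which combinations are units is itself mechanical once you've picked a unit list. A minimal sketch, using greedy longest-match over a few real EVA letter sequences (the choice of which sequences count as units is illustrative, not a claim about the right analysis):

```python
# Sketch: group certain EVA letter sequences into single analysis units
# via greedy longest-match. The unit list is an illustrative example;
# deciding what actually belongs on it is the analyst's job.
UNITS = ["cth", "ckh", "ch", "sh"]  # longest first, so 'cth' wins over 'ch'

def tokenize(eva_word):
    tokens, i = [], 0
    while i < len(eva_word):
        for u in UNITS:
            if eva_word.startswith(u, i):
                tokens.append(u)
                i += len(u)
                break
        else:
            # no multi-glyph unit matches here; emit a single glyph
            tokens.append(eva_word[i])
            i += 1
    return tokens
```

The point being: since this transformation has to happen anyway before analysis, the surface alphabet you transcribe in is largely a matter of transcriber ergonomics.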
I think that's probably enough of me being a curmudgeon for the evening...
Karl
(PS, coming soon -- the Midsomer Murders MS 408-themed fanfic you never realized you needed. When a visitor researching a possible connection between Midsomer and the mysterious Voynich manuscript is found murdered at a Voynich-inspired spa & herbal treatment center, Winter and Barnaby have to decode the killer's motive before there are more deaths. How many more victims will die before they succeed in...Deciphering Murder?)