Merrimacga,
Thanks for putting so much thought into this. I think it is more efficient if I comment on a few things (prompted not only by your message but by others' too).
1. First and foremost, one should not get too wound up in the details of voting and scoring.
The identity of these plants -- if they even represent real plants at all -- is not going to be decided by popular vote. That aspect of the application is essentially for entertainment purposes and to engage a wider audience by leveraging a basic aspect of psychology -- people love to express their opinions when it implies they know something that others don't.
2. Being familiar with plants and identifying them from pictures is something that 'ordinary' people can do -- it does not require a degree in botany or expertise with ancient manuscripts. That fact seems to offend some people who consider themselves experts in ancient manuscripts or botany, but we should not confuse a possible recognition of (or a popular opinion about) a plant depicted in the manuscript with its implication regarding the manuscript's author, origin, or purpose. There are nearly 400,000 known plant species and literally millions of people participating in special interest groups around the topic of plant identification and botanical interests. (My own list of groups so far includes roughly 1.5 million people.) The chances that one of these illustrations can be recognized by someone outside the tiny community of Voynich enthusiasts and professional paleographers are not insignificant.
3. Defining how to quantify "consensus" is a topic that gets attention quite independently of this app because it has applications in a number of the social sciences. My definition is one to which we've given a fair bit of thought, but it is not perfect because none is. It is explained on the About page to the level warranted for this application (without mathematical expressions and details that may confuse the typical user). And, given its purposes here, I'm not sure it is worth refining more.
4. That being said, I understand the attraction of getting into the nitty gritty of calculating scores to try to perfect them. I can't help doing that myself! So I appreciate your thoughts and suggestions on this.
In an earlier message, I explained the scoring in a bit more detail, and as far as I can see, the examples you mention are being calculated without error. (Note that when I said "implicit" vote, I meant it is just not explicitly a vote seen in the Agree/Disagree columns.) So take your f31r-a example and the T.C. Petersen proposal:
- There is one explicit vote in agreement seen in the Agree column
- There is one implicit vote in agreement by virtue of the proposer proposing it (and which remains so long as they do not delete their proposal)
- There are 3 implicit votes of disagreement by virtue of other proposers proposing something else
- So there are 5 opinions expressed in total. Two of those are in agreement. Hence the proposal score is 2/5 = 40%.
- This is the proposal with the highest score -- the others all have 25%. So it exceeds the others by 15 percentage points (40% - 25%), which is therefore the consensus score on the illustration overall.
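For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation above. It is not the application's actual code. The function names and the three placeholder proposals are made up for illustration, and treating the consensus score as the margin between the top score and the runner-up is my reading of this one example (here all the other proposals happen to tie). It also ignores the case where a proposer additionally votes on other proposals.

```python
from fractions import Fraction


def proposal_score(explicit_agree, explicit_disagree, n_proposals):
    """Share of all expressed opinions that agree with one proposal."""
    # The proposer counts as one implicit vote in agreement.
    agree = explicit_agree + 1
    # Every other proposer counts as one implicit vote in disagreement.
    disagree = explicit_disagree + (n_proposals - 1)
    return Fraction(agree, agree + disagree)


def consensus_score(scores):
    """Margin of the top proposal over the runner-up (assumed definition)."""
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) < 2:
        # The single-proposal case is not covered by the example in the post.
        raise ValueError("need at least two proposals")
    return ranked[0] - ranked[1]


# Explicit (agree, disagree) votes per proposal for f31r-a.
# Only "T.C. Petersen" is named in the post; the rest are placeholders.
explicit_votes = {
    "T.C. Petersen": (1, 0),
    "other proposal 1": (0, 0),
    "other proposal 2": (0, 0),
    "other proposal 3": (0, 0),
}

scores = {
    name: proposal_score(agree, disagree, len(explicit_votes))
    for name, (agree, disagree) in explicit_votes.items()
}
margin = consensus_score(scores)

for name, score in scores.items():
    print(f"{name}: {float(score):.0%}")
print(f"consensus: {float(margin):.0%}")
```

Using Fraction keeps 2/5 and 1/4 exact, so the 15-point margin comes out without floating-point rounding.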
I won't go further into the detailed reasoning behind my exact formulae, but the essential assumptions are listed below. (Note that these are arbitrary. Alternative assumptions may be just as good or better, and as time permits, I'll think more about your suggestions.)
- One can find enough merit in more than one proposal to agree with each of them. (So it is not a vote for THE answer, it is a vote that one agrees that a proposal *might* be wholly or partially correct.)
- Proposing a plant ought to carry a bit more weight than just voting to agree with one, and it does so asymmetrically: the proposer presumably believes their proposal has merit that outweighs that of the other proposals. Consequently, a proposal is treated as implicitly disagreeing with every alternative proposal. (This one is *very* arbitrary, but it allows a certain mathematical continuity to the scoring.)
- A proposer can still additionally express their opinion for or against the merit (+/-) of other proposals by voting on them (and so effectively counter the implications of the previous assumption).
5. Regarding multiple proposals or votes on a particular plant by the same individual...
As a registered user (logged in), one can only make a single proposal on any one plant. If one is just a guest user (not logged in), the only practical way to limit them from making multiple proposals is through their IP address. One can easily circumvent this, but again -- we can't take this application too seriously. Such controls are totally inadequate for something that matters (one cannot run political voting by identifying voters by IP, for example), but they are adequate for this application.
A guest user can only make a proposal, by the way. They cannot vote to agree/disagree, nor leave comments.
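To make the one-proposal-per-plant rule concrete, here is a rough sketch of how such a check could work. This is only an illustration under my own assumptions, not the application's code: the class name, method name, and in-memory set are hypothetical, and a real deployment would keep this in the database.

```python
class ProposalRegistry:
    """Tracks who has already proposed on which plant (illustrative only)."""

    def __init__(self):
        # Set of (plant_id, identity) pairs that have already proposed.
        self._seen = set()

    def try_propose(self, plant_id, user_id=None, ip=None):
        """Return True if the proposal is accepted, False if it is a repeat."""
        if user_id is not None:
            identity = ("user", user_id)   # registered user: keyed by account
        elif ip is not None:
            identity = ("ip", ip)          # guest: keyed by IP, easy to circumvent
        else:
            raise ValueError("need a user_id or an ip")
        key = (plant_id, identity)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


registry = ProposalRegistry()
print(registry.try_propose("f31r-a", user_id="alice"))   # first proposal by this user
print(registry.try_propose("f31r-a", user_id="alice"))   # repeat on the same plant
print(registry.try_propose("f31r-a", ip="203.0.113.7"))  # guest, first proposal
```

The guest key is only as reliable as the IP address, which is exactly the weakness mentioned above.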
6. I am aware of Ranker, but note that it is not used for quite the same purposes as here. The Ranker votes are generally on subjective topics ("what's the best....") where finding which item people like the most is the end goal and value. In this app, the item with the most votes (the "opinion of the crowd") might seem to be the actual goal (and that might indeed drive participation and input of ideas from the crowd), but as I said before, it won't actually decide the true identity of a plant illustration. The hope is that it will simply stimulate more candidate plants and highlight which possible real-world plants get greater recognition from more people regarding the similarity of their features to those in the illustration. The implicit goal is not the result of the consensus scores/ranking, or the actual "opinion of the crowd" (as it is with Ranker), but rather the generation of ideas and candidates (regarding just one of several different aspects of the manuscript) for comprehensive analysis and integration into the broader research on the document.
7. I can't thank you enough for the lists of resources that you have provided. I do plan to list them on the About page -- the couple of resources currently there are not intended to be final. They were just initial placeholders for the topic of "Additional Resources" and we wanted to get on with deploying a beta to begin getting some feedback.
8. Regarding my own background, I have my own consulting LLC, but beyond borrowing an email account to facilitate communications, this application is unrelated to any consulting or services. The application, and a few other projects, are being supported through my private research entity, QuantumLynx Research. I haven't put additional information on my own background because it isn't relevant and I am not the only person involved anyway. Also, since this project is not for profit but will eventually require a bit of funding to support it, it is important to disassociate from any commercial concerns as much as possible. Besides the costs to develop the application so far, it costs a small amount to maintain the production servers and database and, as stated on the site, any donations will go only toward this application and keeping it available. I will personally cover those costs for a while, but at some point the application will depend on being funded. (By the way, the Voynich images you see on my website are simply the Beinecke hi-res images, and they are from an unrelated project that I did some time ago.)
Again, thanks for your input and suggestions.