The Voynich Ninja


I have a question about the ID score. In the identification to the [link] I have an ID score of 0%. Can this be?

(16-09-2023, 06:24 PM)bi3mw Wrote: I have a question about the ID score. In the identification to the [link] I have an ID score of 0%. Can this be?

You are correct that your score cannot be zero -- it must be >0% and <=100%.
I do not see this on your proposal though. It is at 33% right now (due to your proposal + one positive vote and 4 implicit votes against due to the other 4 proposals.)
But your seeing 0% may have been due to a slow score update/re-calculation (or a faulty trigger to update). We'll keep an eye out for this happening again in case it is a bug.
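
The 33% figure is consistent with a simple ratio of agreeing votes to all votes cast on a proposal. A minimal Python sketch, assuming that ratio is the formula (the thread gives only the vote counts and the result; the function name is hypothetical):

```python
def proposal_id_score(votes_for: int, votes_against: int) -> float:
    """Share of agreeing votes among all votes counted for one proposal.

    Assumption: score = for / (for + against). This is inferred from the
    numbers quoted in the thread, not taken from the app's source.
    """
    total = votes_for + votes_against
    if total == 0:
        raise ValueError("a proposal carries at least the proposer's own vote")
    return votes_for / total


# f34v-a as described above: the proposer's implicit vote plus one explicit
# vote for, and one implicit vote against from each of the 4 other proposals.
score = proposal_id_score(votes_for=2, votes_against=4)
print(f"{score:.0%}")  # 33%
```

Under this reading the score can never reach 0%, because the proposer's own vote always counts in favour.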

Ok, now it fits. However, the value (Consensus) in the gallery under the other plants is often 0%. Shouldn't something else also be written there most of the time? Or is it only counted when a vote is submitted (> 0)?

[attachment=7582]

I wonder what they are trying to achieve with this.
Two of the three authors mentioned already fail the basic requirements (not from America, because of the C-14 analysis), and the third treats the whole globe as a possibility.
So everything is wrong from the start.
It took years to point out such details to people.

(17-09-2023, 10:22 AM)bi3mw Wrote: Ok, now it fits. However, the value (Consensus) in the gallery under the other plants is often 0%. Shouldn't something else also be written there most of the time? Or is it only counted when a vote is submitted (> 0)?

Good question.
The calculation for the Consensus score is an arbitrary one (as there is no official standard way to quantify consensus). Its definition and that of the individual Proposal ID Scores are explained on the About page. And I agree, it is imperfect.
The Consensus score becomes non-zero once there are at least two proposals and one of them shows a greater agreement than the other(s).
I think this makes sense when there is more than one proposal and they are all "tied" --- since, in that situation, there is no one answer that people tend to agree is best. And these are the cases where you are seeing 0%.

But your concern brings up some good points...
When there is only a single proposal, then the Consensus score takes on the same value as the Proposal ID Score of that proposal. In this case, unless there is at least one vote in disagreement with that proposal, then the score jumps to a full 100%. Now if a million people agree with the only proposal on the table and no one disagrees, it does seem okay to say that the consensus is essentially "100%". But what if there are only a few people expressing any opinion on the matter and they all agree? Or what if there is only a single person making any proposal at all and no one disagrees with it? Do we call that a 100% consensus? Arguably not.

But, as a practical matter, we need to decide what population has a stake in determining consensus. The only answer that I can come up with is this: Out of all the population (of the world) available, we can only consider those who have expressed *some* opinion on the matter. If that is just a single person saying they agree, and no one is expressing a disagreement, or even proposing an alternative, then it seems we have no choice but to call that a 100% consensus. And further agreements don't change that answer. Only once there is some expression of disagreement does consensus fall below 100%.

I hope this makes sense.
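
The rules described in this post can be sketched in Python. The single-proposal and tie cases follow the description above. The exact rule for several untied proposals is not given in the thread, so the sketch assumes it is the lead of the top proposal over the runner-up, which matches the 40% / 25% / 15% figures reported for f31r-a later in this thread. The function name is hypothetical.

```python
def consensus_score(id_scores: list[float]) -> float:
    """Consensus for one plant image, given its Proposal ID Scores (0..1)."""
    if not id_scores:
        return 0.0  # assumption: no proposals means no consensus
    if len(id_scores) == 1:
        # Single proposal: consensus takes the value of that proposal's score.
        return id_scores[0]
    ranked = sorted(id_scores, reverse=True)
    # Assumption: with several proposals, consensus is the gap between the
    # best-supported proposal and the runner-up, so a tie gives exactly 0.
    return ranked[0] - ranked[1]


print(consensus_score([1.0]))               # 1.0  (one proposal, no disagreement)
print(consensus_score([0.25, 0.25, 0.25]))  # 0.0  (all proposals tied)
print(consensus_score([0.5, 0.25]))         # 0.25 (one proposal leads)
```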

(17-09-2023, 04:05 PM)asteckley Wrote: When there is only a single proposal, then the Consensus score takes on the same value as the Proposal ID Score of that proposal. In this case, unless there is at least one vote in disagreement with that proposal, then the score jumps to a full 100%. Now if a million people agree with the only proposal on the table and no one disagrees, it does seem okay to say that the consensus is essentially "100%". But what if there are only a few people expressing any opinion on the matter and they all agree? Or what if there is only a single person making any proposal at all and no one disagrees with it? Do we call that a 100% consensus? Arguably not.

You could show the total number of proposals alongside the Consensus score percentage, much like a rating that is shown together with the number of people who rated it.

(16-09-2023, 03:50 PM)asteckley Wrote: To add a bit more detail to how it was done:

We...invoke AI Models (e.g. ChatGPT) to augment each one with scientific names, other common names, ranges, etc....

First: great idea for a project, and I look forward to seeing it develop.

Second: do not -- do not, do not, DO NOT! -- trust any factual assertion made by ChatGPT/etc. without independently confirming its correctness. ChatGPT fabricates "facts" far too frequently to trust without verification. If you don't check what it's telling you, you risk spreading misinformation, which is the last thing the study of MS 408 needs more of...

Karl

(17-09-2023, 07:40 PM)kckluge Wrote:
(16-09-2023, 03:50 PM)asteckley Wrote: To add a bit more detail to how it was done:

We...invoke AI Models (e.g. ChatGPT) to augment each one with scientific names, other common names, ranges, etc....

First: great idea for a project, and I look forward to seeing it develop.

Second: do not -- do not, do not, DO NOT! -- trust any factual assertion made by ChatGPT/etc. without independently confirming its correctness. ChatGPT fabricates "facts" far too frequently to trust without verification. If you don't check what it's telling you, you risk spreading misinformation, which is the last thing the study of MS 408 needs more of...

Karl

Lol.. Thanks for waving that red flag Karl, and you are right, in general, to do so.

I am in fact an AI professional who has been outspoken on that very subject -- particularly when LLMs got so much public attention after ChatGPT was first released. At that time, I was especially concerned that people -- even some prominent, well-known experts -- were not understanding the limitations of LLMs, which are ultimately just "statistical predictors of the next most likely word" (to put it in simple terms).
ChatGPT and other LLMs can indeed hallucinate and they cannot reason (again, despite the claims of some very well-respected people like Andrew Ng and Geoffrey Hinton), and they should never be used to produce text whose factual accuracy must be depended on.

That is not how it is used in relation to this application, however. Both the devil and the angel are in the details.

First, the tendency for an LLM to hallucinate is greatly reduced if you deal with short questions whose answers are both short and highly likely to be found in the training corpus. (By short, we mean hundreds as opposed to thousands of tokens.) The more likely the correct answer exists in an explicit form in the data that was used to train the model, the more likely the model will "predict" the right words of that answer and provide a correct response. But the more likely that the facts of what you are asking for are unavailable to it or must be synthesized from a lot of data spread across large portions of source text, the more likely the model will simply cobble together words to generate something that is just made up -- that is, "hallucinate".

Second, the use of it here is to provide some additional, non-critical, information to the data from historical researchers. And the important adjective there is "non-critical". (The consequence of a piece of data being factually wrong is minor and inconsequential to the overall purposes.) We actually inspected the quality of the information it was providing for our purposes. It is not always correct or optimal. (Ironically GPT-4 often produces less useful answers than GPT-3.5). However, it was apparent that the quality of the information that it added far exceeded the quality of the proposed information from historical researchers, which was often misspelled, incomplete, or just referred to plant names that do not exist. So including it to provide background information on what a historical researcher had proposed was a net plus. (And provided a pragmatic starting point from which corrections can be reported and made.)

This, by the way, is one reason why, within the application itself, to provide helpful information to a person making a proposal, we have chosen not to use an LLM. Instead, we have provided a button that they can use to check plant names against a conventional database -- the Global Biodiversity Information Facility. And they can opt to ignore or adopt that information. (This is found on the proposal form when one clicks "Do you recognize this plant?").
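
For readers curious what a check against GBIF can look like, here is a minimal Python sketch using GBIF's public species-match endpoint. It is an illustration only: the thread does not say how the app calls GBIF, the helper names are hypothetical, and the plant name in the usage note is an arbitrary example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public GBIF name-matching endpoint (v1 of the GBIF API).
GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"


def build_match_url(name: str) -> str:
    """Build a species-match query for `name`, restricted to plants."""
    query = urlencode({"name": name, "kingdom": "Plantae"})
    return f"{GBIF_MATCH_URL}?{query}"


def check_plant_name(name: str, timeout: float = 10.0) -> dict:
    """Ask GBIF for its best match for a proposed plant name.

    Requires network access. Returns a few fields of the JSON response;
    a field is None if GBIF did not include it.
    """
    with urlopen(build_match_url(name), timeout=timeout) as response:
        result = json.load(response)
    return {
        "match_type": result.get("matchType"),        # e.g. EXACT, FUZZY, NONE
        "scientific_name": result.get("scientificName"),
        "confidence": result.get("confidence"),
    }
```

Calling `check_plant_name("Viola tricolor")` would return GBIF's best match for that name, which the person making the proposal could then adopt or ignore.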

(17-09-2023, 06:06 PM)Juan_Sali Wrote: You could show the total number of proposals alongside the Consensus score percentage, much like a rating that is shown together with the number of people who rated it.

That's not a bad idea Juan. The total number of people who propose and vote is apparent from the proposal table, but putting the number of proposals in the gallery beside the consensus score of each plant image is helpful. I have made that change in the app.

Thanks!

I can see a number of possible issues with the ranking system.

First, judging by what I have read here on this site and especially if you add aspect ranking in addition to the overall ranking of each plant (as per my earlier suggestion), a lot of people who propose plants would not only vote agree on the plant they propose but would do so on others as well, perhaps including ones already listed prior to their proposal. In which case, to have a vote for one's proposal count as a simultaneous vote against all other proposals is flawed at best. The more transparent your methodology, the more trust it will engender and the more it will encourage participation as well as trust in the eventual results reporting. It would be more transparent and more accurate to either (a) only count a person's proposal as their vote for their plant without counting it as a vote against all other plant proposals OR (b) not assign any votes by the person proposing at the time they submit a plant proposal (they could then vote for it, and against others if they so choose, after submission).

The current methodology raises questions: Once a person proposes a plant, could they then vote agree on any of the other pre-existing plant IDs for the same VM plant and, if so, what happens to their implicit votes recorded at the time of proposal? Additionally, should other plant IDs for the same VM plant be added later, does the system implicitly count each previous proposer as a vote against the newer plant ID and would each previous proposer then be able to vote for the new plant ID?

Using bi3mw's plant ID proposal on VM plant f34v-a as an example, as of this post, you can still see 2 votes for and 0 votes against that ID and no votes on any of the other proposed IDs. In your follow-up to bi3mw's ID score question, you noted the 2 votes as being 1 for the proposal and 1 additional vote for, implying that the implicit vote for by the proposer will appear in the vote counts. If that is true, then why isn't there a vote for shown on each of the other 4 plant IDs? Is this because they were part of the original database build and those have no implicit votes? And why aren't bi3mw's implied votes against the other plant IDs shown either? This goes towards the question of transparency.

Using VM plant f31r-a as another example, there are 4 plant IDs, with all appearing to be part of the original database build, with 1 vote for the first ID, no other votes for, and no votes at all against. Yet, the ID scores are 40% for the first one and 25% each for the others and a consensus score of 15%. I'm wondering how, if at all, implied votes affected this and also how the total percentage in the proposal ID scores can exceed 100%. On that last part, I'm guessing that original ID scores don't update unless and until there are votes for or against them shown. Is that true?
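
The f31r-a figures can be reproduced if implicit votes are counted for the original entries as well. The Python sketch below assumes that every proposal carries one implicit vote for itself and one implicit vote against it from each rival proposal, that the displayed counts for the original entries exclude those implicit votes, and that the consensus is the lead of the top score over the runner-up. None of this is confirmed in the thread; the function name is hypothetical.

```python
from fractions import Fraction


def id_scores(explicit_for: list[int], explicit_against: list[int]) -> list[Fraction]:
    """Score every proposal for one plant image, counting implicit votes."""
    n = len(explicit_for)
    scores = []
    for votes_for, votes_against in zip(explicit_for, explicit_against):
        total_for = 1 + votes_for                # proposer's implicit vote for
        total_against = (n - 1) + votes_against  # one implicit vote against per rival
        scores.append(Fraction(total_for, total_for + total_against))
    return scores


# f31r-a as described above: 4 IDs, one explicit vote for the first, none against.
scores = id_scores(explicit_for=[1, 0, 0, 0], explicit_against=[0, 0, 0, 0])
print([f"{float(s):.0%}" for s in scores])  # ['40%', '25%', '25%', '25%']

top, runner_up = sorted(scores, reverse=True)[:2]
print(f"{float(top - runner_up):.0%}")      # 15%
```

Under these assumptions each score is a ratio over that proposal's own votes, not a share of a common pool, so the scores for one plant image need not add up to 100%.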

In addition to these questions pertaining to a single ID proposal being made, can a person propose multiple plant IDs for a single VM plant? If so, what happens to their implied votes for and against on each of their proposed plant IDs?

Moving on to other votes, can a person who is just voting on, as opposed to proposing, each plant ID of each VM plant vote more than once (i.e. change their opinion)? If so, what happens to their previous votes?

Also, I know this is currently in a beta phase. Presumably, that means eventually the application will be ready for widespread voting once all bugs have been fixed and any other corrections and changes have been made. At that point, will you freeze the plant ID options so no further proposals can be made? If so, will it then still be possible to add more information and references to each ID to further aid in selection? I ask because as long as more can be added, the more likely it will be that people will change their votes multiple times. That is not necessarily a bad thing, depending on how long you can or want to keep the application running and how the application handles re-ranking by voters.

I almost hate to come back to Ranker again - it is SO social media focused - but since that site is also crowdsourced opinion ranking and well established (since 2009), they have good examples you can review that address some of the questions I have noted above in case you want to consider some related changes while you're still in the beta phase. Although Ranker simply notes they have an algorithm they use and their explanations about it are rather vague, you may still be able to glean something from both their help info and sample ranking lists. Below are some links you may find useful. Try to ignore the content and focus on the features. All Ranker lists active for voting permit users to change their votes. Note: I have excluded lists from the samples that were articles without voting, or lists whose voting is closed, or lists that are active but were intentionally launched with a set list of items.

List: International Mysteries We Really Want Solved (June 2022)
The only sample list that included the Voynich Manuscript (ranked 4 of 9). 4.0K votes, 984 voters, 32.0K views. Still an active list for voting.

List: The Best Grass for Dogs (March 2019)
One of two sample lists that can be added to. 11 items, 842 votes, 453 voters. Still an active list for voting. Option to add a new item is at the bottom of the list. Also, it hasn't been re-ranked yet but does include a button to re-rank the list one's own way, which provides a pop-up so one can drag each item to re-sort it to a new ranking. That ranking would then appear with one's username in a dropdown button on the page. Such a feature, if possible, might not be needed in the Voynich Garden, unless any VM plants eventually contain a large number of plant IDs. Note the demographic filter buttons and search field at the top. These don't appear on all Ranker lists but do on a lot of them. You may not want to consider the demographic filter either, even if possible, but I would imagine even to consider it for results reporting, let alone for user filtering, you would need to capture that data in registered user profiles in Voynich Garden.

List: The Best Plant Nursery Websites (December 2019)
The second sample that can be added to, option to add at the bottom of the list. 37 items, 1.3K votes, 560 voters. Still an active list for voting. Not yet re-ranked but button at top along with demographic filter buttons and a search field.

List: The Best Trees in Fiction (July 2019)
15 items, 1.4K votes, 371 voters. Still active for voting. Not yet re-ranked but rank your way button and demographic buttons at top, though no search and list can't be added to.

List: Tasty Root Vegetables (June 2023)
36 items, 7.9K votes, 1.8K voters, 156.0K views. Still active for voting. Not yet re-ranked but rank your way button, demographic buttons and search at top, though list can't be added to.

List: The Tastiest Vegetables Everyone Loves Eating (June 2023)
58 items, 186.1K votes, 13.3K voters, 417.7K views. Still active for voting. Rank your way button, re-ranking dropdown, demographic buttons and search at top but list can't be added to.

On a separate note, it couldn't hurt, and may even help participation (and maybe even get you some donations - yes, I noticed that link), if you could add a little about yourself and your background on the About page. Your primary business site is [link], right? Perhaps you could include a link to that on the About page as well. I noticed you have a Voynich project listed there that looks like it contains the base images you used to create the VM plant images in the Voynich Garden site.
