The Voynich Ninja

Full Version: Need advice for testing of hypotheses related to the self-citation method
(06-07-2025, 05:33 PM)Mauro Wrote: But it's true that, many times, the probabilities of the evidence given H and given notH are also hard to pin down, even broadly. There are, however, some methods that can be used (e.g. Laplace's rule of succession, the reference class method) to minimize subjectivity, but this is not the place to discuss them (nor Bayesian logic in general).

I was under the impression that this thread is about testing hypotheses using Bayesian logic, so this seems on topic to me?

(06-07-2025, 05:33 PM)Mauro Wrote: To apply Bayes one does not need a prior probability at all. Indeed, discriminating between prior and evidence is imho unnecessary and misleading. Start from a condition of absolutely zero knowledge, that is to say P(H) = 0.5 and P(notH) = 0.5. This is the only 'prior' one needs, and it's trivial. Then start factoring in every single piece of knowledge you have: they are all evidence now.

I'm not very familiar with Bayesian logic, nor have I ever used it in practice, as far as I can remember.

Suppose I make a small mistake of 1 percentage point in assessing each of the 100 individual pieces of knowledge I have. Wouldn't the error after applying the rule 100 times become enormous?
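A rough way to check this concern numerically is sketched below. It assumes, purely for illustration, that all 100 pieces of evidence are in truth neutral (probability 0.5 under both H and notH), that they are conditionally independent, and that the only mistake is a 0.01 error in the assessed P(E|H). The function name and all numbers are hypothetical, not taken from the thread.

```python
import math
import random

def posterior_from_likelihoods(p_e_given_h, p_e_given_not_h, prior=0.5):
    """Posterior P(H | all evidence), assuming the pieces of evidence are
    conditionally independent given H and given notH (an assumption)."""
    log_odds = math.log(prior / (1.0 - prior))
    for ph, pn in zip(p_e_given_h, p_e_given_not_h):
        log_odds += math.log(ph / pn)  # add the log likelihood ratio
    return 1.0 / (1.0 + math.exp(-log_odds))

n = 100
true_h = [0.5] * n       # hypothetical "true" P(E_i | H)
true_not_h = [0.5] * n   # hypothetical "true" P(E_i | notH)

# Case 1: every assessment is off by +0.01 in the same direction.
biased_h = [p + 0.01 for p in true_h]

# Case 2: every assessment is off by 0.01, but in a random direction.
rng = random.Random(0)
noisy_h = [p + rng.choice([-0.01, 0.01]) for p in true_h]

p_true = posterior_from_likelihoods(true_h, true_not_h)
p_biased = posterior_from_likelihoods(biased_h, true_not_h)
p_noisy = posterior_from_likelihoods(noisy_h, true_not_h)

print(f"no error:         {p_true:.3f}")
print(f"systematic error: {p_biased:.3f}")
print(f"random error:     {p_noisy:.3f}")
```

Under these assumptions, errors that all lean the same way compound multiplicatively and can move the result a long way from 0.5, while errors in random directions partly cancel. How real assessment errors behave is a separate question that this sketch does not settle.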

What is more, I'm not sure it's easy to apply the rule if you have 100 pieces of knowledge that may depend on one another in various ways. If I'm not mistaken, for 100 pieces of knowledge you'll need pairwise conditional probabilities to get a proper result, and an exponential increase in complexity?
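The complexity question can be made concrete with a small count, assuming each piece of knowledge is a binary observation (an assumption not made in the post). The sketch compares how many numbers one would have to assess per hypothesis under three modelling choices; the function names are hypothetical.

```python
def params_independent(n):
    """One P(E_i | H) per piece of evidence, if the n binary pieces are
    treated as conditionally independent given the hypothesis."""
    return n

def params_pairwise(n):
    """One number per piece plus one joint probability per pair.
    This only captures pairwise dependence, not higher-order dependence."""
    return n + n * (n - 1) // 2

def params_full_joint(n):
    """A fully general joint distribution over n binary pieces has 2**n
    outcomes; the probabilities sum to 1, leaving 2**n - 1 free numbers."""
    return 2 ** n - 1

n = 100
print("independent:", params_independent(n))
print("pairwise:   ", params_pairwise(n))
print("full joint: ", f"{params_full_joint(n):.3e}")
```

So pairwise probabilities grow quadratically, and only a fully general dependence structure grows exponentially. In practice one would have to decide which dependencies matter, which is itself a judgment call.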

I don't know if Bayesian logic is of much use in cases where there is no numeric empirical data.
(06-07-2025, 05:33 PM)Mauro Wrote: To apply Bayes one does not need a prior probability at all. Indeed, discriminating between prior and evidence is imho unnecessary and misleading. Start from a condition of absolutely zero knowledge, that is to say P(H) = 0.5 and P(notH) = 0.5. This is the only 'prior' one needs, and it's trivial.

But that is not "no prior", it is "prior prob of hoax = 0.5".

We would have absolutely zero knowledge if hypothesis H was "the Hraxx is a foobar" and we have no idea of what those terms mean.  But even then, if the other party tells us that "note that, if it is not a foobar, it must be either a quxqux or a blooop", then should we set Prob(H) = 1/3?

But we do have a lot of knowledge about the issue.  Everything we know about the VMS and its history, manuscripts from that time, the world as it was then, what people knew or could have known, how forgers and their marks would probably think and act...  That is what makes each of us have a different prior Prob(H) - namely, the probability we had assigned to H before we knew the information A provided by the T&T paper.   That is what must go into Bayes's formula for Prob(H|A).
(06-07-2025, 10:35 PM)Jorge_Stolfi Wrote:
(06-07-2025, 05:33 PM)Mauro Wrote: To apply Bayes one does not need a prior probability at all. Indeed, discriminating between prior and evidence is imho unnecessary and misleading. Start from a condition of absolutely zero knowledge, that is to say P(H) = 0.5 and P(notH) = 0.5. This is the only 'prior' one needs, and it's trivial.

But that is not "no prior", it is "prior prob of hoax = 0.5".

We would have absolutely zero knowledge if hypothesis H was "the Hraxx is a foobar" and we have no idea of what those terms mean.  But even then, if the other party tells us that "note that, if it is not a foobar, it must be either a quxqux or a blooop", then should we set Prob(H) = 1/3?

Exactly as you said. But for simplicity it's preferable to work with only one hypothesis. In the VMS case I'd propose H = meaningful versus notH = meaningless.

(06-07-2025, 10:35 PM)Jorge_Stolfi Wrote: But we do have a lot of knowledge about the issue.  Everything we know about the VMS and its history, manuscripts from that time, the world as it was then, what people knew or could have known, how forgers and their marks would probably think and act...

Yes of course, but none of the knowledge one has needs to go into a 'prior'. Knowledge = evidence, and it can be uniformly treated as such without separating some of that evidence into a 'prior' and calling 'evidence' only what remains. That's why saying that 'the prior is the weak spot in the Bayes formula' is not correct. Assigning probabilities (or odds, which are easier to work with) is a big problem, yes, but this is true whichever epistemological method one uses, Bayesian or not: all inductive reasoning is probabilistic, by its very nature. So the difficulty in determining the odds (of each piece of evidence) cannot be used as a criticism of Bayesian logic: it's a problem inherent in the data at hand, not a specific feature of Bayesian reasoning.
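The odds-based procedure described here can be sketched as follows, assuming the pieces of evidence are conditionally independent so that their likelihood ratios simply multiply. The function names and the likelihood ratios are made up for illustration and carry no claim about the VMS.

```python
def update_odds(prior_odds, likelihood_ratios):
    """Multiply starting odds for H by each likelihood ratio
    P(E | H) / P(E | notH). Assumes conditionally independent evidence."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds

def odds_to_prob(odds):
    """Convert odds in favour of H into P(H)."""
    return odds / (1.0 + odds)

# Hypothetical likelihood ratios: >1 favours H, <1 favours notH,
# exactly 1 is equally probable under both and changes nothing.
lrs = [2.0, 0.5, 1.0, 3.0]

# Start from even odds, i.e. P(H) = P(notH) = 0.5.
posterior_odds = update_odds(1.0, lrs)
posterior_prob = odds_to_prob(posterior_odds)

# Treating the first two items as a "prior" and the rest as "evidence"
# gives the same result as treating all four as evidence.
split_odds = update_odds(update_odds(1.0, lrs[:2]), lrs[2:])

print(posterior_odds, posterior_prob, split_odds)
```

The split makes no difference to the arithmetic in this sketch. The result still depends on the starting odds of 1:1 and on every likelihood ratio that was chosen.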

Can Bayesian logic (which, as you may have understood, I'm quite fond of) help with the VMS? Unfortunately I'm not sure it can, even for the most basic case, meaningless vs. meaningful. We do have a lot of knowledge, but those pieces of evidence point sometimes toward 'meaningful', sometimes toward 'meaningless', while most of them are equally probable under the meaningful or the meaningless hypothesis, and thus are of no help. I can't see how even Bayesian logic could shed light on this problem in this situation, but at least a Bayesian analysis would require as a prerequisite a one-stop list of all the available evidence, which would already be a precious resource (*).

(*) not an easy task, but one could start with a list in an online page where other people can add entries, wiki-style.
(06-07-2025, 02:59 PM)oshfdk Wrote: Yes, I've specifically listed the features that make it less likely to be a hoax. It is not embellished, it is long, it has no obvious attribution to some celebrity.

It is embellished (with lots of illustrations). We may not think that they are beautiful....

The above are probably good arguments against a modern hoax, but they are not inconsistent with Sergio Toresella's suggestion: a book pretending to include great knowledge, in order for a quack to appear like a great doctor.

I can just see him trying to sell medicines, a cheaper variety in simplistic containers (as on f100r) and a very expensive variety in fancy containers (as on f88r).

I am not a fan of the word 'hoax' because, besides the distinction between meaningful and meaningless, it adds an element of intention, which is not relevant for text analysis.
(06-07-2025, 10:35 PM)Jorge_Stolfi Wrote:
(06-07-2025, 05:33 PM)Mauro Wrote: To apply Bayes one does not need a prior probability at all. Indeed, discriminating between prior and evidence is imho unnecessary and misleading. Start from a condition of absolutely zero knowledge, that is to say P(H) = 0.5 and P(notH) = 0.5. This is the only 'prior' one needs, and it's trivial.

But that is not "no prior", it is "prior prob of hoax = 0.5".

I agree with Stolfi here. The a priori 'information' used in Bayesian analysis may not be meaningful or 'good'.

As mentioned in another post, it has the capability of setting one off in the wrong direction. 
That last point is of course not always the case.
(06-07-2025, 02:59 PM)oshfdk Wrote: And we don't know if it's a hoax or not, so it's ? in 1.

Agreed!

Quote:1) There are reasons to make a hoax, and there are historical examples of hoaxes/forgeries
Indeed. And, just to mention in passing, there are examples of dealers "enhancing" a possibly genuine artifact with bogus signatures or stamps in order to increase its market value.  For instance:
  • "[...] inscribed by Auden to Clare Boothe Luce, one titled Pre-Columbian Historical Treasures, the other an American edition of Letters from Iceland. Both signatures are forgeries"
  • "[...] “A Modest and True Account of the Proceedings Against Mr. Abraham Anselm,” published in London in 1694, was reported missing from the Library, and four years later, now enhanced by some legal notes supposedly written and signed by Lincoln on one of its end papers, possibly to make it appear that it had once been part of the President’s library, it was picked up in a bookstall on 125th Street by a collector named Otto A. Hicks, who took it to Bergquist for verification of the handwriting. Bergquist was saddened to see that the end paper was clearly another bit of Coseyana.)"
Quote:While the MS is long, it's possible to find a scenario which would call for a long forgery (say, VMS should have represented the original of a foreign manuscript of roughly known size)

Yes, there have been such cases too.  The [...] is a recent one.  More relevant to the VMS are perhaps the "Enochian language" books given by "Angels" to John Dee through a crystal ball and "recorded" by his in-house scammer Edward Kelley.

The length of the VMS by itself is not a problem.  The problem is that the forger wrote 250+ pages of a complex invented language and bizarre illustrations, without adding a single element that would have made the book more attractive to the intended victim -- like some recognizable alchemical symbols, pictures of sick people being cured, intriguing weapons sketches, etc.  And he also chose to use all the vellum that he had, instead of discarding or trimming parts that were obviously bad (like f68r4, f72r4, f102r3, f112r...)

All the best, --jorge
(06-07-2025, 11:30 PM)Mauro Wrote: but none of the knowledge one has needs to go into a 'prior'. Knowledge = evidence, and it can be uniformly treated as such without separating some of that evidence into a 'prior' and calling 'evidence' only what remains. [...] the difficulty in determining the odds (of each piece of evidence) cannot be used as a criticism of Bayesian logic: it's a problem inherent in the data at hand, not a specific feature of Bayesian reasoning.

Again, the probability of a proposition is not a measurable property of the proposition itself.  It is a numeric expression of one's belief that the proposition is true.  Thus it is inherently subjective, and depends on everything one knows about the circumstances and on all arguments and computations one can make. 

Probability Theory (PT) does not tell us how to choose probabilities.  It only tells us how to combine probabilities that we have chosen for some propositions in order to obtain probabilities for related propositions in a consistent manner.

PT does not say that Prob(head) = 0.5 in a coin toss.  That is merely a choice that most people will make based on their knowledge (scientific or intuitive) of the mechanics of a coin toss, and/or their experience with doing them.  

PT does say that, if one's Prob(head) is 0.7, then one's Prob(tail) should be 0.3, provided one believes that those outcomes are distinct and there is no other possible outcome.  

And PT does say that Bayes's formula is the correct way to compute Prob(X_i|A_j), if one has already chosen the values of Prob(A_j|X_i) and the priors Prob(X_i).  Belief in the formula itself is not subjective; it can be proved directly from the definition of probability and the rules of elementary logic.
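As a minimal sketch of that computation for one observed piece of information A and a set of mutually exclusive, exhaustive hypotheses X_i: the function name, the hypothesis labels and all numbers below are placeholders chosen for illustration, not values anyone in the thread has proposed.

```python
def bayes_posterior(priors, likelihoods):
    """Compute Prob(X_i | A) from chosen priors Prob(X_i) and chosen
    likelihoods Prob(A | X_i).

    priors:      dict mapping hypothesis -> Prob(X_i), summing to 1
    likelihoods: dict mapping hypothesis -> Prob(A | X_i)
    """
    # Joint probabilities Prob(A and X_i) = Prob(A | X_i) * Prob(X_i)
    joint = {x: priors[x] * likelihoods[x] for x in priors}
    # Total probability of the evidence, Prob(A)
    total = sum(joint.values())
    if total == 0:
        raise ValueError("A has zero probability under every hypothesis")
    return {x: j / total for x, j in joint.items()}

# Placeholder hypotheses and numbers, for illustration only.
priors = {"foobar": 0.5, "quxqux": 0.25, "blooop": 0.25}
likelihoods = {"foobar": 0.2, "quxqux": 0.4, "blooop": 0.8}

posterior = bayes_posterior(priors, likelihoods)
print(posterior)
```

The formula only combines the inputs consistently. Two people who choose different priors or likelihoods will get different posteriors from the same function.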

A quantity X is called "random" if one does not know its value.  If one has chosen values for Prob(X=v) for all possible values v, then one can compute the entropy of X, which is a numeric measure (commonly expressed in bits) of one's ignorance about its value.

One may think that the entropy will only decrease as one gains more information about X; but, paradoxically, it can increase as well.  If X is the millionth digit of pi, my Prob(X=7) (or any other digit value) is currently 0.1, because I cannot compute it and I have not looked it up; and my entropy for X would be ~3.3 bits.  If someone told me that he looked it up and X is actually 5, my Prob(X=7) would drop to near zero, my Prob(X=5) would rise to near 1, and my entropy of X would drop to near zero.  But if the guy then told me that he got that information from ChatGPT, my Prob(X=5) would drop back to a bit more than 0.1 and my entropy would go back to almost 3.3.
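The digit-of-pi example can be reproduced with a short entropy calculation. The uniform case follows directly from the text; the other two distributions use assumed values (0.99 for "near 1" and 0.15 for "a bit more than 0.1") that are illustrative guesses, with the remaining probability spread evenly over the other nine digits.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def digit_distribution(p_five):
    """Distribution over digits 0..9 with Prob(X=5) = p_five and the
    rest shared equally among the other nine digits (an assumption)."""
    rest = (1.0 - p_five) / 9
    return [p_five if d == 5 else rest for d in range(10)]

h_unknown = entropy_bits([0.1] * 10)                # log2(10), about 3.32 bits
h_trusted = entropy_bits(digit_distribution(0.99))  # told "X is 5", source trusted
h_doubted = entropy_bits(digit_distribution(0.15))  # source turns out unreliable

print(f"before any information: {h_unknown:.2f} bits")
print(f"after trusted report:   {h_trusted:.2f} bits")
print(f"after doubting source:  {h_doubted:.2f} bits")
```

With these assumed numbers the entropy falls sharply after the report and then rises again once the source is doubted, ending slightly below the starting value.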

When choosing a probability, one could pretend to not know certain things; but there would be no point in doing that.  Probabilities are meant to help us take decisions, and therefore we want to use all the information and computations we have.  

When discussing probabilities, one could also consider what one's probability was before one gained certain information (like, what my Prob(H) was before I saw T&T's paper), or how one's probability would change if one received certain information (like what my Prob(H) would be if I were to learn that Barschius bought the VMS from Edward Kelley).   Or we could assume some of the other party's probabilities in order to show them what their probabilities for related events should be (like what your P(H|A) should be given your Prob(H) and Prob(A|H) etc.) 

Quote:Can Bayesian logic help with the VMS? Unfortunately I'm not sure it can, even for the most basic case, meaningless vs. meaningful.

Yep.  My Prob(H) is obviously much smaller than yours, and it would take a lot of arguing to perhaps change either of them...

All the best, --jorge
(07-07-2025, 06:03 AM)Jorge_Stolfi Wrote: The length of the VMS by itself is not a problem.  The problem is that the forger wrote 250+ pages of a complex invented language and bizarre illustrations, without adding a single element that would have made the book more attractive to the intended victim -- like some recognizable alchemical symbols, pictures of sick people being cured, intriguing weapons sketches, etc.  And he also chose to use all the vellum that he had, instead of discarding or trimming parts that were obviously bad (like f68r4, f72r4, f102r3, f112r...)

Still, it's not very hard to come up with explanations for these that don't seem very far fetched to me.

Suppose the manuscript was intended to represent a book from a faraway land with roughly known contents and roughly known imagery. Say it went something like this:

Lord Gully: I heard there was this famous Indian book of Great Wisdom called something like Voynibharata, I don't remember exactly. It's family lore, my great great great grandfather saw it during the last crusade, not a big book, but hundreds of pages, with a lot of pictures of exotic plants and stars and dancing nymphs, all drawn in weird Indian style, and written in some Indian script, and it had weird foldouts like a huge map of the mystical world. It was said the book had all the secrets of the world and whoever could read it would possess the power of Thor and become immortal, and there was only one copy left, and it's hundreds of years old and has the signs of age. If only I could find it, I'd pay a fortune.
A good friend: Hmm, I know a guy who knows a guy who is in the middle eastern book business. I could try pulling some strings. What else do you remember about this book? Any specific images?
Lord Gully: I was told the only recognizable thing in it was the Zodiac signs...
A good friend: And about how much would you pay exactly if I had it delivered to your castle gates in a year or so?.. I would certainly make sure you get the maximum discount possible, given the book is basically priceless.

Now, the forgers know the size of the book, have a general idea about the images (nothing recognizable except the Zodiac) and know that the book should be hundreds of years old. But there is a slight problem. For a modern person a piece of vellum is a piece of vellum. But I can imagine that in the Middle Ages, when people saw vellum all the time, you could get from its appearance some general idea about its age and preparation. So, to forge an old book, the forgers would need old vellum, preferably from the same batch, so that nothing sticks out in terms of color, texture, etc. And to create the foldouts, there should be larger pieces. So the forgers had to include some pretty bad pieces of vellum, because that was the only vellum they could use to make the book look realistically old.

This scenario of course would imply that the manuscript was likely created some time in the 1500s or even very early 1600s, but this is not a problem as I see it, and it would nicely explain why there is no provenance of this allegedly rare and expensive artifact recorded for 150+ years prior to it appearing in Prague. 

The above is just to show that in principle there are scenarios that would explain the size, the contents and the state of the vellum. I'm not arguing for them seriously, I just don't see how the probability of a hoax could be as low as 0.01% or even 0.1%. To me it looks like it certainly should be higher.
Anyone positing a post-1450 hoax is ignoring the evidence.
(07-07-2025, 08:54 AM)Koen G Wrote: Anyone positing a post-1450 hoax is ignoring the evidence.

Which evidence specifically? Suppose I was a skilled forger in the 1500s and I used authentic century-old vellum in an attempt to recreate a centuries-old MS. If I was in the book business, I would have seen hundreds of these old manuscripts, so I don't think any arguments related to styles, clothes or drawing techniques are relevant at all. The forger would have seen many more examples of these than we have now. The only thing that was carbon dated is the vellum itself, as far as I know, and the vellum would be from the early 1400s.

Maybe it makes sense to split the thread, if there is going to be some non-trivial discussion about p(hoax).