07-07-2025, 09:34 AM
PS. I think I can express a bit more clearly now why I am not impressed by T&T's claims.
In their argument T&T implicitly or explicitly assume that Prob(A|not H) is practically zero; that is, they assume that a manuscript that is not a hoax cannot have the "context-dependent repetitions" that they observed -- because they did not observe them in a few other non-hoax books that they analyzed. Conversely they claim that Prob(A|H) is much higher, because the hypothetical forger may well have generated the VMS using a method, like the SCM, that accidentally created such repetitions.
And indeed, if Prob(A|H) is much greater than Prob(A|not H), then Bayes's formula gives Prob(H|A) ≈ 1 for practically any nonzero prior Prob(H).
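For reference, Bayes's formula for this case can be written out as below, with \neg H standing for "not a hoax". The second line is the same formula in odds form, which shows that the observation A affects the result only through the ratio Prob(A|H)/Prob(A|not H).

```latex
\[
  \mathrm{Prob}(H \mid A)
  = \frac{\mathrm{Prob}(A \mid H)\,\mathrm{Prob}(H)}
         {\mathrm{Prob}(A \mid H)\,\mathrm{Prob}(H)
          + \mathrm{Prob}(A \mid \neg H)\,\bigl(1 - \mathrm{Prob}(H)\bigr)}
\]
\[
  \frac{\mathrm{Prob}(H \mid A)}{\mathrm{Prob}(\neg H \mid A)}
  = \frac{\mathrm{Prob}(A \mid H)}{\mathrm{Prob}(A \mid \neg H)}
    \cdot
    \frac{\mathrm{Prob}(H)}{\mathrm{Prob}(\neg H)}
\]
```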
However, my Prob(A|not H) is actually quite high. If the nature of the text is what the illustrations suggest (herbal, pharmacopoeia, list of diseases, etc.), then I do expect that it will have a lot more "context-dependent repetitions" than a novel or chronicle.
And conversely my Prob(A|H) is rather low, because I cannot see how or why the forger would have used a generation method that produced a text with the observed "natural" properties of the VMS (Zipf's law, vocabulary size, word entropy, etc.) but with a word structure quite unlike that of a European language -- plus those "context-dependent repetitions".
I don't see the SCM as a plausible answer to that question. The "self-citation" part is relatively easy to execute, but does not seem to be a natural choice for the hypothetical forger, and would require a non-trivial "warm-up" period to create a stable seed text that could then be used to start the VMS. But the "mutation" part of the SCM would require generating several coin tosses, with non-uniform probabilities, at each word. And these probabilities would have to be finely tuned in order to generate the proper Zipf plot and other "natural" properties.
I would expect that a forger who set out to create an "alien" book of lore would use a simpler method, without caring for statistics or consistency -- like the "method" (or lack thereof) that Edward Kelley used to create his books. If that crude product could fool a mathematician like Dee, it would surely fool whoever was the intended VMS victim.
But then, if Prob(A|H) ≈ Prob(A|not H), then Bayes's formula says that Prob(H|A) ≈ Prob(H). That is, one's prior probability of the VMS being a hoax is not significantly changed by learning that it has "context-dependent repetitions".
And, in fact, if Prob(A|H) is less than Prob(A|not H), learning of observation A actually lowers one's probability that the VMS is a hoax.
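A minimal numeric sketch of the three cases discussed above. The function name `posterior` and every probability value below are made-up illustrations, not estimates taken from T&T or measured on the VMS; only the direction of the change matters.

```python
def posterior(prior, p_a_given_h, p_a_given_not_h):
    """Return Prob(H|A) from Bayes's formula.

    prior            -- Prob(H), one's prior probability of a hoax
    p_a_given_h      -- Prob(A|H)
    p_a_given_not_h  -- Prob(A|not H)
    """
    numerator = p_a_given_h * prior
    denominator = numerator + p_a_given_not_h * (1.0 - prior)
    return numerator / denominator


# Hypothetical likelihood pairs (Prob(A|H), Prob(A|not H)), for illustration only.
cases = {
    "T&T-style: Prob(A|not H) near zero": (0.5, 0.001),
    "equal likelihoods": (0.3, 0.3),
    "Prob(A|H) below Prob(A|not H)": (0.1, 0.6),
}

for prior in (0.1, 0.5):
    for label, (p_h, p_not_h) in cases.items():
        post = posterior(prior, p_h, p_not_h)
        print(f"prior={prior:.2f}  {label}: posterior={post:.3f}")
```

In the first case the posterior ends up close to 1 for both priors, in the second it equals the prior, and in the third it falls below the prior.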
All the best, --jorge