The Voynich Ninja

Full Version: A new Timm & Schinner publication regarding the Malta conference
(02-08-2023, 08:02 PM)bi3mw Wrote: I would be interested to know if Timm and Schinner may have submitted a paper to the conference and been rejected.

If my memory is correct then I think you are right in your supposition.
(03-08-2023, 09:37 AM)Mark Knowles Wrote: If my memory is correct then I think you are right in your supposition.
Wrong: [link]
(03-08-2023, 11:57 AM)nablator Wrote: Wrong: [link]

Well, clearly I haven't remembered this incident quite right.
It feels a little weird commenting on Sections 1 & 2, because while I agree wholeheartedly with the proposition that applying the type of topic analysis discussed to the Voynich Mss text is (probably/almost certainly) not useful -- on reading the journal version of [link] I wrote in my notes, "It would be interesting to apply this technique to the Voynich Mss if I didn't think the results would be meaningless" -- I do so for different reasons, reasons that to some extent contradict Timm & Schinner's:

1) Such papers rely -- as does Timm & Schinner's use of their token network analysis to argue against the existence of a meaningful text in the mss (*) -- on the questionable assumption that space-delimited tokens in the mss text correspond to words in whatever underlying text may exist, and

2) Such papers assume that the Currier dialects are not qualitatively different samples, and so can blithely be thrown into the same analytic blender and still produce meaningful results. This is actually one of the big points of divergence between Timm & Schinner's critique and my views: they argue that there is no qualitative difference between the Currier dialects, while I view the qualitative difference as real and large enough to invalidate just chucking all the pages into one's topic analysis method of choice.

(*) Having said that, it's important to acknowledge that theories in which there is an underlying meaningful text, transformed in such a way that spaces in the mss don't correspond to word breaks in the original text, need to show how they explain the token network analysis results. The point I'm making is that if the "words" in the mss. aren't words in the underlying text, then the fact that Timm & Schinner's token network analysis shows that "words" in the mss. don't behave like words in a natural language is irrelevant, because the "words" in the mss. aren't words in a natural language.

Notes on Section 2 "Linguistic structures":

"The existence versus nonexistence of structures in the VMS that are characteristic for linguistically meaningful text (and that cannot be explained as by-product of an algorithmic construction process) is perhaps the most important key question." -- I am in vehement agreement with this, with the caveat that I'd very much like to see work that focuses on metrics that characterize "meaningful" text which are agnostic with regard to the existence/position of spaces.

"Meanwhile it has been shown by Timm and Schinner (2020) that both of Zipf’s laws can emerge as necessary by-products of an intuitive pseudo-text algorithm." -- I am also in vehement agreement that there are ways of generating pseudo-vocabularies that show Zipf's Law-like behaviors, with the caveat that those ways don't necessarily involve algorithmically generated or otherwise non-meaningful text (e.g., "Indeed, the theoretical challenge raised by this model can be illustrated by taking a corpus of text  and dividing it on a character other than the space (' ') character, treating, for instance, 'e' as a word boundary.^^14 <#FN14> Doing this robustly recovers a near-Zipfian distribution over these artificial 'words,' as shown in Fig. 9." [You are not allowed to view links. Register or Login to view.).

So far, so good, but at this point my views start to diverge and my problems with the paper begin...

"Indeed, the existence of two statistically strictly separated sub-texts, Currier A and B, would provide some evidence for an underlying meaningful text, either as two dialects, topics, or different encryption/encoding schemes." -- I vigorously disagree with this statement, ironically because I can envision the A/B split being the result of something like Rugg's grill method with different tables under the grill rather than the result of a meaningful text -- or at the very least, I'm fairly confident that would be Rugg's take on the matter.

In Section 2.1, "The Currier languages," Timm & Schinner present an analysis in which they measure the similarity between pairs of pages in the manuscript by computing the normalized dot product of the "word" frequency vectors (where each word in the vocabulary has a corresponding frequency count in the vector) -- this corresponds to the cosine of the angle between the pages' vectors in word frequency space. Having shown the absence of a sharp break point in the values of this page similarity metric, the authors conclude "This behavior confirms the hypothesis of a continuous evolution from Currier A to B, ... The answer to the question of whether a particular folio would belong to Currier A is not a definitive yes or no, but rather a percentage number." -- Sorry, but no. This claim fails on three fronts:
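For concreteness, here is a minimal sketch of that similarity measure as I read it (Python; the page data is fabricated for illustration, not real folio text):

Code:
import numpy as np

# Toy "pages" as token lists (fabricated; real input would be transcribed folios).
pages = {
    "page1": ["daiin", "chol", "chor", "daiin", "shol"],
    "page2": ["chedy", "qokeedy", "shedy", "chedy", "qokaiin"],
}

# Shared vocabulary across all pages.
vocab = sorted({w for tokens in pages.values() for w in tokens})

def freq_vector(tokens):
    """Word-frequency vector of a page over the shared vocabulary."""
    return np.array([tokens.count(w) for w in vocab], dtype=float)

def cosine(u, v):
    """Normalized dot product = cosine of the angle in word-frequency space."""
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

v1, v2 = freq_vector(pages["page1"]), freq_vector(pages["page2"])
print(cosine(v1, v2))  # 1.0 = identical frequency profiles, 0.0 = disjoint vocabularies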

1) It is fatally methodologically flawed to do any sort of cluster analysis on 102 data points ("The VMS consists of 102 folios, from which 5151 pairs can be selected (excluding redundancy by symmetry, and the case of two identical folios)") in a 7000-dimensional space ("The VMS vocabulary consists of about 7000 words. Let ~v be a vector with each component representing the token frequency of one of these words."). This is due to what is known as "the curse of dimensionality" -- indeed, as one web page describing the issue ([link]) points out, "Too many dimensions cause every observation in your dataset to appear *equidistant* from all the others" (my emphasis; while that quote is talking about Euclidean distance, if T&S want to claim that the same problem doesn't exist for vector cosines I would genuinely like to see a reference, and will metaphorically eat my hat on this point if they can provide one). In fact, given the inappropriately high dimensionality of their feature space, the very fact that the Bio and Recipes sections stand out in Figure 2 the way they do -- that there is any kind of signal at all in the noise, that there are any kind of distinctive clusters at all in their data -- strikes me as compelling evidence against Timm & Schinner's conclusion. That doesn't mean there aren't methodologically sound experiments one could design to test whether the Currier dialects are or aren't separable in vocabulary space, it just means this isn't one of them... Frankly, it's somewhat surprising to me that this got past the reviewers for Cryptologia. I'm always willing to entertain the possibility that maybe I'm the crazy one here -- does anyone who's read the paper and knows their stuff when it comes to cluster analysis/pattern classification want to defend T&S's methodology here?
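The concentration effect is easy to see numerically. A quick illustration (Python; random non-negative vectors as a crude stand-in for word-frequency vectors -- this is not T&S's actual data):

Code:
import numpy as np

rng = np.random.default_rng(0)

def pairwise_cosine_spread(n_points, dim):
    """Mean and spread of pairwise cosines among random non-negative vectors."""
    X = rng.random((n_points, dim))           # frequency-like, non-negative
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    cosines = (X @ X.T)[np.triu_indices(n_points, k=1)]
    return cosines.mean(), cosines.std()

for dim in (10, 100, 1000, 7000):
    mean, std = pairwise_cosine_spread(102, dim)
    print(f"dim={dim:5d}  mean cosine={mean:.3f}  spread={std:.4f}")

# The spread collapses as the dimension grows: unrelated points all start to
# look equally (dis)similar, which is the "curse of dimensionality" at work.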

2) The correct conclusion from what is shown in Section 2.1 is not that whether a particular folio belongs to Currier A is fuzzy; the correct conclusion is that token frequency vector cosine similarity is a poor way of making that classification decision. The Currier A dialect and the Currier B dialects (plural -- the one used in some of the herbal folios, and the one used in what Currier referred to as the "Biological" folios) are absolutely separable on the basis of nothing more than letter pair frequency statistics -- in fact A pages can be separated from B pages with 100% accuracy using nothing more than the relative frequency on a page of EVA <ed>, and that frequency does not smoothly evolve between the A and B pages. The on-page frequency of <ed> (or, more precisely, "C8", since I work in the Currier transcription alphabet) ranges from 0.00% to 0.51% for Herbal A pages; it ranges from 1.34% to 9.05% for Herbal B pages and 2.58% to 8.90% for Bio B pages. That's a 0.8% gap between the Herbal A page with the most frequent use of <ed> and the Herbal/Bio B page with the least frequent use of <ed> (or, put differently, the page in Herbal B that uses it least still uses it more than twice as often as the Herbal A page that uses it most). While the ranges for Herbal B and Bio B overlap, that isn't because those pages aren't separable using digram frequencies, it's because there isn't one single digram that is diagnostic of that split the way there is for the A/B split.
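Here's what that single-feature classifier looks like in practice -- a minimal sketch (Python; per-character-pair normalization is my assumption, and the example tokens are fabricated, not real folio text):

Code:
# One-feature classifier: relative frequency of the EVA <ed> digram (Currier C8).
# Given the ranges quoted above (Herbal A <= 0.51%, Herbal/Bio B >= 1.34%),
# any threshold inside the gap separates A pages from B pages.
THRESHOLD = 0.009  # ~0.9%, roughly the midpoint of the gap

def ed_frequency(tokens):
    """Fraction of within-token character pairs equal to 'ed'."""
    pairs = sum(len(t) - 1 for t in tokens if len(t) > 1)
    hits = sum(t[i:i+2] == "ed" for t in tokens for i in range(len(t) - 1))
    return hits / max(pairs, 1)

def classify(tokens):
    return "B" if ed_frequency(tokens) > THRESHOLD else "A"

# Fabricated examples, not real folio text:
print(classify(["chol", "daiin", "chor"]))      # no <ed> at all -> 'A'
print(classify(["chedy", "shedy", "qokeedy"]))  # <ed>-rich -> 'B'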

3) In discussing more sophisticated methods of topic analysis in Section 2.2, they acknowledge the importance of removing common "function words" from texts for topic modeling to generate meaningful results: "All topic modeling approaches need a pre-processing step that removes function words from the input data base, because they (a) usually are the most frequent tokens, and (b) carry no contextual information. Too many such words would otherwise bury the rather sensitive clustering algorithm under an intolerable amount of noise, eventually rendering it useless." While their discussion of the difficulties of doing this with the Voynich text in Section 2.2 is spot-on, the absence of any attempt to do so in their Section 2.1 analysis poisons their conclusion here. The fact that T&S don't believe any of the "words" are, in fact, function words is irrelevant to this objection to their conclusion.
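For readers who haven't run into this pre-processing step: a toy sketch of what it looks like (Python; the stop list is an illustrative English one -- the whole point of T&S's Section 2.2 discussion is that no equivalent list can currently be constructed for Voynichese):

Code:
# Toy illustration of the function-word removal step T&S describe:
# drop high-frequency, low-content tokens before running a topic model.
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "is"}  # toy English list

def preprocess(doc):
    """Lowercase, tokenize on whitespace, drop function words."""
    return [t for t in doc.lower().split() if t not in STOP_WORDS]

print(preprocess("The roots of the plant are dried in the sun"))
# -> ['roots', 'plant', 'are', 'dried', 'sun']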

Having said all that, I think there very much is a valid critique of the body of work applying topic modeling techniques to the Voynich text that is based in the Currier dialects, it's just not the one T&S make. Consider not just the marked differences in digram frequencies between A and B pages, but also the differences in most-common words (in Currier, not EVA -- following my mother's advice, just because all my friends jumped off a particular bridge doesn't require me to follow suit...):

Herbal A: 8AM, SOE, SOR, 89, S9, 2, ZOE, Q9, 8AN, ZO
Herbal B: 8AM, SC89, OR, AR, AM, 8AR, 89, S89, 4OFC89, ZC89
Bio B: ZC89, SC89, OE, 4OFC89, 4OFCC89, 4OFAN, 4OFAE, 4OE, 4OFCC9, 4OFAM

*If* you're going to throw Herbal A, Herbal B, and Bio B pages into the same analytic blender on the assumption that the differences between them merely reflect differences in content topic, then you have the burden of proof (if you think this is a natural language) to show some example (preferably more than one) of a language and pair of topics that shows

* the same quantitative level of difference in basic letter and letter pair frequency stats that is seen between A & B pages

* the same lack of overlap in common vocabulary words (Bowern & Lindemann point this out as well: "While there is some overlap, the most common vocabulary items of Voynich A and Voynich B are substantially different. While the words in both languages are built from the same three-field structure, they do not clearly correspond to each other. They might be the result of different encoding processes, or they might represent different underlying natural languages." [link])

On top of that, you also need to explain the marked differences between Herbal A and Herbal B despite the apparent commonality of topic based on illustration type (single large plant drawing).
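One way to make that burden of proof concrete is to compute letter-pair frequency profiles per section and a distance between them, then ask whether any known language shows a comparable gap between two topics. A minimal sketch (Python; the section token lists are fabricated stand-ins):

Code:
from collections import Counter

def bigram_profile(tokens):
    """Relative within-token letter-pair frequencies for a section."""
    counts = Counter(t[i:i+2] for t in tokens for i in range(len(t) - 1))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

def l1_distance(p, q):
    """L1 distance between two bigram profiles (0 = identical, 2 = disjoint)."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

# Fabricated stand-ins for an A section and a B section:
herbal_a = ["chol", "chor", "daiin", "sho"]
bio_b = ["chedy", "shedy", "qokeedy", "qokedy"]
print(l1_distance(bigram_profile(herbal_a), bigram_profile(bio_b)))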

Section 2.2, "Topic modeling" -- as should be clear from what I've said above, I mostly have an "a pox on both their houses" reaction to this.

Section 3 briefly critiques a number of papers from the Malta conference:

3.1, "Crux of the MATTR": Timm & Schinner say, "His conclusion 'The profile of Voynich A suggests that t is more morphologically complex than Voynich B, which may indicate that it encodes a separate language or dialect' (which later on is used as implicit argument against the gibberish interpretation) is based on two assumptions: 1) separable 'linguistic domains' Currier A and B, and 2) the position of a text sample in the MATTR/MCW plane characterizes its morphology. Assumption (1) definitely is wrong, see our analysis in Section 2.1, as well as in Timm and Schinner (2020)." See the critique of Section 2.1 above.

3.2 "Voynich paleography": Lisa Fagin Davis is more than capable of defending her own work, doesn't need me to do it for her, and will hopefully do so here once she has read T&S's paper.

Of the other papers T&S discuss in section 3, Zattera's (Section 3.6, "Evidence of word structure") is the only one I've read in sufficient detail to evaluate their critique. I'm inclined to agree that the appearance of a "word" grammar or grammars, while seductive, is a mirage (although, again, for different reasons than T&S). Having said that, I'm puzzled by the objection that "Furthermore, it is difficult for the word grammar approach to really explain the characteristic relationship between similarity, spatial vicinity, and token frequency" given that explaining those things is generally not what people looking for a grammar for Voynichese word morphology are trying to do. 

In focusing as hard as they do on arguing that their model is better, T&S miss a significant problem with Zattera's paper. While I'm sympathetic to Zattera's concern about over-generalization by proposed Voynichese word grammars, his approach to addressing the issue treats any word generated by a proposed grammar but not found in the actual text as a false positive. Looking at how the number of new word types grows as a function of the first N lines of Bio B (say) makes very clear that the finite sample of the Voynich "language(s)" we have makes that an incredibly sketchy assumption. There is a body of literature in the area of induction of regular grammars that addresses how to deal with the over-generalization problem given only positive training examples (i.e., we don't really have a list of words that aren't in the vocabulary of the text), and the limitations that imposes on what you can do.
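The vocabulary-growth check I have in mind is trivial to run. A sketch (Python; the sample lines are toy stand-ins for a real transcription):

Code:
def type_growth(lines):
    """Cumulative count of distinct word types after each line of text."""
    seen, growth = set(), []
    for line in lines:
        seen.update(line.split())
        growth.append(len(seen))
    return growth

# Toy input; real input would be the first N lines of, say, Bio B.
sample = ["daiin chol chor", "chedy daiin shol", "qokeedy chedy otal"]
print(type_growth(sample))  # [3, 5, 7] -- still climbing, no plateau in sight

If the curve is still climbing when the text runs out (it is), then "generated by the grammar but absent from the text" is weak evidence that a word is a false positive.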


Section 3.8, "Gibberish after all?" -- T&S state "...later on in their paper Gaskell and Bowern state: 'A more significant limitation of this work is that, because of the short length of our text samples, we are unable to test whether gibberish can replicate the larger structural features (such as “topic words”) which have been observed in the VMS (Montemurro and Zanette 2013; Reddy and Knight 2011; Sterneck, Polish, and Bowern 2021). At present, these features pose a serious challenge to proponents of the hoax hypothesis.' While they do not explicitly explain (or give examples) for the term 'topic words' in this context, we presume that it refers to 'topic modeling', or previous attempts to associate some Voynichese words with particular illustrations and/or Currier languages."

The concept of "topic words" appears to be a standard one in document analysis work (see, for instance, Shin & Zhang, "Extracting Topic Words and Clustering Documents by Probabilistic Graphical Models" You are not allowed to view links. Register or Login to view.). T&S are half-right, in that it is a term of art used in the topic modeling literature (see, for instance, Alokaili et al, You are not allowed to view links. Register or Login to view.). A conference paper is not a tutorial, and Gaskell & Bowern (or any other authors) are not responsible for T&S (or any other readers) being too lazy to spend 5 minutes with a search engine.
(04-08-2023, 12:42 PM)kckluge Wrote: 3.2 "Voynich paleography": Lisa Fagin Davis is more than capable of defending her own work, doesn't need me to do it for her, and will hopefully do so here once she has read T&S's paper.
Certainly, Lisa Fagin Davis can defend her own work. If you want to get an overview of what Timm's critique might look like without access to the actual paper, you should read here:

[link]
(04-08-2023, 12:42 PM)kckluge Wrote: 1) It is fatally methodologically flawed to do any sort of cluster analysis on 102 data points [...] in a 7000-dimensional space [...] if T&S want to claim that the same problem doesn't exist for vector cosines I would genuinely like to see a reference, and will metaphorically eat my hat on this point if they can provide one.

Thank you! I couldn't agree more.

(04-08-2023, 12:42 PM)kckluge Wrote: In fact, given the inappropriately high dimensionality of their feature space, the very fact that the Bio and Recipes sections stand out in Figure 2 the way they do [...] strikes me as compelling evidence against Timm & Schinner's conclusion.

This is partly a side effect of the second major issue: the data points are the folios, and these two groups of folios stand out by having far more words than the others. This is where statistics can just barely begin to work.

Cosines are very similar to correlation. Both are normalised scalar products of two vectors; correlation is just the cosine computed after mean-centring the vectors.
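A two-line check of that relationship (Python; Pearson correlation equals the cosine of the mean-centred vectors):

Code:
import numpy as np

x = np.array([1.0, 3.0, 5.0, 7.0])
y = np.array([2.0, 2.5, 6.0, 8.0])

def cosine(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Both lines print the same value.
print(np.corrcoef(x, y)[0, 1])
print(cosine(x - x.mean(), y - y.mean()))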
Happy to respond to Torsten's critique.

1) Paleography is, as I have said on multiple occasions, primarily subjective and is based on a human eye interpreting work produced by other humans. Conclusions result from detailed observation combined with experience and expertise. I have studied thousands of early manuscripts in multiple languages in dozens of collections over the last thirty years; published and taught extensively in the field; and am an elected member of the International Committee for Latin Paleography ([link]), one of only four members from the United States. So I hope that my experience and expertise speak for themselves.

2) Along with anyone else, Torsten is more than welcome to disagree with me in accordance with his own observations, as no one should accept my conclusions without critical analysis. He doesn’t see what I see, which is fine. I leave it to our readers to decide for themselves whose observations they trust.

3) However, elsewhere in his critique of my work, Torsten focuses on the wrong part of my analysis. He has noted on several occasions that he thinks it is a major oversight and methodological flaw that I didn't record which leaves were uploaded to Archetype/DigiPal for initial analysis. In fact, that piece of information (which I can't access anymore anyway) is irrelevant and shows that he has fundamentally misunderstood the point of the Archetype model. Archetype is NOT an automated tool for analyzing visual samples ([link]). It is a visual tool designed to help humans annotate and classify letterforms written by other humans. It does not matter which pages I started with – the choice was fairly random and was just a way to help me get started. Eventually, once I knew which letterforms were going to be diagnostic for identifying scribes in the manuscript (in particular EVA [k], [n], and, to a lesser extent, [f]), I examined every page manually before assigning scribes to each. If Archetype had in fact been an automated system that triggered an AI analysis, the choice of leaves would have been extremely important and of course I would have explained in detail which leaves I chose and why. This critique is, frankly, irrelevant as it has no impact on the results of my work.

4) I understand why it is important to Torsten that my identification of five scribes (or more than one, even) be wrong, as his own conclusions become less likely if there are multiple hands writing Voynichese in internally-consistent ways. His arguments that the manuscript is gibberish and that the A/B distinction is non-existent are directly contradicted by my own work. If there are five scribes, four of whom use Voynichese in very similar ways (2, 3, 4, and 5 writing “Language” B, albeit with scribe 4 showing some important variants), it becomes more difficult to make a convincing case that the manuscript is nonsense and that the A/B distinction is an illusion. I am certain that he will continue to dispute my findings in print and on this site, but I do not plan to rebut his arguments further.

5) Finally, I have issues with the argument that he and others have made about the gradual transition from A to B. While there are clearly issues with making clear distinctions between A and B (and as I am not a linguist or a cryptologist I cannot critique his methodology), that analysis is based on the demonstrably false assumption that the leaves and sections are currently in their original order. They almost certainly aren’t. If we knew the original sequence of leaves (something I am putting a lot of work and thought into at the moment from the perspective of codicological evidence), such an analysis would be welcome.
(04-08-2023, 12:42 PM)kckluge Wrote: 1) It is fatally methodologically flawed to do any sort of cluster analysis on 102 data points ...

Concerning the "failure on three fronts" of our "cosine distance" analysis:

1. The "curse of dimensionality" does exist, but in many cases it is a blessing rather than a curse. Several successful methods of statistical physics would not work in less than billion-dimensional space. Two vectors always define a plane, regardless of dimensionality, which, coarsely speaking, "focuses" the statistics; a situation different from simple data distribution. Unfortunately, we were not able to find any website dealing with advanced mathematical statistics of this kind in a more than superficial way, so your hat may remain uneaten. But perhaps you might consider this argument: If you were right, then how could topic modeling programs work at all? Basically, they use the same principle. And they indeed are marvelously successful tools – just not for analyzing the VMS (because there are no topics to analyze - see chapter 2 "Context-dependent self-similarity" in Timm & Schinner 2020, p. 3ff). 

Thus:
a) Topic modeling works (despite the "curse of dimensionality").
b) In the VMS it cannot even correctly identify the two Currier clusters.
c) The non-existence of separated sections (and topics) requires an explanation.


2. We also used <ed> statistics in our 2020 paper (see Timm & Schinner 2020, p. 6). However, when considering not only the Herbal A+B and Bio sections but all sections, a different picture (without dramatic jumps!) emerges for the frequencies of <ed> tokens:
 
Code:
SECTION    <ed>-COUNT PERCENTAGE  TOKEN-COUNT
Herbal (A)       12      0.15%       8,087
Pharma (A)       17      0.67%       2,529
Astro            28      1.31%       2,136
Cosmo           257      9.55%       2,691
Herbal (B)      528     16.33%       3,233
Recipes (B)    2073     19.42%      10,673
Bio (B)        1925     27.85%       6,911

See also the discussion by Zandbergen: "… the overall statistics demonstrate that there is a continuum, and the other (not herbal) pages actually 'bridge the gap'." (Zandbergen [link])


3. From our perspective, there are indeed no function words present in the VMS, "since a token dominating one page might be rare or missing on the next one. However, all pages containing at least some lines of text do have in common that pairs of frequently used words with high mutual similarity appear" (Timm & Schinner, 2020, p. 3). As a result, we conducted our cosine distance analysis based on our preliminary findings, and the results align with our initial assumption. What is unscientific about this? While the method is open to discussion, would you have raised concerns if our Figure 1 had indicated a change in slope?


Finally, regarding the "topic word" remark, we choose to overlook the overt sarcasm. Believe it or not, we are aware of this concept in topic modeling; however, we were asking about this term in the context of the VMS. Reddy and Knight just evaluate whether the words are randomly distributed: "That is, do certain words 'burst' with a high frequency within a page, or are words randomly distributed across the manuscript?" (Reddy & Knight, p. 6). And Montemurro & Zanette write: "While uninformative words tend to have an approximately homogeneous (Poissonian) distribution, the most relevant words are scattered more irregularly, and their occurrences are typically clustered" (Montemurro & Zanette, p. 2). There are no uniformly distributed tokens in the VMS. So, according to this, all words in the VMS must be "topic words." Thus, we insist that we are not just "half-right", but at least "three-quarters right".
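For anyone who wants to try the distinction being argued about here: a minimal burstiness check compares a word's per-page counts against the Poisson expectation. A sketch (Python; the per-page counts are fabricated, not VMS data):

Code:
import numpy as np

rng = np.random.default_rng(1)

def dispersion(counts):
    """Variance-to-mean ratio: ~1 for a Poisson (homogeneous) word,
    well above 1 for a 'bursty' topic word."""
    counts = np.asarray(counts, dtype=float)
    return counts.var() / counts.mean()

# Fabricated per-page counts for two kinds of words:
function_like = rng.poisson(3, size=50)                 # spread homogeneously
topic_like = np.concatenate([rng.poisson(12, size=5),   # clustered on 5 pages,
                             np.zeros(45, dtype=int)])  # absent everywhere else

print(dispersion(function_like))  # close to 1
print(dispersion(topic_like))     # much greater than 1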

Note: preprints of our papers are available at [link]
I wonder how one can judge something when one does not know the language. Even if you know the language, you may not understand it. Furthermore, you don't understand the symbols, and you are relying on a questionable transcription.
Words can also serve multiple functions.
Example:
"mer händ" does not mean "more hands" ("mehr Hände") but "we have"
"mer hät" means "one has"
"häsch mer en Öpfel" (have you got an apple for me)
"s'mues mer milch ine" (more milk has to go in). Variants: mer/me/meh/mehr.
With a grammatical error, "mer ist schön blau" even works for "Meer" (the sea).

I can't explain this properly to a non-German speaker.
But we write "kein" with "ai", as "kain", because no "e" can be heard.
The same goes for "eure": there is no "e", but since "oi" stands for "eu" it can be written as "oiri". From this follow "mini, dini, sini, iri, oiri" (my, your, his, her, your (pl.), etc.).
On top of that come the dialect variants, such as Rücken/Rugge(n) or Rukch(e)n.
Therefore: ear before eye. You have to be able to hear the text. Forget Duden and Langenscheidt.
In the examples I have removed the reading aid. Can you understand it even without it?
Maybe the text does mean something after all. Think about it.

(06-08-2023, 04:34 PM)LisaFaginDavis Wrote: Happy to respond to Torsten's critique.

Happy to respond to Lisa's statement:
[link]