The Voynich Ninja

Full Version: Voynich Manuscript Day 2024 Schedule
(05-08-2024, 08:13 AM)Koen G Wrote: I think it would be extremely valuable to the community if someone was able to write an "explain like I'm five" version of these talks, trying to focus on the bigger picture and how Emma's, tavie's and Patrick's findings relate to each other. This might be an assignment even the authors themselves struggle with, but it would be an invaluable exercise.

I totally agree that those three talks all point to a way forward in understanding more of the language. As Rene said, one gets the feeling that there is a system beyond the chaos: a big next step could be putting those things together in some kind of formalism, making the system more explicit. Those talks rekindled my hope that a grammar of Voynichese can be defined... I had great expectations for Voynich Day, but it was even better than I thought!
Congratulations!
I'm impatiently waiting for the videos to be published and I hope for some texts.
P.S. Have you heard from Merry?
(05-08-2024, 08:13 AM)Koen G Wrote: I felt myself losing the forest through the trees, wondering how things fit into the bigger picture. I believe that people who delve deep into Voynichese textual statistics have developed a better feel for this, which people like me are lacking.

I dare say that, over the years, I have been able to develop a reasonable feel for the statistics of the text of the MS, but there was so much new information, and quite specific and detailed information at that, that I too could not absorb it all in time. It is quite overwhelming, and having the videos might make it possible to gradually get a better picture. I think it would be even better if the slides could be shared...
(05-08-2024, 08:13 AM)Koen G Wrote: I think it would be extremely valuable to the community if someone was able to write an "explain like I'm five" version of these talks, trying to focus on the bigger picture and how Emma's, tavie's and Patrick's findings relate to each other. This might be an assignment even the authors themselves struggle with, but it would be an invaluable exercise.

In my opinion, it should be noted that the talks use different reference systems to measure their results. Patrick Feaster (see [link]) uses a repetitive default loop "qokeedyqokeedyqokeedyqokeedy" as his reference system and finds variations of the default loop, like "qokedy" instead of "qokeedy", interesting. Emma May Smith (see [link]), on the other hand, uses a pure random distribution (see [link]) as her reference system and therefore finds repeated elements across words, like the "y.q" in "qokeedy.qokeedy", interesting. In short, they measure the same thing from different viewpoints. Patrick points out that it is possible to find some randomness within the repetitive Voynich text, whereas Emma highlights that the text is less random than rolling dice. (A simpler example would be the sequence "qy qy qy qy qe qy qy qy qy qy". One person might point out that a pattern of repeated "qy" exists, while another might highlight that the single instance of "qe" breaks the pattern and is therefore the most interesting part of the sequence, or possibly a mistake.)

I prefer to compare the Voynich text with natural languages. From this perspective, I argue that the Voynich text is very different from natural languages, as it is both more repetitive and contains more variation (see [link]).

Side note to illustrate how important the reference system is: The 2013 paper by Montemurro et al. ([link]) is based on the idea that "uninformative words tend to have an approximately homogeneous (Poissonian) distribution" and "the most relevant words are scattered more irregularly, and their occurrences are typically clustered". But they didn't check whether words with a homogeneous (Poissonian) distribution exist. Then, in 2021, a paper by Claire Bowern assumes that "Montemurro et al. (2013) use techniques from information theory to identify which words are most likely to contribute to topics in texts. That is, they identify words that are more uniformly distributed throughout the Voynich Manuscript and compare them with those that tend to cluster." (see [link]). Bowern also didn't check for uniformly distributed words. However, since uniformly distributed words do not exist within the Voynich text [see Timm & Schinner 2019, p. 6], Montemurro et al. and Claire Bowern wrongly assume that it is possible to distinguish between uniformly distributed words and clustered or topic words. Based on this assumption, they wrongly conclude that a relation exists between the topics indicated by the illustrations and the distribution of words. See also the review blog post by the linguist Stephen Chrisomalis [link].
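
One rough way to check whether a word is spread homogeneously is the index of dispersion (variance divided by mean) of its counts over equal-sized chunks of text. For a Poisson-like word the index should be close to 1, while a clustered word gives a value well above 1. The sketch below is only an illustration under stated assumptions: the function name, the fixed chunk size and the toy word lists are hypothetical, and this is not the method used by Montemurro et al. or by Timm & Schinner.

```python
from statistics import mean, pvariance


def dispersion_index(words, target, chunk_size):
    """Variance-to-mean ratio of the count of `target` per chunk.

    `words` is a flat list of tokens in reading order. The text is cut into
    consecutive chunks of `chunk_size` tokens (an incomplete last chunk is
    dropped). Returns None if there are no chunks or the word never occurs.
    """
    chunks = [
        words[i:i + chunk_size]
        for i in range(0, len(words) - chunk_size + 1, chunk_size)
    ]
    counts = [chunk.count(target) for chunk in chunks]
    if not counts:
        return None
    m = mean(counts)
    if m == 0:
        return None
    return pvariance(counts) / m


# Toy data (hypothetical): the same word, evenly spread vs. clustered.
even = ["daiin", "chol"] * 4
clustered = ["daiin"] * 4 + ["chol"] * 4
print(dispersion_index(even, "daiin", 2))       # perfectly regular: no variance
print(dispersion_index(clustered, "daiin", 4))  # all occurrences in one chunk
```

With a real transliteration one would run this for every frequent word and look at how many of them come out near 1. Pages of unequal length would need a different chunking scheme.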

There are different levels of variation within the Voynich text. It is well known that Currier A behaves differently than Currier B. Currier A prefers words like daiin, chol, and chor, while on pages in Currier B words like chedy, qokeedy, daiin, and qokaiin are frequent. However, relevant differences also exist between different quires in Currier A or between quires in Currier B. For instance, words like cheol/sheol are more common in the pharmaceutical section than in Herbal A. For Currier B, words containing the sequence "ed" are far more common in the biological section than in Herbal B. Even at the level of individual pages, no obvious rule can be deduced as to which words form the top-frequency tokens at a specific location, since a token dominating one page might be rare or missing on the next one (see Timm & Schinner, 2020). Currier also points out that even "the frequency counts of the beginnings and endings of lines are markedly different from the counts of the same characters internally" (see [link]). It would therefore be interesting to check in future research how some of the presented statistics behave for different sections.
I think the poster above fails to understand that the comparison used depends on what the researcher is seeking to discover.
  • Answering a question like "What kind of text is this?" requires comparison to other texts.
  • Answering a question like "How does this text work?" requires comparison of the text with itself.

These are both worthwhile questions and it's foolish to criticize research which explicitly looks at one for not using the methods of the other. For example, when a researcher starts her presentation with the words, "Much research which analyses the text of the Voynich manuscript focusses on its general properties, but I want to look at smaller patterns which exist between words and glyphs.", it seems odd to reply, "Well, I prefer to compare the Voynich text with natural languages."

But, just to humour the poster above, let me state that I did compare my results with those of a natural language. In Latin, the "feature patterns" which the test picked up on were things such as prepositions and the case of the following word. I didn't include that in the presentation because it would be highly misleading as to the nature of those relationships in the Voynich manuscript. But it's clear that important characteristics of a linguistic text can be identified with the test.

Really, the key thing which much of the presented research lacks (and I'm 100% guilty of this) is good analysis. We don't always understand what the patterns and exceptions mean, how they fit together, and what the broader rules and processes in play must be. Some suggestions, such as tavie's vertical avoidance rule, are incredibly interesting, but their reasoning and import are still hard to understand. But I'm glad that we keep finding new things, or even just new perspectives on things, as we'll never know what small piece of data is important. We're doing well to avoid wild flights of fantasy, building hypotheses on slender grounds, and clinging onto them for a decade without self-awareness that we might be wrong.
(04-08-2024, 03:41 PM)Emma May Smith Wrote: The slides for my talks are attached.

(Sorry, I couldn't answer any questions.)
Page 9 of [link]:
Our example pattern: y.q
● A word ending [y] followed immediately by a word starting with [q].
● Definitions:
  ● Feature 1 is [y.]
  ● Feature 2 is [.q]
Pages 11 and 12 cover the counts and likelihood.

I would like to see the same analysis for Feature 2 [.q] with other glyphs as Feature 1,
in particular the most common vord endings: l m r s ...
Then one could compare the graphs of the count and likelihood distributions for every pair of Features.
I would like to check whether [.q] behaves similarly with different vord endings.
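
A minimal sketch of how such a comparison could be set up: for each pair of ending glyph and starting glyph, count how often the pair occurs across a word boundary and compare that with the count expected if endings and beginnings were independent. Everything here is an assumption for illustration. The function name and the toy word list are hypothetical, line and paragraph breaks are ignored, and the observed/expected ratio is only a simple stand-in that is not necessarily the likelihood measure used on the slides.

```python
from collections import Counter


def boundary_pair_stats(words, endings, starts):
    """Observed vs. expected counts for (ending, start) pairs at word boundaries.

    `words` is a flat list of words in reading order. Returns a dict mapping
    (ending, start) to (observed, expected, ratio); ratio is None when the
    expected count is zero.
    """
    pairs = list(zip(words, words[1:]))
    n = len(pairs)
    end_counts, start_counts, pair_counts = Counter(), Counter(), Counter()
    for left, right in pairs:
        # First matching feature wins; None means no listed feature applies.
        e = next((x for x in endings if left.endswith(x)), None)
        s = next((x for x in starts if right.startswith(x)), None)
        if e is not None:
            end_counts[e] += 1
        if s is not None:
            start_counts[s] += 1
        if e is not None and s is not None:
            pair_counts[(e, s)] += 1
    stats = {}
    for e in endings:
        for s in starts:
            expected = end_counts[e] * start_counts[s] / n if n else 0.0
            observed = pair_counts[(e, s)]
            ratio = observed / expected if expected else None
            stats[(e, s)] = (observed, expected, ratio)
    return stats


# Toy word sequence (hypothetical, not a real line of the MS).
sample = "qokeedy qokeedy chol daiin qokal chedy qokaiin dar".split()
for pair, values in boundary_pair_stats(sample, ["y", "l", "r", "n"], ["q"]).items():
    print(pair, values)
```

A ratio above 1 means the pair occurs more often than independence would predict, below 1 less often. Running it with l, m, r, s as endings would show whether [.q] behaves the same way after each of them.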
(05-08-2024, 08:27 AM)MarcoP Wrote: As Rene said, one gets the feeling that there is a system beyond the chaos


“Though this be madness, yet there is method in't.”  -- Shakespeare, Hamlet
Dear Emma,

The goal of my post was to point out that both analyses are correct, since you and Patrick viewed the text from different angles. There was no harm intended. By the way, your analysis of the dependency between glyphs aligns with the [link] by Vladimir Sazonov from 2003.
(05-08-2024, 12:20 PM)Torsten Wrote: I argue that the Voynich text is very different from natural languages, as it is both more repetitive and contains more variation.
There is a simple explanation for this: there is less variety in word families, but a higher frequency of basic word family words, such as EVA daiin, dy, dal, dar. At the other extreme, there is a higher frequency of words that occur only once or twice, which can be attributed to inflectional and word forming roots, different prefixes, and misspelled words. The high repetition can be explained by improper transcription (the Latin u-like glyph is missing in EVA, so that keedy could be transliterated as keedy or kudy, cheedy or chudy).
(06-08-2024, 12:48 AM)cvetkakocj@rogers.com Wrote: which can be attributed to inflectional and word forming roots, different prefixes, and misspelled words

"Inflectional and word forming roots, different prefixes" also occur in other languages, whose texts have very different properties than Voynichese.

This would leave misspelled words as the main reason, and that is extremely unlikely.

(It could also be tested by modelling random or targeted errors in a known plain text. I've been thinking about that, but there are always more interesting things to do).
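
A minimal sketch of what such a test could look like, under stated assumptions: errors are modelled as random single-character substitutions only (no insertions, deletions or targeted confusions), and the share of word types occurring exactly once is used as the measure of variation. The function names, the error rate and the toy text are hypothetical choices for illustration.

```python
import random
from collections import Counter


def inject_errors(words, rate, alphabet, seed=0):
    """Return a copy of `words` where each character is replaced by a random
    character from `alphabet` with probability `rate`."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    noisy = []
    for word in words:
        chars = list(word)
        for i in range(len(chars)):
            if rng.random() < rate:
                chars[i] = rng.choice(alphabet)
        noisy.append("".join(chars))
    return noisy


def hapax_share(words):
    """Fraction of distinct word types that occur exactly once."""
    counts = Counter(words)
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


# Toy plain text (hypothetical); a real test would use a long known text.
plain = ("in principio erat verbum et verbum erat apud deum "
         "et deus erat verbum").split()
noisy = inject_errors(plain, rate=0.1, alphabet="abcdefghilmnopqrstuv")
print(hapax_share(plain), hapax_share(noisy))
```

Comparing the two shares across a range of error rates would show how many misspellings are needed before a known text starts to look as varied as Voynichese.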