pfeaster > 08-08-2024, 12:55 PM
(05-08-2024, 08:13 AM)Koen G Wrote: All research was well presented; for example, Emma explained statistical concepts well.... But as the data was being presented in the text-focused talks, I felt myself losing the forest for the trees, wondering how things fit into the bigger picture.... I think it would be extremely valuable to the community if someone were able to write an "explain like I'm five" version of these talks, trying to focus on the bigger picture and how Emma's, tavie's and Patrick's findings relate to each other. This might be an assignment even the authors themselves struggle with, but it would be an invaluable exercise.

I'm sure I won't be able to do justice to this assignment on my own, but it's interesting enough that I didn't want to leave it unaddressed.
Torsten > 08-08-2024, 09:53 PM
pfeaster > 09-08-2024, 12:54 PM
(08-08-2024, 09:53 PM)Torsten Wrote: In my eyes, the properties you describe as loops are caused by the network of similar vords [see Timm & Schinner 2020, p. 4]. The general principle within this network of similar words is that "high-frequency tokens also tend to have high numbers of similar words. ... words (i.e. unconnected nodes in the graph) usually appear just once in the entire VMS, while the most frequent token <daiin> (836 occurrences) has 36 counterparts with edit distance 1" [Timm & Schinner 2020, p. 6]. For this reason, the most likely paths in your transitional probability matrix must result in frequently used words.
The cause of the network of similar words is "an existing deep correlation between frequency, similarity, and spatial vicinity of tokens within the VMS text" [Timm & Schinner 2020, p. 4]. Or, in other words, "all pages containing at least some lines of text do have in common that pairs of frequently used words with high mutual similarity appear" [Timm & Schinner 2020, p. 3].
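The neighbour-counting behind this network can be sketched in a few lines of Python: for each word type, count how many other types lie at edit distance 1. The token list below is a made-up toy, not a real transliteration, and each character is treated as one glyph (a real EVA analysis would need to tokenize multi-character glyphs like [ch] and [sh] first).

```python
from collections import Counter

def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (ca != cb))) # substitution
        prev = curr
    return prev[-1]

def similarity_network(tokens):
    """For each word type, count how many other types sit at edit distance 1."""
    freq = Counter(tokens)
    types = list(freq)
    neighbours = {t: sum(1 for u in types
                         if u != t and edit_distance(t, u) == 1)
                  for t in types}
    return freq, neighbours

# Toy token list standing in for an EVA transliteration (not real counts):
tokens = ["daiin", "daiin", "daiin", "dain", "aiin", "saiin", "chol", "chor", "chol"]
freq, neighbours = similarity_network(tokens)
print(freq["daiin"], neighbours["daiin"])  # → 3 3: the most frequent type also has the most neighbours
```

Even in this toy sample the stated correlation shows up: the highest-frequency type is also the best-connected node.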
(08-08-2024, 09:53 PM)Torsten Wrote: One of your questions was why the repetition counts for [ol] are lower than the counts for [qokeedy] and for [chol]. Regarding this question, it is interesting to look at folio 15v. On that folio not only does "oror" exist; there is also "oror or" and "or or oro r" immediately above each other on the first two lines. There are five instances of [oror], seven instances of [arar], even 15 instances of [olol], and two instances of [dydydy]. This means that for shorter vords like [or], [ol], [dy] it is also possible to combine two instances of a word like [dy] into a new word like [dydy] or [dydydy].

These are all impressive cases of repeating / looping. And yes, if we ignore spacing, it looks like there are well over 100 tokens of the glyph sequence [olol], which is clearly more than the 32 tokens of the glyph sequence [cholchol], based in both cases on a very hasty count. My observation was limited to repetitions of whole discrete words, which I'll admit is a distinction I don't make for the rest of my analysis.
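The distinction drawn here (counting whole space-delimited tokens versus counting a glyph sequence with spacing ignored) can be made concrete with a short sketch; the sample line is invented for the example, not a real transcription.

```python
def count_word_tokens(text: str, word: str) -> int:
    """Count whole, space-delimited tokens equal to `word`."""
    return sum(tok == word for tok in text.split())

def count_glyph_sequence(text: str, seq: str) -> int:
    """Count occurrences of `seq` after removing all spaces,
    allowing overlapping matches."""
    stripped = text.replace(" ", "")
    count = start = 0
    while (i := stripped.find(seq, start)) != -1:
        count += 1
        start = i + 1
    return count

# Hypothetical EVA-style line (made up for illustration):
line = "ol olol chol ol ol"
print(count_word_tokens(line, "olol"))     # → 1: only one discrete token
print(count_glyph_sequence(line, "olol"))  # → 4: spaces ignored, overlaps allowed
```

The gap between the two counts is exactly the gap between the "well over 100" glyph-sequence figure and a whole-word repetition count.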
(08-08-2024, 09:53 PM)Torsten Wrote: Another observation is that for longer words, most transitions result in similar words that remain recognizable. For instance, the similarity between [qokedy], [qokeedy] and [okeedy] is still obvious. For shorter words, the transition of a glyph automatically replaces a larger part of the word. If, for instance, [ol] transforms into [al], [or], or [kol], 50% or 33% of the glyphs are different. Therefore sequences like [or.ar.y.kar.ol.al] on f34r.P.15 or [tor.ol.dol.or] on f54r.P.10 are perhaps less eye-catching than sequences like [qokeedy.qokeedy.chey.qokeedy.qokedy] on f108.P.37.
(08-08-2024, 09:53 PM)Torsten Wrote: Does that mean it is necessary to assume different default loops for different pages?
Torsten > 09-08-2024, 03:36 PM
(09-08-2024, 12:54 PM)pfeaster Wrote: I definitely agree that there's a close connection between the patterns I was describing and the network you (and Schinner) have written about. Still, I want to be careful when drawing conclusions about what's causing what. You write here that the so-called "loop" properties are caused by the network (first paragraph), but then that the network is in turn caused by a "deep correlation between frequency, similarity, and spatial vicinity of tokens" (second paragraph). Since I'd consider "deep correlation between frequency, similarity, and spatial vicinity of tokens" to be a reasonably good description of the system I was describing, those two paragraphs strike me as forming a little loop of their own. So what came first, the chicken or the egg?
(09-08-2024, 12:54 PM)pfeaster Wrote: On the one hand, we have a network of similar words whose frequency correlates remarkably well with their degree of similarity to a few specific models.
On the other hand, we have a set of generative rules (involving glyph-by-glyph transitional probabilities) that would produce approximately that same set of words with approximately the same frequencies. Other models (e.g., some "word paradigms") may be able to do the same.
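Such generative rules can be sketched as a random walk over a glyph-transition graph. The transition table below is entirely hypothetical (the probabilities are made up for illustration, not derived from the manuscript), with '^' and '$' marking word start and end.

```python
import random
from collections import Counter

# Hypothetical transition table over single-character glyphs;
# the probabilities are invented for this sketch.
PROBS = {
    "^": {"c": 0.5, "d": 0.5},
    "c": {"h": 1.0},
    "h": {"o": 0.7, "e": 0.3},
    "e": {"o": 1.0},
    "o": {"l": 0.8, "r": 0.2},
    "d": {"a": 1.0},
    "a": {"i": 1.0},
    "i": {"i": 0.5, "n": 0.5},
    "n": {"$": 1.0},
    "l": {"$": 1.0},
    "r": {"$": 1.0},
}

def generate_word(probs, rng):
    """Walk the transition graph from '^' until '$' is reached."""
    glyph, out = "^", []
    while True:
        nxt = rng.choices(list(probs[glyph]),
                          weights=list(probs[glyph].values()))[0]
        if nxt == "$":
            return "".join(out)
        out.append(nxt)
        glyph = nxt

rng = random.Random(0)
sample = Counter(generate_word(PROBS, rng) for _ in range(10_000))
print(sample.most_common(5))  # high-probability paths dominate the word frequencies
```

Note how the output reproduces both effects under discussion: the likeliest paths yield the most frequent "words", and those frequent words differ from each other by single glyphs.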
pfeaster > 09-08-2024, 06:16 PM
(09-08-2024, 03:36 PM)Torsten Wrote: To construct the glyph-by-glyph transitional probability tables, it is necessary to count the frequency with which certain glyphs follow one another. These tables are thus derived directly from the text. However, given that the Voynich text exhibits variation, it would be essential to use different transitional probabilities for different pages/sections of the manuscript. If one assumes that a device was used to generate the text, then multiple devices would be required to account for these variations.
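The table construction described here can be sketched as follows. The mini-corpus is invented, and each character is treated as a single glyph, which is a simplification of any real EVA tokenization; '^' and '$' mark word boundaries.

```python
from collections import Counter, defaultdict

def transition_table(words):
    """Estimate glyph-by-glyph transition probabilities from a word list
    by counting which glyph follows which; '^' = word start, '$' = word end."""
    counts = defaultdict(Counter)
    for w in words:
        glyphs = ["^"] + list(w) + ["$"]
        for a, b in zip(glyphs, glyphs[1:]):
            counts[a][b] += 1
    # normalize each row of counts into probabilities
    return {a: {b: n / sum(c.values()) for b, n in c.items()}
            for a, c in counts.items()}

# Hypothetical mini-corpus standing in for one page's transliteration:
page = ["chol", "chor", "chol", "cheol"]
probs = transition_table(page)
print(probs["c"])  # → {'h': 1.0}
print(probs["h"])  # → {'o': 0.75, 'e': 0.25}
```

Building one such table per page or section, and comparing the rows, would be the direct way to test whether "multiple devices" are really needed or whether the per-page matrices differ only slightly.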
ReneZ > 10-08-2024, 02:27 AM
(09-08-2024, 12:54 PM)pfeaster Wrote: I definitely agree that there's a close connection between the patterns I was describing and the network you (and Schinner) have written about. Still, I want to be careful when drawing conclusions about what's causing what. You write here that the so-called "loop" properties are caused by the network (first paragraph), but then that the network is in turn caused by a "deep correlation between frequency, similarity, and spatial vicinity of tokens" (second paragraph). Since I'd consider "deep correlation between frequency, similarity, and spatial vicinity of tokens" to be a reasonably good description of the system I was describing, those two paragraphs strike me as forming a little loop of their own. So what came first, the chicken or the egg?
On the one hand, we have a network of similar words whose frequency correlates remarkably well with their degree of similarity to a few specific models.
On the other hand, we have a set of generative rules (involving glyph-by-glyph transitional probabilities) that would produce approximately that same set of words with approximately the same frequencies. Other models (e.g., some "word paradigms") may be able to do the same.
If I understand things correctly, the self-citation hypothesis holds that the network exists due to the dynamics of copying words with minor changes, guided by the subjective / aesthetic preferences of the writer (and hence non-random). Still, if we could identify a set of more concrete and specific rules, and had to account for its relationship with a network of words that just happened to follow those rules, I'd think Occam's razor would point to the rules causing the network rather than the network causing the rules.
Torsten > 10-08-2024, 03:26 AM
(09-08-2024, 06:16 PM)pfeaster Wrote: Or a device that's responsive to input, or a device that builds cumulatively on its previous output.
(09-08-2024, 06:16 PM)pfeaster Wrote: Could you list a few of the most deviant pages or bifolios that your work has shown to have the most locally distinctive word forms? We could then take a look at the matrices for them and see how -- and how far -- they differ from the norm.
Emma May Smith > 10-08-2024, 08:46 PM
pfeaster > 11-08-2024, 01:51 AM
(10-08-2024, 08:46 PM)Emma May Smith Wrote: I have a couple of questions about spaces, and wonder if you have any thoughts?
1. You mention the difference between Strong and Weak breakpoints, but I wonder if all breakpoints tend toward 100% in the right conditions? I mean, this should obviously be so, but I think that those conditions might be quite complex in some cases. Breakpoints aren't strictly predictable for some parts of the pattern, which is a striking contrast to the near 100% predictability of some transitions.
(10-08-2024, 08:46 PM)Emma May Smith Wrote: 2. Given this, are spaces part of the same system as the loops or something which interweaves with it?
I'm curious because the differences in token counts for words with small changes are sometimes used as evidence for the underlying system, but those counts depend on the presence of spaces. That is, [chol] and [daiin] and [choldaiin] are all different word types, but parts of the same loops.
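One way to quantify "breakpoint strength" in the sense discussed here: for each adjacent glyph pair, measure the fraction of its occurrences that are separated by a space. The lines below are invented for the example, and the glyph model is again one character per glyph.

```python
from collections import Counter

def breakpoint_strength(lines):
    """For each adjacent glyph pair (a, b), the fraction of occurrences
    where a space separates them; 'strong' breakpoints approach 1.0."""
    with_space, total = Counter(), Counter()
    for line in lines:
        entries = []  # (glyph, space_follows)
        for ch in line:
            if ch == " ":
                if entries:  # mark a space after the previous glyph
                    entries[-1] = (entries[-1][0], True)
            else:
                entries.append((ch, False))
        for (a, gap), (b, _) in zip(entries, entries[1:]):
            total[(a, b)] += 1
            with_space[(a, b)] += gap
    return {pair: with_space[pair] / n for pair, n in total.items()}

# Hypothetical EVA-style lines (made up for illustration):
lines = ["chol daiin chol", "chor daiin"]
strength = breakpoint_strength(lines)
print(strength[("l", "d")])  # → 1.0: 'l' followed by 'd' always spans a space here
print(strength[("o", "l")])  # → 0.0: 'ol' is never broken in this toy sample
```

Run over real transliterations, a table like this would show directly which transitions behave as near-deterministic breakpoints and which sit in the unpredictable middle ground the question describes.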
ReneZ > 11-08-2024, 01:52 AM