14-12-2021, 10:17 PM
A few months ago I spent a while looking at first-order transitional probabilities in the VM -- e.g., if all we know is that the current glyph is [d], what's the likelihood of the next glyph being [o]? I wrote about this at the time, but I'd like to summarize one set of observations here because they've continued to tug at my curiosity and I'm not sure how to explore them any further.
Transitional probabilities are radically different for Currier A and Currier B -- so much so that there seems to be no value in working them out for the VM as a whole.
So let's start with Currier B, and also with the glyph [d]. What's the most probable sequence of glyphs to follow from that, ignoring spaces, as if we were using an extremely crude auto-complete algorithm?
[d] --> [y] (63.71%)
[y] --> [q] (28.11%)
[q] --> [o] (97.69%)
[o] --> [k] (30.31%)
[k] --> [e] group (41.86%); probability of that [e] group then being specifically [ee] (54.28%)
[ee] --> [d] (39.79%)
In other words, the most probable path forward turns out to be a closed loop: [qokeedyqokeedyqokeedy....]. Of course, this resembles a common repetitive pattern we actually see in Currier B. A single choice of alternate transition would typically lead to a familiar-looking "path" such as these:
[qokeey.qokeedy]
[qokaiin.okeedy]
[qokeedy.chedy]
[qolkeedy.qokeedy]
[qotedy.qokeedy]
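For anyone who wants to try this kind of "crude auto-complete" themselves, here is a minimal Python sketch. It assumes the text has already been split into glyph tokens (with [ch], [ee], [ii] and so on treated as single units), and it runs on a tiny made-up token list rather than a real transliteration. The function names and the toy data are illustrative only, and the percentages quoted above were not produced by this code.

```python
from collections import Counter, defaultdict

def transition_probs(glyphs):
    """Estimate P(next glyph | current glyph) from a flat token sequence."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(glyphs, glyphs[1:]):
        counts[cur][nxt] += 1
    return {
        cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for cur, c in counts.items()
    }

def greedy_path(start, probs, max_steps=50):
    """Follow the single most probable transition until a glyph repeats."""
    path = [start]
    seen = {start}
    cur = start
    for _ in range(max_steps):
        if cur not in probs:
            break  # dead end: no observed successor
        nxt = max(probs[cur], key=probs[cur].get)
        path.append(nxt)
        if nxt in seen:
            break  # the path has closed into a loop
        seen.add(nxt)
        cur = nxt
    return path

# Hypothetical toy data, spaces already removed: three [qokeedy] and one [qoteedy].
toy = ["q", "o", "k", "ee", "d", "y"] * 3 + ["q", "o", "t", "ee", "d", "y"]
probs = transition_probs(toy)
print(greedy_path("d", probs))
```

On real data the interesting part is the tokenization, which this sketch leaves entirely to the caller.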
If we try the same thing in Currier A, again starting with [d], we get:
[d] --> [a] (50.40%)
[a] --> [i] group (51.96%); probability of that [i] group then being specifically [ii] (75.52%)
[ii] --> [n] (94.80%)
[n] --> [ch] (21.25%)
[ch] --> [o] (45.67%)
[o] --> [l] (24.99%)
[l] --> [d] (21.92%)
Hence, a different closed loop: [daiincholdaiincholdaiincholdaiin.....]. But this time even some of the most probable transitions are still less than 22% probable: [n] --> [ch] and [l] --> [d]. And if we examine line-position statistics, the most probable transitions actually vary from point to point, so that there seem to be separate, non-overlapping [chol] and [daiin] regions. Thus, a more nuanced analysis might predict [cholcholchol...daiindaiindaiin] over the course of a line, or maybe something even more varied.
Torsten Timm describes three vord "series" -- a [daiin] series, an [ol] series including [chol], and a [chedy] series including [qokeedy] -- and finds that vords tend to be more common the more closely they resemble [daiin], [ol], or [chedy]. But within each Timm series, the vord most often found repeating identically is the specific one corresponding to the inferred loop sequence:
[daiin.daiin] ×13
[chol.chol] ×23
[qokeedy.qokeedy] ×19
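Counts like these can be tallied with something along the following lines. This is only a sketch: it assumes the vords come as one flat list in reading order, it ignores line and page breaks, and it counts a run of three identical vords as two repeats. The counts quoted above may well have been made under different conventions, and the sample string is made up.

```python
from collections import Counter

def adjacent_repeats(vords):
    """Count how often each vord is immediately followed by an identical vord."""
    return Counter(a for a, b in zip(vords, vords[1:]) if a == b)

# Hypothetical sample line, not taken from the manuscript.
sample = "daiin.daiin.chol.chol.chol.qokeedy".split(".")
print(adjacent_repeats(sample))
```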
All of which has led me to wonder whether Voynichese might default to some sort of looping pattern whenever there's minimal "signal" present, analogous to an unmodulated carrier signal. But I can't think of a good way to move from that vague notion to any more concrete kind of experiment, and I also worry that there's some circular logic in here somewhere. I don't *think* the commonness of specific vords such as [qokeedy] could itself be responsible for the patterns these vords seem best to exemplify -- but if it were, I suppose that would be one way to discount this line of speculation.
I'll also admit that first-order transitional probabilities don't have very good predictive power. I tried using them as a basis for generating random text and came up with this for Currier B (with spaces inserted wherever two adjacent glyphs most often have one):
qol.dy.dor.ol.Shey.or.Shokaiin.Shotalkar.chedy.Shy.chopcholkedy.Sheokeey.s.chcKhdy.chy.okeor.
odytey.odytodain.SheotShey.pchdy.keedy.dal.Shdy.Shetaiin.ol.ol.ody.dytchedy.qol.ol.Shekeedain.
Shedy.qol.l.chedy.dytar.olal.dy.qotey.qosal.cheokaiin.y.otchokchokeedain.cheey.y.pcholkar.Shar.
cheeotchedy.keedar.ain.cheeyty.Sheey.ol.chcKhoteey.l.dy.kaiin
That's not very good pseudo-Voynichese. Note in particular the frequent vords containing multiple gallows.
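A first-order generator of this kind can be sketched roughly as below. The assumptions: the input is a list of vords, each already split into glyph tokens; transitions are counted across vord boundaries as if spaces weren't there; and a space (written ".") is emitted whenever a given glyph pair was seen separated by a space more often than not. All names and the toy corpus are hypothetical, and this is not the code behind the sample text above.

```python
import random
from collections import Counter, defaultdict

def build_model(words):
    """Collect glyph-to-glyph counts plus per-pair spaced/joined tallies."""
    trans = defaultdict(Counter)  # glyph -> Counter of following glyphs
    spaced = Counter()            # (a, b) -> times a space separated them
    joined = Counter()            # (a, b) -> times they were adjacent in a vord
    prev = None
    for word in words:
        for i, g in enumerate(word):
            if prev is not None:
                trans[prev][g] += 1
                (spaced if i == 0 else joined)[(prev, g)] += 1
            prev = g
    return trans, spaced, joined

def generate(trans, spaced, joined, start, n, rng):
    """Sample up to n glyphs, each conditioned only on the previous glyph."""
    out = [start]
    cur = start
    for _ in range(n - 1):
        options = trans.get(cur)
        if not options:
            break  # no observed successor for this glyph
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if spaced[(cur, nxt)] > joined[(cur, nxt)]:
            out.append(".")
        out.append(nxt)
        cur = nxt
    return "".join(out)

# Toy corpus with a single repeated vord, so the output is fully determined.
toy_words = [["q", "o", "k", "ee", "d", "y"]] * 3
model = build_model(toy_words)
text = generate(*model, start="q", n=12, rng=random.Random(0))
print(text)
```

Because each glyph depends only on the one before it, nothing in this model stops a second gallows from turning up shortly after the first, which fits the multiple-gallows vords noted above.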
But if we advance to second-order transitional probabilities, the [qokeedy] and [chol/daiin] loops persist, and the results of generating random text start to feel a little more plausible (to me, at least):
ol.qokeodar.ar.okaiin.Shkchedy.Shdal.qotam.ytol.dal.cheokeedy.chkal.Shedy.qokair.odain.al.ol.daiin.
cheal.qokeeey.lkain.chcPhedy.kchdy.cheey.otar.cheor.aiin.Shedy.dal.dochey.opchol.okchy.Sheoar.ol.
oeey.otcheol.dy.chShy.lkar.ain.okchedy.l.chkedy.oteedar.ShecKhey.okaiin.chor.olteodar.okal.qokeShedy.
ol.ol.Sheey.kain.cheky.chey.chol.chedy
With these randomly generated text examples, I don't mean to imply I favor a stochastic-process solution -- I'm just trying to illustrate how well or poorly a relatively simple transitional-probability model fits what we're used to Voynichese looking like. These examples aren't based on any "word structure" model as such. They also ignore all line-position patterning.
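For comparison, a second-order version conditions each glyph on the two glyphs before it. The sketch below makes the same assumptions as any toy example: pre-tokenized glyphs, made-up data, hypothetical function names. It leaves out space insertion to keep the core idea visible.

```python
import random
from collections import Counter, defaultdict

def second_order_counts(glyphs):
    """Count how often each glyph follows each ordered pair of glyphs."""
    counts = defaultdict(Counter)
    for a, b, c in zip(glyphs, glyphs[1:], glyphs[2:]):
        counts[(a, b)][c] += 1
    return counts

def generate_second_order(counts, seed_pair, n, rng):
    """Sample glyphs conditioned on the previous two, up to n in total."""
    out = list(seed_pair)
    while len(out) < n:
        options = counts.get((out[-2], out[-1]))
        if not options:
            break  # this pair was never followed by anything
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        out.append(nxt)
    return out

# Toy data: [o] is followed by [k] after [q], but by [l] after [ch].
toy = ["q", "o", "k", "ee", "d", "y", "ch", "o", "l"] * 3
counts = second_order_counts(toy)
print(generate_second_order(counts, ("ch", "o"), 6, random.Random(0)))
```

In the toy data a first-order model would see [o] --> [k] and [o] --> [l] as competing options regardless of context, while the second-order counts keep the two contexts apart.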
Apologies as always for any and all reinvented wheels.