The Voynich Ninja

Pages: 1 2 3 4 5 6 7

Hi Koen, thank you for proposing this interesting subject! In your first post you presented a few different ideas. Here are some considerations, but as always, take everything I write with a grain of salt: it's quite possible I made errors, miscounted things, etc.

The idea that i-sequences are minims does indeed seem to be the default. Final 'n' as part of an i-sequence is also widely accepted. For instance, the transliteration alphabets by Currier, Bennett (first study of character entropy), Glen Claston (v101), and Zandbergen (CUVA) all map the EVA sequences 'in' and 'iin' to individual characters. Basically, the exception is EVA; as Rene says: "Eva is not attempting to identify semantic units in the text. It simply represents in an electronic form the shapes that are seen in the MS. It is left to a later step by analysts to decide which combinations should be seen as units."

For e-sequences, the situation is less unanimous, but v101 and CUVA do map some cases to individual characters.

In my opinion, an interesting side of the subject is that words that differ only in 'iin' vs 'in' seem to behave quite similarly. E.g. here I have collected the 5 most frequent words occurring immediately adjacent to qokaiin and qokain, and those occurring in Shakespeare's works immediately adjacent to 'am' and 'an'.

[attachment=9161]

Even if the numbers in the Voynich text are small, the two sets overlap closely in Voynichese (while there is no overlap for English). Also note "qokeedy qokaiin" vs "qokedy qokain", though all four possible combinations occur:

qokedy qokain 6
qokedy qokaiin 2
qokeedy qokain 5
qokeedy qokaiin 5
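A count like the one above can be sketched in a few lines. This is only a minimal illustration, assuming the text is already available as a flat list of EVA words; the function name `adjacent_words` and the toy word list are hypothetical, and line or paragraph boundaries are ignored here, which a real count would probably want to respect.

```python
from collections import Counter


def adjacent_words(tokens, target, top=5):
    """Return the `top` most frequent words found directly before or after `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        if i > 0:
            counts[tokens[i - 1]] += 1  # word on the left
        if i + 1 < len(tokens):
            counts[tokens[i + 1]] += 1  # word on the right
    return counts.most_common(top)


# Toy token stream for illustration only, not real transliteration data.
sample = "qokeedy qokaiin chedy qokain qokedy qokain shedy qokaiin qokeedy".split()

near_aiin = adjacent_words(sample, "qokaiin")
near_ain = adjacent_words(sample, "qokain")

# Words that appear in both neighbour lists.
shared = {w for w, _ in near_aiin} & {w for w, _ in near_ain}
print(near_aiin, near_ain, shared)
```

Comparing the two top-5 lists as sets gives a rough measure of how similarly the two words behave; with counts this small it is an indication rather than a test.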

About the idea of 'qo' as 'yo'. To quote Stolfi: the "circle" letters a o y [...] are usually inserted between the other letters, as in qokeedy or okedalor. The insertion is strongly context-dependent, of course. As several people have observed, two circles in consecutive positions occur with abnormally low frequency - much less than implied by the frequencies of individual letters.

The way in which circles and non-circles alternate is one of the basic patterns in the rigid word-structure of Voynichese. The total number of circles in paragraph text is ~50,000. Consecutive circles occur about 500 times (1%), vs ~5000 occurrences of 'qo'. In my opinion, we should be careful before we trade the basic principle of circle/non-circle alternation in favour of a very frequent "yo" bigram in disguise.
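For anyone who wants to check these proportions on their own transliteration file, here is a minimal sketch. It assumes plain EVA words with a, o, y as the only "circle" letters; the function name `circle_stats` and the toy word list are hypothetical, and pairs spanning a word break are not counted.

```python
CIRCLES = set("aoy")  # the "circle" letters named in the quote above


def circle_stats(words):
    """Return (total circles, consecutive circle pairs, occurrences of 'qo')."""
    total = pairs = qo = 0
    for w in words:
        total += sum(ch in CIRCLES for ch in w)
        # Adjacent letter pairs inside the word where both letters are circles.
        pairs += sum(a in CIRCLES and b in CIRCLES for a, b in zip(w, w[1:]))
        qo += w.count("qo")
    return total, pairs, qo


# Toy words for illustration only.
words = ["qokeedy", "okedalor", "daiin", "oaiin", "chol"]
total, pairs, qo = circle_stats(words)
print(total, pairs, qo, pairs / total)
```

With the approximate figures quoted above, 500 / 50,000 gives the 1% rate for consecutive circles, against roughly ten times as many 'qo' bigrams.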

Thanks to RadioFM for linking Emma's post about [link]. In my opinion, that is a good example of evidence suggesting that some form of positional variation could be at play, even if it may not be apparent without a deeper analysis.

Hi Marco, thanks for your interesting reply. I agree that "iin" and "in" behave similarly, and for all we know they may be equivalent. It could be that the idea was just "and now there's a bunch of minims". But it could also be that the number of minims is important by itself, even though different numbers occur in similar contexts.

I was unaware of Stolfi's observation about round characters alternating with others. Yesterday, while I was looking at these things, it did seem to me almost as if "o" is used as a spacer between other characters or clusters. I had not yet come to the more complete insight.

I have no strong feelings about what EVA-q is. Stolfi's quote could be seen as an argument both against it and in favor of positional variation. It might be that the underlying "things" of EVA "a, y, o" are really averse to each other, and we cannot expect them or their variants together. But it might also be that their collision is exactly what triggers the use of a variant. In that case, they simply don't collide because they take a different shape when that happens. But again, I don't feel strongly about any of this.

Yes, everything is tricky, for a change. Stolfi aimed at describing things as they appear on the page. The observation that Voynichese has a rigid structure lends itself to be formally described as a grammar and that is what Stolfi very successfully did.

One can formally describe rewrite rules to take Voynichese closer to a hypothetical "underlying structure" which the writing system doesn't always represent directly. E.g. Emma did this with her hypothesis of [link].
Emma's post tries to make sense of [o] and [a/y] as corresponding to two entities with very similar roles in word-structure. Though there may be other explanations, an obvious parallel is phonetics and how vowels are largely interchangeable. Since Voynichese looks close to a pronounceable phonetic system (partly because of the circle/non-circle alternation), the needed adjustments are small. In this case, rewriting is largely driven by apparent structure, but the goal is not simply describing what is on the page, but reconstructing an "original".

A further step is making the hypothesis that structure suggests obfuscation, as with your hypothesis that circles do not repeat because such repetitions are hidden by a one-to-many (homophonic) cipher with specific rules based on context (if I understand correctly). In that case, inferring the underlying system becomes very hard, since one cannot rely on apparent structure (which largely results from rewriting, not from underlying structure). Verbose ciphers and/or nulls could probably fit into this kind of rewriting.

The final step is assuming that all the structure we see is entirely due to rewrite rules, with possibly no "underlying" system or meaning (Timm and Schinner).

One problem we may be facing is that we have not yet been able to establish a concrete, viable alternative to the extremes.

Simple(ish) substitution, as espoused by most theorists, is suitable for the historical context but not for the system of Voynichese. And nonsense generation (Timm, Rugg...) can be shown to suit the system, but is historically unattested at this scale. To my feeling, a wide chasm looms between these extremes, and we don't know very well what's in there. For example, we could try assuming positional variation to convert Voynichese to something like Roman numerals, but then what? As long as there is nothing to aim for, we cannot test any hypotheses.

That's a great (if depressing) summary of the state of the art.

I am afraid that the best compromise between "historically plausible" and "compatible with the text" is some form of automatic writing, as discussed by Gaskell and Bowern in their Malta paper.

The other options are both poor fits for the text (e.g. how to explain Patrick Feaster's and Tavi's recent findings?) and horribly difficult to investigate or just formulate, because the number of variables and unknowns is huge.

(06-09-2024, 03:30 PM)Koen G Wrote: Whenever I look at the nitty-gritty of Voynichese text, I can't help but feel like some glyphs should be (positional) variants of each other.

* Final glyphs have a flourish: EVA-s, EVA-n
* In a series of i-minims, the last one looks like EVA-n
* In a series of c-shapes, the last one looks like EVA-s
* EVA-q could be the preferred shape of EVA-y before o

There are also other things, like if you have a series of minims that is not preceded by [o], the first one looks like EVA-a. And I'm sure there's much more like this.
I'm also sure all of this has been remarked many times before. What I don't understand is that some of these features (like [n] being just another minim) have not been adopted as the default approach yet...

I agree with the idea that positional variants do exist.

The following grid is part of the larger grid or network of related Voynich words (see network_grid.txt). In my view, it supports the idea that certain glyphs are interchangeable depending on their position. The core of the grid is built around similar word types, such as what I refer to as the "aiin-series," with words ending in [ain], [aiin], and [aiiin], and the "ol-series," featuring words ending in [ol], [or], [om], [al], [ar], and [am]. (Note: There are also some "hybrid" word types between these two series ending in [air], [ail], and [an].)

In my view, the grid supports the idea that certain glyphs are more likely to interchange in specific positions, while being less likely to do so in others. For example, [o] can interchange with [y] both at the beginning and end of a word. However, words starting with [o-] are much more common than those starting with [y-]. Similarly, while words ending in [o] do exist, there are significantly more words that end in [y] than in [o]. Additionally, [o] can interchange with [a] before [l] and [r], but not before glyphs like [s], [d], [k], or [t].

In my view, it is reasonable to propose certain positional rules as working hypotheses. For example, word-final [-o] and [-a] may become [-y], or word-initial [d-] and [ch-] may transform into a gallows glyph when preceded by prefixes like [o-], [y-], or [qo-]. This would suggest that [qokal], without the final [l], would transform into [qoky] rather than [qoka], and [chedal] would similarly change to [chedy], and not [cheda]. Likewise, adding a prefix [o-] to [daiin] would more likely result in [okaiin] rather than [odaiin]. (Note: There are always some rare word types that break these "rules". It is, for instance, possible to find words with [a] before [t] like [link] or words ending in [a] like [link]. Such exceptions typically do not follow the Curve-Line System, as a glyph ending with a line stroke, such as [a], should be followed by a glyph starting with a line stroke like [r], [n], [i], or [l]; see the [link] or "Die Harmonie der Glyphenfolgen" by [link].)
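To make the two working hypotheses concrete, they can be written as tiny rewrite functions. This is a sketch under loud assumptions: the names `drop_final_l` and `add_prefix` are hypothetical, only the examples given above are covered, [d-] is always replaced by [k] (the one gallows shown in the [okaiin] example), and the [ch-] case is left out because no worked example is given for it.

```python
def drop_final_l(word):
    """Remove a final [l]; if that exposes a final [a] or [o], turn it into [y]."""
    if not word.endswith("l"):
        return word
    stem = word[:-1]
    if stem.endswith(("a", "o")):
        stem = stem[:-1] + "y"  # hypothesis: word-final [-a]/[-o] surface as [-y]
    return stem


def add_prefix(prefix, word, gallows="k"):
    """Attach a prefix; an initial [d] becomes a gallows after [o-], [y-], [qo-].

    Assumption: the gallows is [k], as in the [okaiin] example; which gallows
    would really be chosen is not specified.
    """
    if prefix in ("o", "y", "qo") and word.startswith("d"):
        return prefix + gallows + word[1:]
    return prefix + word


print(drop_final_l("qokal"))      # qoky
print(drop_final_l("chedal"))     # chedy
print(add_prefix("o", "daiin"))   # okaiin
```

Writing the rules this way also makes the exceptions easy to list: any attested word that the functions can never produce (such as one ending in [a]) is a candidate rule-breaker.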

Code:
            d-           od-         qod-         qok-          ok-          yk-         k-         t-         yt-         ot-          qot-

aiiin ( 41) daiiin ( 17) odaiiin( 4) qodaiiin( 2) qokaiiin(  2) okaiiin(  4) ykaiiin( 1) kaiiin( 3) taiiin( 1) ytaiiin(--) otaiiin(  1) qotaiiin( 2) 

aiin  (469) daiin  (863) odaiin (60) qodaiin (42) qokaiin (262) okaiin (212) ykaiin (45) kaiin (65) taiin (42) ytaiin (43) otaiin (154) qotaiin (79)

ain   ( 89) dain   (211) odain  (18) qodain  (11) qokain  (279) okain  (144) ykain  (10) kain  (48) tain  (16) ytain  (13) otain  ( 96) qotain  (64) 

an    (  7) dan    ( 20) odan   ( 2) qodan   ( 2) qokan   (  8) okan   (  5) ykan   ( 1) kan   ( 3) tan   ( 1) ytan   (--) otan   (  5) qotan   ( 2)                        

air   ( 74) dair   (106) odair  ( 5) qodair  ( 3) qokair  ( 17) okair  ( 22) ykair  ( 8) kair  (14) tair  (13) ytair  ( 3) otair  ( 21) qotair  ( 6) 

ar    (350) dar    (318) odar   (24) qodar   ( 3) qokar   (152) okar   (129) ykar   (36) kar   (52) tar   (43) ytar   (26) otar   (141) qotar   (63)

ail   (  5) dail   (  2) odail  (--) qodail  (--) qokail  (  1) okail  (  1) ykail  ( 1) kail  ( 1) tail  (--) ytail  (--) otail  (  1) qotail  ( 1) 

al    (260) dal    (253) odal   (13) qodal   ( 7) qokal   (191) okal   (138) ykal   (16) kal   (23) tal   (20) ytal   (19) otal   (143) qotal   (59) 

am    ( 88) dam    ( 98) odam   ( 6) qodam   ( 3) qokam   ( 25) okam   ( 26) ykam   ( 5) kam   ( 9) tam   (--) ytam   (13) otam   ( 47) qotam   (12) 

os    ( 29) dos    (  1) odos   (--) qodos   ( 1) qokos   (  1) okos   (  8) ykos   (--) kos   ( 3) tos   ( 4) ytos   ( 1) otos   (  4) qotos   ( 1)      

or    (363) dor    ( 73) odor   ( 8) qodor   ( 2) qokor   ( 36) okor   ( 34) ykor   (10) kor   (26) tor   (23) ytor   (14) otor   ( 46) qotor   (29)            

ol    (537) dol    (117) odol   ( 2) qodol   ( 1) qokol   (104) okol   ( 82) ykol   (14) kol   (37) tol   (48) ytol   (12) otol   ( 86) qotol   (47)  

o     ( 81) do     ( 16) odo    (--) qodo    (--) qoko    (  9) oko    (  8) yko    (--) ko    ( 2) to    ( 2) yto    ( 5) oto    (  9) qoto    ( 3) 

y     (151) dy     (270) ody    (46) qody    (17) qoky    (147) oky    (102) yky    (18) ky    (25) ty    (16) yty    (24) oty    (115) qoty    (87) 

ey    (  1) dey    (  1) odey   ( 1) qodey   (--) qokey   (107) okey   ( 63) ykey   ( 8) key   (14) tey   (11) ytey   (13) otey   ( 57) qotey   (24) 

eey   (  3) deey   (  7) odeey  ( 2) qodeey  ( 2) qokeey  (308) okeey  (177) ykeey  (58) keey  (44) teey  (20) yteey  (28) oteey  (140) qoteey  (42)

eeey  (  1) deeey  (  1) odeeey ( 2) qodeeey (--) qokeeey ( 26) okeeey ( 27) ykeeey ( 6) keeey (11) teeey ( 1) yteeey ( 3) oteeey (  8) qoteeey ( 4)

chey  (311) dchey  ( 18) odchey ( 1) qodchey (--) qokchey ( 30) okchey ( 32) ykchey ( 6) kchey (21) tchey (22) ytchey (12) otchey ( 31) qotchey (19) 

chy   (155) dchy   ( 39) odchy  ( 1) qodchy  ( 2) qokchy  ( 69) okchy  ( 39) ykchy  (22) kchy  (29) tchy  (24) ytchy  (19) otchy  ( 48) qotchy  (63) 

shy   (104) dshy   (  8) odshy  (--) qodshy  (--) qokshy  ( 10) okshy  ( 10) ykshy  ( 2) kshy  ( 5) tshy  ( 5) ytshy  ( 3) otshy  (  4) qotshy  (10)
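A grid of this kind can be rebuilt from any word-frequency list by crossing a set of prefixes with a set of endings. The sketch below is only an illustration: `build_grid` and `format_grid` are hypothetical names, and the small `freq` dictionary simply reuses a few counts from the grid above rather than reading a transliteration file.

```python
def build_grid(freq, prefixes, suffixes):
    """Map each suffix to {prefix: count of prefix+suffix}, using 0 for unattested words."""
    return {s: {p: freq.get(p + s, 0) for p in prefixes} for s in suffixes}


def format_grid(grid, prefixes):
    """Render one text line per suffix, in the word(count) style of the grid above."""
    lines = []
    for suffix, row in grid.items():
        cells = [f"{p + suffix}({row[p]})" for p in prefixes]
        lines.append(" ".join(cells))
    return "\n".join(lines)


# A few counts copied from the grid above; everything else is treated as unattested.
freq = {
    "aiin": 469, "daiin": 863, "qokaiin": 262, "okaiin": 212,
    "ain": 89, "dain": 211, "qokain": 279, "okain": 144,
}
prefixes = ["", "d", "qok", "ok", "yt"]
suffixes = ["aiin", "ain"]

grid = build_grid(freq, prefixes, suffixes)
print(format_grid(grid, prefixes))
```

Unattested combinations come out as 0 here, where the grid above prints (--).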

(09-09-2024, 10:11 AM)Koen G Wrote: For example, we could try assuming positional variation to convert Voynichese to something like Roman numerals, but then what?

Codebook indices would be one way to use numbers.

And once you have numbers you can subject them to mathematical operations where the resultant number does have meaning, like in Greek Isopsephy or Hebrew Gematria.

I recently cooked up a method to convert plaintext and EVA to numbers, then convert between them. It didn't work well though: entropy too high, words too long.

A codebook scenario could work, since TTR stats are a bit more normal. In a 1 to 1 code though, we would still expect to see common phrases pop up (like "nomen herba est" or something like that). If I recall correctly, those are not obviously present.
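Both checks mentioned here are easy to run on any tokenized text. The following is a minimal sketch, with hypothetical function names and a toy token list: `type_token_ratio` computes TTR as distinct words over total words, and `repeated_ngrams` lists word sequences that occur more than once, which is where stock phrases in a 1-to-1 code would be expected to show up.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct word types divided by the number of tokens."""
    if not tokens:
        raise ValueError("empty token list")
    return len(set(tokens)) / len(tokens)


def repeated_ngrams(tokens, n=3, min_count=2):
    """Return {n-gram: count} for word sequences seen at least `min_count` times."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {g: c for g, c in grams.items() if c >= min_count}


# Toy token stream for illustration only.
tokens = "daiin ol chedy daiin ol chedy qokain".split()
ttr = type_token_ratio(tokens)
repeated = repeated_ngrams(tokens, n=3)
print(ttr, repeated)
```

TTR depends strongly on text length, so values are only comparable between samples of the same number of tokens.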

(09-09-2024, 09:00 PM)Koen G Wrote: A codebook scenario could work, since TTR stats are a bit more normal. In a 1 to 1 code though, we would still expect to see common phrases pop up (like "nomen herba est" or something like that). If I recall correctly, those are not obviously present.

Another challenge for the codebook theory is explaining the shift from Currier A to Currier B. To account for this change, one would have to assume the use of different codebooks for different sections of the text. Additionally, since word types typical of Currier A also appear in Currier B, but not vice versa, it would require continuously adding new code words. Furthermore, in natural languages, frequent words—such as function words like conjunctions and articles—are evenly distributed throughout a text. This would necessitate using multiple code words for each function word. But what reason would someone in the Middle Ages have to adopt such a laborious method for encoding a whole book?

If we assume that the text contains a hidden message, the steganography hypothesis would provide a more plausible explanation. In theory, any text could be used to conceal a hidden message. For example, one could track the frequency of specific glyphs within a section of text, such as counting the number of curved glyphs between two gallows or tallying the occurrences of a glyph like EVA-[i] within a line. However, the main drawback of this approach is that, even in an entire book of text, it would only be possible to hide a relatively short message.
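The two counting schemes given as examples can be sketched as follows. Everything here is hypothetical and purely illustrative: the glyph classes are assumptions (k, t, p, f as gallows; e, o, y, d, s as "curved" glyphs), the function names are made up, and nothing suggests that the manuscript actually encodes anything this way.

```python
GALLOWS = set("ktpf")   # assumption: plain EVA gallows only
CURVED = set("eoyds")   # assumption: one possible choice of "curved" glyphs


def curved_between_gallows(text):
    """Count curved glyphs between each pair of consecutive gallows."""
    counts = []
    current = None  # stays None until the first gallows is seen
    for ch in text:
        if ch in GALLOWS:
            if current is not None:
                counts.append(current)
            current = 0
        elif current is not None and ch in CURVED:
            current += 1
    return counts


def glyph_count_per_line(lines, glyph="i"):
    """Tally occurrences of one glyph in each line."""
    return [line.count(glyph) for line in lines]


print(curved_between_gallows("qokeedy otedy qokain"))
print(glyph_count_per_line(["qokaiin daiin", "chol"]))
```

Each gallows-to-gallows stretch or each line yields a single small number, which illustrates the capacity problem noted above: a whole book of text carries only a short sequence of such values.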

I feel hesitant about steganography because it would require two unattested scenarios to be true: large scale nonsense text generation and a novel encoding system embedded therein. Although I guess if whoever made this was capable of one, they'd also be capable of the other.

Some of the problems you point out could be solved by glyph variants. For example, it could be that only the minim matters, not what type of extra part is attached to it. (I don't know if this particular change would solve anything, it's just a made-up example.) This would turn the burden of multiple codes into a matter of esthetic scribal preference.


MarcoP

Koen G

MarcoP

Koen G

MarcoP

Torsten

RobGea

Koen G

Torsten

Koen G