The Voynich Ninja (https://www.voynich.ninja) - Voynich Research - Analysis of the text (/thread-5196.html)
Uncertain spaces as evidence of verbose glyph pairs - kckluge - 01-01-2026

Anyone who's been around the Ninja (or Voynich MS-related discussions in general) knows that certain glyph pairs occur with unusually high frequency and contribute significantly to the low second-order conditional entropy of the text. To pick an obvious example, here are the 10 most frequent glyphs following Currier 'O'/EVA 'o' in running text lines in ZL_ivtff_1b.txt converted to Currier:

kgram:   OF      OE      OP      OR      O8      OB      O2      OC      OA      OX
(EVA):   ok      ol      ot      or      od      op      os      oe      oa      ockh
Rank:    1       2       3       4       5       6       7       8       9       10
Count:   5620    5242    3244    2531    2035    519     382     322     257     179
REFreq:  0.2609  0.2434  0.1506  0.1175  0.0945  0.0241  0.0177  0.0150  0.0119  0.0083
RECmFrq: 1.0000  0.7391  0.4957  0.3451  0.2276  0.1331  0.1090  0.0912  0.0763  0.0644

Note the steep drop from OR (11.75%) and O8 (9.45%) to OB (2.41%) and O2 (1.77%).

If certain glyph pairs go together as a unit, uncertain spaces before and/or after them may reflect an unconscious hesitation on the part of the scribe.
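For anyone who wants to reproduce a successor table like the one above, here is a minimal Python sketch. It is not the script used for this post: `successor_table` is a hypothetical helper, it assumes one character per glyph (as in Currier), and it counts pairs only inside words, which may or may not match how the counts above were made. REFreq is taken to be the count divided by all pairs starting with the given glyph, and RECmFrq the share of those pairs at that rank or lower, which is what the figures above suggest.

```python
from collections import Counter


def successor_table(words, first, top=10):
    """Rank the glyphs that follow `first` inside words.

    Assumption: each character of a word is one glyph (Currier-style).
    Returns rows of (pair, rank, count, rel_freq, rev_cum_freq), where
    rel_freq is count / all pairs starting with `first`, and
    rev_cum_freq is the share of those pairs at this rank or lower
    (so it is 1.0 at rank 1).
    """
    counts = Counter(
        w[i:i + 2]
        for w in words
        for i in range(len(w) - 1)
        if w[i] == first
    )
    total = sum(counts.values())
    rows = []
    remaining = 1.0
    for rank, (pair, n) in enumerate(counts.most_common(), start=1):
        freq = n / total
        if rank <= top:
            rows.append((pair, rank, n, round(freq, 4), round(remaining, 4)))
        remaining -= freq
    return rows


# Toy input only; a real run would use the words of the transcription.
for row in successor_table(["OFOE", "OE", "OR", "8AM"], "O"):
    print(row)
```

Ties in count come out in first-seen order, so equal-count pairs may be ranked differently than in the table above.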
Here are the 20 most frequent glyph pairs with an uncertain space after them:

kgram:   OE,     AR,     OR,     AE,     CO,     89,     C9,     SO,     AM,     AT,
(EVA):   ol,     ar,     or,     al,     eo,     dy,     ey,     cho,    aiin,   air,
Rank:    1       2       3       4       5       6       7       8       9       10
Count:   387     207     181     165     121     101     96      58      55      51
REFreq:  0.2043  0.1093  0.0956  0.0871  0.0639  0.0533  0.0507  0.0306  0.0290  0.0269
RECmFrq: 1.0000  0.7957  0.6864  0.5908  0.5037  0.4398  0.3865  0.3358  0.3052  0.2761

kgram:   S9,     AN,     4O,     P9,     O2,     F9,     C8,     ZO,     OP,     C2,
(EVA):   chy,    ain,    qo,     ty,     os,     ky,     ed,     sho,    ot,     es,
Rank:    11      12      13      14      15      16      17      18      19      20
Count:   34      25      24      20      20      19      19      18      14      14
REFreq:  0.0180  0.0132  0.0127  0.0106  0.0106  0.0100  0.0100  0.0095  0.0074  0.0074
RECmFrq: 0.2492  0.2313  0.2181  0.2054  0.1948  0.1843  0.1742  0.1642  0.1547  0.1473

...and here are the 20 most frequent glyph pairs with an uncertain space before them:

kgram:   ,SC     ,AM     ,ZC     ,FC     ,8A     ,AE     ,FA     ,OE     ,AR     ,89
Rank:    1       2       3       4       5       6       7       8       9       10
Count:   195     162     149     126     124     101     97      86      82      70
AllFreq: 0.0011  0.0009  0.0008  0.0007  0.0007  0.0006  0.0005  0.0005  0.0005  0.0004
REFreq:  0.0910  0.0756  0.0695  0.0588  0.0579  0.0471  0.0453  0.0401  0.0383  0.0327
RECmFrq: 1.0000  0.9090  0.8334  0.7639  0.7051  0.6472  0.6001  0.5548  0.5147  0.4764

kgram:   ,SO     ,4O     ,FS     ,OR     ,AN     ,PA     ,AT     ,PC     ,EF     ,AJ
Rank:    11      12      13      14      15      16      17      18      19      20
Count:   55      54      45      42      39      29      28      27      26      26
AllFreq: 0.0003  0.0003  0.0002  0.0002  0.0002  0.0002  0.0002  0.0001  0.0001  0.0001
REFreq:  0.0257  0.0252  0.0210  0.0196  0.0182  0.0135  0.0131  0.0126  0.0121  0.0121
RECmFrq: 0.4438  0.4181  0.3929  0.3719  0.3523  0.3341  0.3206  0.3075  0.2949  0.2828

None of those counts are huge given the total number of glyph pairs in the running text, but there is at least a weak signal with regard to some of the most obvious candidates like OE, OR, AE, AR, and the various word-end specific A<x> combos like AM, AN, AT, AJ.
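The counting behind tables like these can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind the numbers above: `uncertain_space_pairs` is a hypothetical helper, the input is assumed to be already stripped of locus tags, comments and other markup, each character is assumed to be one glyph (Currier-style), and `.` and `,` are assumed to mark a clear and an uncertain space respectively, as in the IVTFF transcription files.

```python
import re
from collections import Counter


def uncertain_space_pairs(lines):
    """Count glyph pairs directly next to an uncertain space.

    Assumptions: one character per glyph, '.' is a clear space and
    ',' is an uncertain space; all other markup is already removed.
    Returns two Counters: pairs followed by an uncertain space
    (keys like 'OE,') and pairs preceded by one (keys like ',SC').
    """
    pair_then_space = Counter()
    space_then_pair = Counter()
    for line in lines:
        # The capture group keeps the separators in the result, so
        # odd indices are separators and even indices are words.
        parts = re.split(r"([.,])", line)
        for i in range(1, len(parts) - 1, 2):
            if parts[i] != ",":
                continue
            left, right = parts[i - 1], parts[i + 1]
            if len(left) >= 2:
                pair_then_space[left[-2:] + ","] += 1
            if len(right) >= 2:
                space_then_pair["," + right[:2]] += 1
    return pair_then_space, space_then_pair


# Toy lines only, not taken from the transcription.
after_pair, before_pair = uncertain_space_pairs(["4OFOE,AR.SCC89", "OE,SC89,AM"])
print(after_pair.most_common())
print(before_pair.most_common())
```

Single-glyph words next to an uncertain space are skipped here since they don't contain a full pair; a different choice there would shift the counts slightly.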
Of course, the above results need to be taken with an appropriate grain of salt given disagreements between transcribers regarding whether something is a clear or uncertain space, or whether there is an uncertain space in a given position at all. Nevertheless, I thought it was worth throwing out there as something to think about.

Happy New Year to all readers & posters on the Ninja, and best wishes for a happy & healthy 2026.