(04-10-2021, 12:35 AM)Koen G Wrote: I find them extra interesting though, since like Voynichese, Roman numerals tend to sort glyphs in a specific order. This is especially true if you avoid subtractive notation (so write IIII instead of IV).
That's true. If we analyze the situation with Roman numerals, the basic reason for the sorting is of course that numbers are written in order by place value. Even though Roman numerals themselves aren't a place-value notation, they were integrated into systems of counting and calculation that centered on place-values, such as finger-counting on two hands, counting-boards based on columns with different place values, and the Latin words for numbers (e.g., "tria milia trecenti triginta tres"). With additive notation, the units will be written only with V and I; the tens only with L and X; the hundreds only with D and C; with, in each case, a maximum of one token of the first glyph type followed by a maximum of four tokens of the second glyph type. With subtractive notation, we sometimes also find a single I, X, or C preceding a higher-value numeral. But the number can still be parsed in terms of descending place values, e.g., MCDXCII = M / CD / XC / II.
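To make the place-value point concrete, here is a minimal Python sketch of additive notation as described above: one group per place value, each with at most one "five" glyph followed by at most four "one" glyphs. The names `PLACES`, `additive_groups`, and `is_descending` are mine, invented for illustration, and the 1..4999 range limit is an arbitrary assumption of the sketch.

```python
# (unit value, "five" glyph, "one" glyph) for each place, highest first.
# Thousands have no "five" glyph in this sketch.
PLACES = [(1000, "", "M"), (100, "D", "C"), (10, "L", "X"), (1, "V", "I")]
ORDER = "MDCLXVI"  # glyphs ranked from highest to lowest value


def additive_groups(n):
    """Return one additive-notation group per place value, highest first."""
    if not 0 < n < 5000:
        raise ValueError("this sketch only covers 1..4999")
    groups = []
    for unit, five, one in PLACES:
        digit, n = divmod(n, unit)
        if five and digit >= 5:
            groups.append(five + one * (digit - 5))
        else:
            groups.append(one * digit)
    return groups


def is_descending(numeral):
    """True if the glyphs never step back up to a higher-value glyph."""
    ranks = [ORDER.index(glyph) for glyph in numeral]
    return ranks == sorted(ranks)


print(additive_groups(1492))                          # ['M', 'CCCC', 'LXXXX', 'II']
print(is_descending("".join(additive_groups(1492))))  # True
print(is_descending("MCDXCII"))                       # False
```

The last line shows why subtractive notation is the interesting case: the glyphs of MCDXCII are no longer strictly sorted, even though the number still parses as M / CD / XC / II by place value.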
Subtractive forms that would violate this rule, such as ID for 499 or IIX for 8, seem always to have been nonstandard and rare, even though they can be found all the way back to Classical Antiquity. They do, however, suggest one mechanism by which glyphs that usually appear at the end of a "word" could also appear at the beginning, even before glyphs that ordinarily fill a "higher" slot.
When numbers are written in Roman numerals separately from one another -- say, in columns listing folio numbers, or inserted into texts (e.g., "xii folia") -- there's no need to disambiguate beginnings and endings. But I don't have a good sense for how situations were handled in which multiple numbers would have been written next to each other in a line. My understanding is that there wasn't a compact notation equivalent to, for example, 4+5, but I don't know whether there might have been some convention equivalent to comma-separated values, either in the era of scriptio continua or later.
If we consider a straightforward substitution cipher in which each plaintext letter is represented by an integer expressed in Roman numerals, along the lines of the "pen test" shown by J. K. Petersen, it's not obvious how to proceed. According to the "pen test" key, HELLO would, I suppose, be enciphered as xii ix xv xv xviii. Obviously the spacing here is crucial; xiiixxvxvxviii would be illegible. The first pair of letters could be clearly separated by writing them xij ix. Presuming final i were always written j, the other numbers could only be divided in the specific way they are right up to the last one: xviii. Since final x and v have no distinct forms (as far as I know), this could be parsed as x viii, x v iii, xviii, or xv iii.
But I notice that the "pen test" key shown by Petersen doesn't assign iii, v, or viii to anything, which could perhaps rule out every option but xviii. It's also missing a few letters (V, X, Y, Z), so I'm not sure the key is complete as given, but the fact that D = xxviii (28) and P = xxix (29) would be consistent with certain numbers having been strategically skipped. I hesitate to spend too long analyzing this cipher arrangement without knowing its source or whether it's complete, but it looks as though whoever designed it was making a conscious effort to minimize parsing ambiguities. That's not to say there wouldn't be any ambiguity -- F and R are both assigned to xx (20), and I and T are both assigned to xxiii (23) -- but that's not a *parsing* ambiguity. Assigning x to Q was clever, since a reader could assume that (for example) xxv is A rather than QL, and that xxviij is D rather than QO, even if everything were run together. If there's more to the key than shown, all this analysis could be incorrect, of course. But the design quirks in the "pen test" cipher as shown do seem consistent with an effort to avoid parsing ambiguities.
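The parsing argument can be checked mechanically. The Python sketch below lists every way a run-together string can be split into numerals from an allowed set, with final i written as j. It rests on assumptions of mine: `to_roman` and `segmentations` are hypothetical helper names, the "every number" set assumes standard subtractive forms (iv, ix) for 1 to 29, and `PARTIAL_VALUES` holds only the numbers quoted in this post, not Petersen's full key.

```python
def to_roman(n):
    """Lowercase Roman numeral (standard subtractive forms), final i as j."""
    out = ""
    for value, glyphs in [(10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]:
        while n >= value:
            out += glyphs
            n -= value
    return out[:-1] + "j" if out.endswith("i") else out


def segmentations(text, allowed):
    """List every way to split `text` into numerals drawn from `allowed`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in allowed:
            for rest in segmentations(text[end:], allowed):
                results.append([head] + rest)
    return results


# Assumption: a key that uses every number from 1 to 29.
every = {to_roman(n) for n in range(1, 30)}

# Assumption: only the values quoted in this post (vi, vii, ix, x, xii, xv,
# xviii, xx, xxiii, xxv, xxviii, xxix); the real key may contain more.
PARTIAL_VALUES = [6, 7, 9, 10, 12, 15, 18, 20, 23, 25, 28, 29]
partial = {to_roman(n) for n in PARTIAL_VALUES}

print(segmentations("xviij", every))     # four parsings
print(segmentations("xviij", partial))   # [['xviij']]
print(segmentations("xxviij", partial))  # [['x', 'xviij'], ['xxviij']]
```

With every number in play, xviij splits four ways, matching the list above. With only the quoted values it splits one way. The xxviij case still yields two parsings (QO or D), which is where the reader's assumption about the rarity of Q has to do the work.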
If parsing ambiguities were recognized as a problem for such numerical ciphers, another way to eliminate them would have been to add a custom flourish after each number, e.g., xv'xviii' = LO, perhaps also with different flourishes for word and syllable breaks, e.g., xv'xviii,xxv'vii'xxviii,vi'ix"xii'xviii'xv'xxviii, = LO AND BE-HOLD. Or one could switch between alternate sets of numeral glyphs. Or, as Petersen suggests, insert nulls as dividers. If there are other early examples of simple numerical substitution ciphers, and especially ones that were actually put into practice, it would be interesting to learn whether they made any provisions like these. It's hard to imagine them not being necessary.
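As a rough illustration of the flourish idea, here is a small Python sketch that decodes a string in which every numeral is closed by a mark: `'` after an ordinary number, `"` at a syllable break, and `,` at a word break. The marks are just the ones I improvised above, `KEY`, `BREAKS`, and `decode` are hypothetical names, and the key holds only the letter values used in this example, with D written xxviii as the key gives it.

```python
# Assumption: only the numeral-to-letter pairs needed for this example.
KEY = {
    "xxv": "A", "vi": "B", "xxviii": "D", "ix": "E",
    "xii": "H", "xv": "L", "vii": "N", "xviii": "O",
}

# Each mark closes a numeral; two of them also add a break to the plaintext.
BREAKS = {"'": "", '"': "-", ",": " "}


def decode(cipher):
    """Decode flourish-terminated numerals; unknown numerals become '?'."""
    out, numeral = [], ""
    for ch in cipher:
        if ch in BREAKS:
            out.append(KEY.get(numeral, "?") + BREAKS[ch])
            numeral = ""
        else:
            numeral += ch
    # A trailing numeral with no closing mark is silently dropped.
    return "".join(out).strip()


print(decode("xv'xviii,xxv'vii'xxviii,vi'ix\"xii'xviii'xv'xxviii,"))
# LO AND BE-HOLD
```

Because every numeral carries its own terminator, the decoder never has to guess where one number ends and the next begins, which is the whole point of the flourish.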
As with so many models, this seems so close to Voynichese in some ways, and yet so far from it in others. One difference may be that an individual Roman numeral is unidirectional, starting with the higher places and moving towards the lower ones. Voynichese text is usually continuous -- broken into vords, but (except for labels) with other vords before and after them, with lots of statistical interdependencies across the vord breaks. It often seems to behave more like a punctuated loop, where glyphs have a strongly preferred sequence (e.g., qokedyqokedyqokedyqokedy), and where vord breaks help break the text into visually manageable chunks at a more or less consistent point within the loop. If Voynichese were organized in terms of a punctuated loop, then the "beginnings" and "ends" of words wouldn't be at opposite ends of anything -- they'd be the *closest* positions to one another.
(04-10-2021, 12:35 AM)Koen G Wrote: None of this explains the behavior of [y] though, since apart from mostly occurring at the final position, it is also common word-initially. We cannot say that it behaves like a capital letter, because those tend to be limited to the first position. It also doesn't seem to function as a disambiguator - in words like [ytchor] and [ytaiin], what is there to distinguish?
In a punctuated loop, [y] could conceivably "sort" to the vicinity of the vord break, but sometimes falling on one side, sometimes on the other (or, in those cases where there happen to be two [y] in a row, with one on *either* side of the break). Without looking, would you guess [qokchod.ychear] or [qokchody.chear]? Or [dar.ytey] or [dary.tey]? Could it be the spacing that's inconsistent or variable, and not the role of [y] itself?