[split] Verbose cipher? - Printable Version

RE: [split] Verbose cipher? - davidjackson - 24-06-2022

The trouble with such stenographic shorthand is that it always develops through trial and error.
If you look at the history of such schemes (and there are a lot of them), each one starts off as something difficult and quickly evolves as proponents develop the system.
Now, in the Voynich, it seems there is just the one system, and it is fairly homogeneous throughout.
So if there is a "system" then it's fairly well developed. Which, especially if there are several scribes, suggests a well-developed scheme with a well-worked-out backstory. That is problematic, especially when you consider that, if this is true, it's centuries ahead of its rivals. And there was no need for such a scheme to be developed. Such schemes didn't start to get going for several centuries more, for the simple reason that society didn't need them.

RE: [split] Verbose cipher? - pfeaster - 25-06-2022

(24-06-2022, 08:47 PM)davidjackson Wrote: The trouble with such stenographic shorthand is that it always develops through trial and error.
If you look at the history of such schemes (and there are a lot of them), each one starts off as something difficult and quickly evolves as proponents develop the system.
Now, in the Voynich, it seems there is just the one system, and it is fairly homogeneous throughout.
So if there is a "system" then it's fairly well developed. Which, especially if there are several scribes, suggests a well-developed scheme with a well-worked-out backstory. That is problematic, especially when you consider that, if this is true, it's centuries ahead of its rivals. And there was no need for such a scheme to be developed. Such schemes didn't start to get going for several centuries more, for the simple reason that society didn't need them.

Stenotype shows some influence from older systems of manual shorthand, but the distinctive features I mentioned that remind me of Voynichese were new to it and would have been counterproductive before the use of mechanical typewriters. In manual shorthand, the goal was to reduce writing to the fewest number of pen-strokes consistent with legibility, so that writing initial G as TKPW or final J as PBLG would have been decidedly a step in the wrong direction.

By likening Voynichese to Stenotype, I don't mean to suggest that Voynichese was also stenographic. I'm just struck by apparent similarities in the structure of the writing itself. Stenotype seems to be weird in some of the same ways Voynichese is weird.

RE: [split] Verbose cipher? - MichelleL11 - 28-06-2022

(25-06-2022, 04:35 AM)pfeaster Wrote: By likening Voynichese to Stenotype, I don't mean to suggest that Voynichese was also stenographic. I'm just struck by apparent similarities in the structure of the writing itself. Stenotype seems to be weird in some of the same ways Voynichese is weird.

Hi, Patrick:

Thank you for bringing the Stenotype example to the board's attention. I agree with you that there are aspects of it that immediately appear parallel, such as the method of potentially "marking" the beginning of all paragraphs in a way that echoes a "real" letter but is, in all probability, decidedly not one. That is close to what is going on with the Voynich text, but, as I'm sure you agree, not quite . . .

There is also the use of groups of letters (some much longer than bigrams) in a way that appears to represent, perhaps, only a single letter of the plaintext. And consistent use of these groups of letters makes for a word structure that is likely, as you say, even more predictable than Voynichese. Again, that is a clear echo of the Voynich text structure, but also again, not quite . . .

The part where Stenotype and the Voynich text seem to diverge is that the motivation for the approaches in Stenotype is truly mechanical -- that is, there exists for each sound (or combination of sound and word position) a unique way of keying it, so that for a particular key format there would be a unique set of key hits upon recordation, which would produce output that is distinguishable on "decoding." Thus, the person keying in the information is able to encode and the person reading the paper strip is able to decode without confusion.

I really wish that there was such a mechanical basis for the Voynich text -- and there still may be something mechanical involved. But the strongest argument against the use of mechanical approaches is the repeated finding that not even a single "rule" is ever fully adhered to. There is always an exception.* These kinds of results suggest human-based, imperfect process steps rather than a chart or grille or wheel or some other mechanical setup. Of course, it could be a combination (allowing human "choices" within the realm of a mechanical process), but I am growing less certain of even this.

On the other hand, what I am becoming more and more convinced of is that, for the Voynich text, the whole idea of requiring a unique "set of keys" for each underlying sound or letter, and so producing output that is distinguishable for decoding, is just missing. Like a blind spot in the author(s)' minds. Relatively consistent letter forms and certainly particular glyph associations are really strong (also a requirement for a certain distinctive "outward appearance"**), sure, but a consistent system for the whole output is just not there. The existence of the Currier "languages" is proof enough for me that this idea was lacking.

Please note that I strongly believe this blind spot is completely understandable, if taken in the context of a world view where consistent spelling is completely optional -- to the point where within the same sentence a word will be spelled more than one way (consistency is simply not a strong point or can't even be said to be a goal). Perhaps even more tellingly, there is this cultural necessity of "earning" the right to be able to understand what is being written only through membership in the group that has been deemed "worthy" of reading the content. Thus, making a cipher that is very, very hard to decipher (even to the point of impossibility) could actually be an admirable part of the end result, as long as the "accepted" group can read it (and if you had a key, you could). If this is the case, the Voynich text has obviously accomplished the author(s)' goal very, very well. To our eternal frustration. It's a real bummer to not be part of the "in" crowd.

So, that being said, thanks again for the thought experiment. Onward with our continued attempts to find perhaps the right combination of a mechanical process mixed with human, error-prone decision-making that hides it -- probably not on purpose (I don't think the author(s) were actively seeking to make it this tough), but only because the need to do it any other way sat in a cultural blind spot.

Thanks,

Michelle

*And I don't think the commonly tossed in suggestion of "scribal error" is sufficient to explain these. The lack of consistency is somehow part of the process.
**which could just be the results of the other processes -- but it is so distinctive it would not shock me if the appearance is "on purpose."

Finally -- thank you for reading to the end of this little diatribe -- but it is nice to get these thoughts out of my head and recorded and maybe in front of others who are interested in considering.

RE: [split] Verbose cipher? - pfeaster - 31-12-2022

(28-06-2022, 05:01 AM)MichelleL11 Wrote: The part where Stenotype and the Voynich text seem to diverge is that the motivation for the approaches in Stenotype is truly mechanical -- that is, there exists for each sound (or combination of sound and word position) a unique way of keying it, so that for a particular key format there would be a unique set of key hits upon recordation, which would produce output that is distinguishable on "decoding." Thus, the person keying in the information is able to encode and the person reading the paper strip is able to decode without confusion.

I really wish that there was such a mechanical basis for the Voynich text -- and there still may be something mechanical involved. But the strongest argument against the use of mechanical approaches is the repeated finding that not even a single "rule" is ever fully adhered to. There is always an exception.*

Belated thanks for these comments, which I've been mulling over in odd moments since you wrote them.

I agree that any similarities between Voynichese and Stenotype are unlikely to be due to them sharing any similar mechanical motivation. Even if there was some kind of mechanical tool employed in composing Voynich text, it surely wouldn't have been anything like a twentieth-century stenographic keyboard.

With Stenotype, the encoding is made verbose so that each phonemic element can be represented by a combination of keys being pressed by up to four fingers at once, with each key leaving its own separate and distinct mark on the Stenotype paper (at least, before the advent of modern paperless digital systems). Some pairs of keys are arranged so that they can be pressed simultaneously with a single finger or thumb, but even so, the longest combinations assigned phonemic values seem to consist of four characters. The marks made by the keys are conventionally STKPWHRAO*EUFRPBLGTSDZ, always in that precise order, with each mark either present or absent (leaving a space) in each syllabic chord. The marks could, I suppose, have been entirely arbitrary, or represented by dots, and I assume they were given the forms of these specific letters mainly for convenience and as an aid to learning the system and reading the records.

The invariable order of the characters is due to the typewriting mechanism. However, if the order of some characters were somehow reversed -- even though I have no idea how that would happen -- this wouldn't affect legibility, even for the four characters that appear twice in the sequence (STPR), as long as they remained within the correct grouping: STKPWHR for initial consonants, AO*EU for vowels, or FRPBLGTSDZ for final consonants. All that really matters is that the initial-consonant, vowel, and final-consonant sections of chords are distinguishable (and if we were to substitute unique characters for STPR on the right-hand side, we could scramble the sequence however we like). If there were a handwritten variant of Stenotype, it might contain many "exceptions" -- for instance, hapax legomena that could be transformed into common words by reversing the order of a couple of characters.
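
To make the fixed-order point concrete, here is a minimal Python sketch of a chord row as it might sit on the paper strip: 22 columns, one per key, with a space wherever a key was not pressed. The function names `render_row` and `read_row` are made up for illustration, and the model knows nothing about real Stenotype theory beyond the key order and the three-way grouping described above.

```python
STENO_ORDER = "STKPWHRAO*EUFRPBLGTSDZ"

# Column ranges for the three groupings named in the post.
SECTIONS = {
    "initial": slice(0, 7),    # STKPWHR
    "vowel": slice(7, 12),     # AO*EU
    "final": slice(12, 22),    # FRPBLGTSDZ
}


def render_row(initial="", vowel="", final=""):
    """Build a 22-column row: each key prints itself if pressed, else a space."""
    pressed = {"initial": initial, "vowel": vowel, "final": final}
    row = ""
    for name, span in SECTIONS.items():
        for key in STENO_ORDER[span]:
            row += key if key in pressed[name] else " "
    return row


def read_row(row):
    """Read a row back as three unordered sets of marks, one per grouping."""
    if len(row) != len(STENO_ORDER):
        raise ValueError("a row must have exactly 22 columns")
    return {
        name: frozenset(row[span].replace(" ", ""))
        for name, span in SECTIONS.items()
    }
```

With these definitions, `render_row(initial="TKPW", vowel="O")` and a copy whose initial-consonant columns are shuffled to `WPKT` both read back as the same three sets, which is the sense in which order inside a grouping carries no information.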

So are there other motives, not involving a stenographic keyboard, that could have led someone to design a cipher that worked along similar lines?

I was much struck by Nick Pelling's "Fifteenth Century Cryptography Revisited," and particularly by its conclusion that the initial motive for increasing complexity among Italian ciphers was to thwart efforts to break them specifically by analyzing the final characters of words, and not to frustrate any more sophisticated form of frequency analysis.

One way to accomplish that end is, of course, to supply homophones for the plaintext letters that most often end words (typically vowels).

But another strategy would have been to provide separate cipher characters for letters at the ends of words and for those same letters elsewhere within words. That way, even if someone were to have figured out the cipher characters for [a], [e], [s], etc. by examining the final characters of words, that wouldn't have brought them any closer to deciphering any of the earlier parts of words.

The Archivio Sforzesco ciphertexts run all the words together with no spaces, which is of course another strategy for thwarting analysis of word-final characters, although it can also make the results harder for the intended recipient to read. But the cipher's simultaneous use of separate characters for VC combinations would, by itself, also have had the effect of enciphering most vowels differently at the ends of words than elsewhere within words, e.g.

L-A S-O-A S-ER-EN-IT-A E R-ES-T-AT-A C-ON-T-EN-T-A

"Ah, so you figured out [a], did you? Well, good for you, but that's not going to help you identify [a] anywhere else in a word -- ha ha!"

Still, the Archivio Sforzesco cipher isn't very easy to read (for me, at least) because there are so many different character forms to keep track of -- so many that it's annoying to have to search my key for them. There are some similarities among "related" ciphertext characters that make this a little easier, but no real rhyme or reason to the system as a whole. Good for security, I suppose, but bad for usability.
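
The VC grouping in the "L-A S-O-A ..." example above can be sketched in a few lines of Python. This is only a guess at the rule behind that one example: it assumes every vowel-plus-consonant pair inside a word gets its own cipher character, which the real key may well not do, and `vc_tokens` is a name invented here.

```python
VOWELS = set("AEIOU")


def vc_tokens(word):
    """Split an uppercase word into tokens, pairing each vowel with a following consonant."""
    tokens = []
    i = 0
    while i < len(word):
        is_vc = (
            word[i] in VOWELS
            and i + 1 < len(word)
            and word[i + 1] not in VOWELS
        )
        step = 2 if is_vc else 1
        tokens.append(word[i:i + step])
        i += step
    return tokens


phrase = "LA SOA SERENITA E RESTATA CONTENTA"
print(" ".join("-".join(vc_tokens(w)) for w in phrase.split()))
# prints: L-A S-O-A S-ER-EN-IT-A E R-ES-T-AT-A C-ON-T-EN-T-A
```

Note how the final A of RESTATA comes out as a token of its own, while the earlier A is absorbed into AT, so knowing the word-final character tells you nothing about the medial one.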

So let's imagine someone who shared the same concern as the Archivio Sforzesco folks with certain kinds of ciphers being too easy to crack, but who also dreaded the prospect of needing to write hundreds of different cipher characters distinctively from one another, and to look them up in reading by searching for them by shape in a gargantuan table. We can all empathize, yes?

One option would be to break the text into units equivalent to Stenotype chords with a three-part structure (initial consonant, vowel, final consonant). Particularly if the absence of a phoneme within a slot were marked in some way, the result could still have been equally good at disguising those telltale final letters of words, and there would have been many alternative ways to encipher a given plaintext besides, such that exact repetitions of long multiple-chord sequences would be unlikely (even if individual writers developed distinctive habits and preferences).

LA* SO* *A* SE* RE* NI* TA* *ER *ES TA* TA* CON TEN TA*
LAS *O* *AS *ER *EN *IT *A* *E* RES TAT *AC *ON TEN TA*

(Note the shorter complete and partial "repetitions," e.g., *ER *ES TA* TA*)
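
For anyone who wants to play with this, here is a rough Python sketch that lists every way of cutting a run-together text into three-slot chords. It assumes the simplest possible rules (at most one initial consonant, exactly one vowel, at most one final consonant, `*` for an empty slot); `chord_splits` is just a name I've made up for it.

```python
VOWELS = set("AEIOU")


def chord_splits(text, start=0):
    """Yield every division of text[start:] into [C]V[C] chords, '*' marking empty slots."""
    if start == len(text):
        yield []
        return
    i = start
    initial = "*"
    if text[i] not in VOWELS:
        # A consonant at the start of a chord can only be its initial.
        initial = text[i]
        i += 1
    if i >= len(text) or text[i] not in VOWELS:
        return  # no vowel available here, so this branch is a dead end
    vowel = text[i]
    i += 1
    # Option 1: leave the final-consonant slot empty.
    for rest in chord_splits(text, i):
        yield [initial + vowel + "*"] + rest
    # Option 2: take the next consonant as this chord's final.
    if i < len(text) and text[i] not in VOWELS:
        for rest in chord_splits(text, i + 1):
            yield [initial + vowel + text[i]] + rest


text = "LASOASERENITAERESTATACONTENTA"
splits = [" ".join(s) for s in chord_splits(text)]
print(len(splits))  # prints: 256
```

Both of the divisions written out above are among the 256 results. The freedom comes entirely from single consonants sitting between two vowels, each of which can attach either way; clusters such as ST or NT are forced to split across two chords.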

The divisions between chords are technically optional, but they're not arbitrary and do help somewhat with parsing. Perhaps very short chords that can easily be parsed at a glance can be appended to the preceding (or following) chords if desired, something like this:

LAS *O*AS *ER *EN *IT *A*E* RES TAT *AC *ON TEN TA*

Within a line, a combination such as [ST] would typically be split across two chords. However, if there's any impulse to avoid splitting words across line breaks (which I'll admit there often seems not to have been), it's easy to imagine certain adjacent combinations of cipher glyphs occurring almost exclusively at the beginnings and ends of lines (or at the beginnings and ends of "labels" and other shorter snippets of isolated text).

In this same scenario, it would be advantageous to encipher chord-initial consonants differently from chord-final consonants, partly to help make the chord structure clear to the reader, and partly to limit the payoff for any outsider who could figure out any word-final consonants as a crib. In this case, when we write [TAT], for example, the first and second tokens of [T] should be drawn from different parts of the key. To accomplish this, we could just increase the quantity of ciphertext glyphs -- but remember, that's precisely what our hypothetical cipher-designer is trying to avoid. Or we could instead turn to a pair of verbose keys.
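
As a toy illustration of what a pair of verbose keys could look like, here is a short Python sketch. Every code in the three tables is invented purely for this example and comes from no historical key; the only point is that chord-initial and chord-final consonants are looked up in different tables, and that empty slots get a code too.

```python
# All codes are invented for illustration; "*" stands for an empty slot.
INITIAL_KEY = {"T": "41", "C": "63", "R": "27", "*": "90"}
VOWEL_KEY = {"A": "o", "E": "a", "O": "y"}
FINAL_KEY = {"T": "ra", "N": "di", "S": "lk", "*": "zz"}


def encipher_chord(chord):
    """Encipher one three-slot chord, using a different key for each slot."""
    if len(chord) != 3:
        raise ValueError("a chord needs exactly three slots, e.g. 'TAT' or '*ON'")
    initial, vowel, final = chord
    return INITIAL_KEY[initial] + VOWEL_KEY[vowel] + FINAL_KEY[final]


def encipher(chords):
    """Encipher a space-separated sequence of chords."""
    return " ".join(encipher_chord(c) for c in chords.split())


print(encipher("TAT *ON"))  # prints: 41ora 90ydi
```

The two tokens of [T] in TAT come out as `41` and `ra`, so working out one of them gives an outsider no purchase on the other, and each three-slot chord has grown to five characters without any increase in the number of distinct glyphs.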

(28-12-2022, 09:58 PM)Mark Knowles Wrote: In a 1440 cipher key almost all substitutions are for Arabic numerals, which makes it a verbose cipher. However more interestingly a 1424 cipher has many substitutions for Roman Numerals; I noted when Claire Bowern did one of her presentations or wrote in one of her papers that she stated that Roman Numeral substitutions would produce many of the statistics found in the Voynich. So I think there is definitely some contemporary precedent with the Voynich manuscript for the use of verbose ciphers, which a Roman Numeral substitution cipher is.

So perhaps [T] might be enciphered [52] at the start of a chord but [eb] at the end of a chord. One advantage would be that individual chords would become longer and more deceptively "word-like" in length. Another advantage might be that certain glyphs or glyph combinations occur only at the start or end of syllabic chords, and so need to appear in only one or the other key, which could be helpful if we throw substring substitutions into the mix:

(28-12-2022, 09:58 PM)Mark Knowles Wrote: By substring substitutions I mean substitutions for common word substrings. Using the English language as an example (the plaintext of the 1424 cipher is naturally Latin):

"str" is a common start to some words in English. It is not an English word in and of itself, but a common part of a word.

"ing" is a common end to some words in English. It is not an English word in and of itself, but a common part of a word.

So consider:

"str" -> XIV
and
"ing" -> 45
then
"string" -> XIV45

The loss of information about word breaks might cause confusion, so it might be helpful to have some special means of showing where they go, especially for cases in which a word break occurs in mid-chord.

LA\* SO* *A* \SE* RE* NI* TA* \*E\R *ES TA* TA* \CON TEN TA*

LA\S *O* *A\S *ER *EN *IT *A* \*E* \RES TAT *A\C *ON TEN TA*

But what if a word break were to occur not just in mid-chord, but in mid-substring? Say we're enciphering the phrase "THE BEST RING" using Mark's English-language example quoted above. Perhaps "ST_RING" could be written [XI\V45], even though "XIV" functions otherwise as a unitary whole. Or, if necessary, the word-break sign could even be inserted into a single cipher glyph representing multiple plaintext characters. Remind you of anything?
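
Here is a rough Python sketch of that last idea, using only the two substitutions from Mark's example ("str" -> XIV, "ing" -> 45). Everything else is my own assumption for the sake of the demonstration: letters with no substitution pass through unchanged, the text is run together before substitution, and a word break falling inside a substring is placed at the same character offset inside the cipher group. The name `encipher_stream` is invented.

```python
SUBSTRINGS = {"str": "XIV", "ing": "45"}  # the two substitutions from the quoted example
BREAK = "\\"                              # word-break sign, as in the chord examples


def encipher_stream(words, table=SUBSTRINGS, mark=BREAK):
    """Run lowercase words together, substitute substrings, and mark the word breaks."""
    stream = "".join(words)
    # Plaintext offsets at which a new word begins (the first word needs no mark).
    breaks = set()
    pos = 0
    for word in words[:-1]:
        pos += len(word)
        breaks.add(pos)
    keys = sorted(table, key=len, reverse=True)  # try longer substrings first
    out = []
    i = 0
    while i < len(stream):
        if i in breaks:
            out.append(mark)
        match = next((k for k in keys if stream.startswith(k, i)), None)
        if match is None:
            out.append(stream[i])  # assumption: unlisted letters pass through
            i += 1
            continue
        group = table[match]
        # Assumption: a break inside the substring lands at the same offset in the group.
        inner = [k for k in range(1, len(match)) if i + k in breaks]
        for k in reversed(inner):
            cut = min(k, len(group))
            group = group[:cut] + mark + group[cut:]
        out.append(group)
        i += len(match)
    return "".join(out)


print(encipher_stream(["the", "best", "ring"]))  # prints: the\beXI\V45
```

So "string" on its own gives XIV45, while "best ring" gives the broken group XI\V followed by 45, with the break sign sitting inside what is otherwise a unitary whole.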

Perhaps at the start of a line the writer favors CV units but over time tends to be forced by circumstances to switch to VC -- one possible mechanism for producing subtler kinds of line-positional patterning.

LA* SO* *AS *ER *EN *IT *A* *ER *ES.....

Anyhow, this is the kind of mechanism Stenotype was suggesting to me, and it doesn't seem so extraordinarily far outside the mainstream of early fifteenth-century cipher development as to count as a wholly out-of-the-blue innovation, even if it has some unattested details. Or perhaps Mark has already run across a cipher key that would (or could) work exactly as I've outlined here. His work seems extremely valuable, and I don't for a moment doubt its potential relevance for Voynichology.

I know that EVA is designed to be "almost pronounceable," and that, in that sense, it tends to produce superficially multisyllabic "words" such as [qokeedy]. But I'd be very curious whether anyone has -- or can devise -- a scheme in which each Voynichese "word" comes out as a pronounceable monosyllabic unit equivalent to a Stenotype chord, without lots of implausible consonant clusters and such.