The Voynich Ninja
Voynich text generation - Printable Version



RE: Voynich text generation - bi3mw - 14-04-2019

@nablator: No problem, take your time.


RE: Voynich text generation - Koen G - 14-04-2019

Interesting, JKP. Did you respect word boundaries?

So one null is your main deviation from one-to-one? I suspect o is a likely candidate because you can use it for padding, breaking up illegal clusters.


RE: Voynich text generation - geoffreycaveney - 14-04-2019

(14-04-2019, 11:09 PM)Koen G Wrote: Interesting, JKP. Did you respect word boundaries?

So one null is your main deviation from one-to-one? I suspect o is a likely candidate because you can use it for padding, breaking up illegal clusters.

I'm guessing [e] is the null, because it occurs as a double [ee] four times, and only as a single [e] five times in the text, which is very strange. But we shall see.

Geoffrey


RE: Voynich text generation - -JKP- - 15-04-2019

(14-04-2019, 11:09 PM)Koen G Wrote: Interesting, JKP. Did you respect word boundaries?

So one null is your main deviation from one-to-one? I suspect o is a likely candidate because you can use it for padding, breaking up illegal clusters.

Yes, one null is my main deviation. There is also one that might be a slight fudge, but not too much (there is a rationale behind it).

I really don't think of Voynich units as words. I never have. I think of them as tokens. If the VMS is a cipher, then token-length frequently does not mean word-length. Plus nulls are frequent in ciphers, including medieval ciphers.

I don't know if Voynichese is a cipher. Strange-looking shapes can mean a lot of things (numbers, sounds, etc.) and might be for convenience rather than to hide things (as in the alphabets the missionaries invented to try to express sounds from another language). But it might be a cipher, in which case tokens/blocks/units (or whatever you are comfortable calling them) and words are not necessarily synonymous.


RE: Voynich text generation - geoffreycaveney - 15-04-2019

(15-04-2019, 06:08 AM)-JKP- Wrote:
(14-04-2019, 11:09 PM)Koen G Wrote: Interesting, JKP. Did you respect word boundaries?

So one null is your main deviation from one-to-one? I suspect o is a likely candidate because you can use it for padding, breaking up illegal clusters.

Yes, one null is my main deviation. There is also one that might be a slight fudge, but not too much (there is a rationale behind it).

I really don't think of Voynich units as words. I never have. I think of them as tokens. If the VMS is a cipher, then token-length frequently does not mean word-length. Plus nulls are frequent in ciphers, including medieval ciphers.

I don't know if Voynichese is a cipher. Strange-looking shapes can mean a lot of things (numbers, sounds, etc.) and might be for convenience rather than to hide things (as in the alphabets the missionaries invented to try to express sounds from another language). But it might be a cipher, in which case tokens/blocks/units (or whatever you are comfortable calling them) and words are not necessarily synonymous.

This exercise is an excellent reminder that even a standard "Sunday paper cryptogram puzzle" with one-to-one substitution can be difficult and challenging, when (1) you don't know which language it is written in, (2) even if you think you have guessed the language correctly, you are not a native speaker of it, and (3) word breaks in the cryptogram may not match word breaks in the text of the underlying language.

Geoffrey


RE: Voynich text generation - -JKP- - 15-04-2019

(15-04-2019, 12:13 PM)geoffreycaveney Wrote: ...


This exercise is an excellent reminder that even a standard "Sunday paper cryptogram puzzle" with one-to-one substitution can be difficult and challenging, when (1) you don't know which language it is written in, (2) even if you think you have guessed the language correctly, you are not a native speaker of it, and (3) word breaks in the cryptogram may not match word breaks in the text of the underlying language.

Geoffrey

Yes, I agree with these points. And I really had to learn firsthand how MUCH of an obstacle it can be to not know the underlying language. I'll wait a few days before posting the answer so people have a chance to give it some brain cycles.


RE: Voynich text generation - geoffreycaveney - 15-04-2019

(15-04-2019, 12:16 PM)-JKP- Wrote:
(15-04-2019, 12:13 PM)geoffreycaveney Wrote: ...


This exercise is an excellent reminder that even a standard "Sunday paper cryptogram puzzle" with one-to-one substitution can be difficult and challenging, when (1) you don't know which language it is written in, (2) even if you think you have guessed the language correctly, you are not a native speaker of it, and (3) word breaks in the cryptogram may not match word breaks in the text of the underlying language.

Geoffrey

Yes, I agree with these points. And I really had to learn firsthand how MUCH of an obstacle it can be to not know the underlying language. I'll wait a few days before posting the answer so people have a chance to give it some brain cycles.

To have any hope of solving this, I'm going to have to ask for a clarification on the consistency of the character [a] in this cipher:

In the cipher text [a] appears 11 times as part of [ain], 9 times as part of [aiin] (including the endings of all 4 lines!), and 8 times elsewhere without [(i)in] following it.

Does a one-to-one substitution mean that all of these [a]'s represent the exact same letter in the original source language text?

I take it that [in] and [iin] are each a single unit, one letter each, and probably different letters in this cipher. But I take the [a] as its own distinct unit and letter.

It is striking that [aiin] is almost always preceded by [o] (the one exception is preceded by another [a]), whereas [ain] is never preceded by either of these characters, but rather by characters such as [p], [d], [cth], [s], and [sh].

Now this patterning is possible in Greek, for example, where a final "-os" would often be preceded by another vowel, but a final "-on" would much less often be preceded by another vowel. Nevertheless, Greek final "-os" is also often preceded by consonants, so I wouldn't expect the universal "V+os" sequence in every occurrence.

So far the complication of resolving this unusual patterning seems to be the biggest obstacle to making progress with this cipher, at least for me.
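
The kind of tally described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample is the end of the second cipher line as quoted elsewhere in this thread (not the full cipher text), [iin], [in], [cth] and [sh] are treated as single units, and the names GLYPH and classify_a are made up for this sketch.

```python
import re
from collections import Counter, defaultdict

# Sample only: the end of the second cipher line as quoted elsewhere in this
# thread, with spaces removed. The full cipher text is not reproduced here.
SAMPLE = "acthainainotirycthainoaiin"

# Assumption: [iin], [in], [cth] and [sh] are single units; everything else
# is one character per glyph.
GLYPH = re.compile(r"iin|in|cth|sh|.")

def classify_a(text):
    """Count [aiin], [ain] and bare [a], and tally the glyph before each."""
    glyphs = GLYPH.findall(text)
    counts = Counter()
    before = defaultdict(Counter)
    for k, g in enumerate(glyphs):
        if g != "a":
            continue
        nxt = glyphs[k + 1] if k + 1 < len(glyphs) else ""
        kind = "a" + nxt if nxt in ("iin", "in") else "a"
        prev = glyphs[k - 1] if k > 0 else "^"  # "^" marks start of text
        counts[kind] += 1
        before[kind][prev] += 1
    return counts, before

counts, before = classify_a(SAMPLE)
print(counts)
for kind in ("aiin", "ain", "a"):
    print(kind, dict(before[kind]))
```

On this short sample the single [aiin] follows [o] and two of the three [ain] follow [cth], which is in line with the pattern described, but a fragment this small proves nothing on its own.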

Geoffrey


RE: Voynich text generation - -JKP- - 15-04-2019

(15-04-2019, 02:25 PM)geoffreycaveney Wrote: ...
To have any hope of solving this, I'm going to have to ask for a clarification on the consistency of the character [a] in this cipher:

In the cipher text [a] appears 11 times as part of [ain], 9 times as part of [aiin] (including the endings of all 4 lines!), and 8 times elsewhere without [(i)in] following it.

...

I'm not sure how to answer this without giving too much away. The prevalence of "a" is a linguistic reality in some languages.


RE: Voynich text generation - geoffreycaveney - 15-04-2019

(15-04-2019, 03:36 PM)-JKP- Wrote:
(15-04-2019, 02:25 PM)geoffreycaveney Wrote: ...
To have any hope of solving this, I'm going to have to ask for a clarification on the consistency of the character [a] in this cipher:

In the cipher text [a] appears 11 times as part of [ain], 9 times as part of [aiin] (including the endings of all 4 lines!), and 8 times elsewhere without [(i)in] following it.

...

I'm not sure how to answer this without giving too much away. The prevalence of "a" is a linguistic reality in some languages.

Fair enough! Thank you.


RE: Voynich text generation - geoffreycaveney - 17-04-2019

Some comments on JKP's cipher, based on my study of it so far:

Since the cipher word boundaries do not necessarily correspond to plain text word boundaries, I removed all cipher text spaces and took a look at the running cipher text. Some peculiar letter sequence properties emerge:

[t] occurs 9 times, but 8 of those are as part of the sequence [ot]! This is not typical for a one-to-one substitution of a natural language. Now if Koen is right, [o] could just be the null character, explaining this and other sequences. But I find that the number and location of potential vowel and consonant characters don't pattern so well without [o]. (In particular, if [o] is a null and not a vowel, there are not enough distinct, sufficiently frequent characters to represent all of the frequent vowels of a natural language. In short, I can't find good candidates for each of "a", "e", "i", and "o", and still have enough consonants left in the text, if [o] is not one of the vowels.) The text seems more plausible with [o] as a vowel of some type. Alternatively, [t] could be the null character. But [t] actually fits smoothly as a potential consonant between potential vowels, so this is not my preferred hypothesis either.
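
A check like the [ot] one can be run for every glyph at once by removing the spaces and asking, for each glyph, which predecessor is most common and what share of occurrences it covers. The Python sketch below is only an illustration: it uses the two second-line fragments quoted in this post rather than the full cipher text, it assumes [cth] and [sh] are single glyphs, and predecessor_share is a made-up name.

```python
import re
from collections import Counter, defaultdict

# Sample only: the two second-line fragments quoted in this post.
# They are kept separate so no bigram is invented across the gap.
FRAGMENTS = [
    "otecth arykyypainot yat arykypainot",
    "acthainain otiry cthainoaiin",
]

# Assumption: [cth] and [sh] are single glyphs.
GLYPH = re.compile(r"cth|sh|.")

def predecessor_share(fragments):
    """Map glyph -> (most common predecessor, its count, total occurrences
    that have a predecessor), ignoring the original spaces."""
    prev_counts = defaultdict(Counter)
    for frag in fragments:
        glyphs = GLYPH.findall(frag.replace(" ", ""))
        for prev, cur in zip(glyphs, glyphs[1:]):
            prev_counts[cur][prev] += 1
    report = {}
    for glyph, counter in prev_counts.items():
        top, n = counter.most_common(1)[0]
        report[glyph] = (top, n, sum(counter.values()))
    return report

report = predecessor_share(FRAGMENTS)
for glyph, (top, n, total) in sorted(report.items()):
    print(f"[{glyph}] preceded by [{top}] in {n} of {total} cases")
```

On these fragments [t] follows [o] in 4 of 5 cases and [k] follows [y] in both of its occurrences. A glyph occurring at the very start of a fragment has no predecessor and is not counted in the total.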

More remarkably, the unusually frequent repeated sequences extend beyond just [ot]:

5 of the 8 [ot] occurrences are as part of [ainot], and they are all in the first two lines!
[ot] also begins the second line, and possibly the first line too, if [f] is a pilcrow and not a letter.

Finally, most remarkably of all, there is a very long, almost exactly repeated sequence in the second line, with only a few letters between the two occurrences:

[aryky(y)painot]

This is the full sequence in the second line, with my own spaces to highlight this sequence:

[ otecth  arykyypainot  yat  arykypainot ]
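
How close the two occurrences are can be put as an edit distance (the number of single-character insertions, deletions or substitutions needed to turn one into the other). The Python sketch below is a plain textbook Levenshtein implementation applied to the two sequences quoted above; the function name edit_distance is just a label for this illustration, and plain characters are compared, with no special handling of multi-character glyphs.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]

first = "arykyypainot"
second = "arykypainot"
print(edit_distance(first, second))
```

The two 11- and 12-character sequences differ by a single [y], so the distance is 1.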

The end of the second line is striking as well:

[ acthainain  otiry  cthainoaiin ]

I highlight [otiry] with spaces because it also occurs to begin the first line, if [f] is a pilcrow.

[k] has some unusually frequent sequencing properties as well:
5 of its 9 occurrences are as part of [yk], including the 2 in the long almost repeated sequence in the second line above. 2 of the other 3 [yk] occurrences are as part of [yyk].
[k] also occurs 3 times as part of [ok], and once as part of [eek]. The double characters (long vowels?) before and after [k] in the cipher text are striking. There is also the [kyy] in the first long repeated sequence above, and [kaa] near the end of the fourth line is the only [aa] in the whole text. In the first line there is even [eeyyk].
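
Doubled characters like these can be listed mechanically. The Python sketch below is a minimal illustration that scans a string for runs of two or more identical characters; doubled_runs is a made-up name, the inputs are short sequences quoted in this post rather than the full cipher text, and it works on plain characters, so the [ii] inside [aiin] would also be reported.

```python
from itertools import groupby

def doubled_runs(text):
    """Return (character, run length, start index) for each run of 2+ identical characters."""
    runs = []
    pos = 0
    for ch, group in groupby(text):
        n = len(list(group))
        if n >= 2:
            runs.append((ch, n, pos))
        pos += n
    return runs

for seq in ("eeyyk", "kaa", "otectharykyypainotyatarykypainot"):
    print(seq, doubled_runs(seq))
```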

And all of this occurs in a basically one-to-one substitution cipher!

Geoffrey