The Voynich Ninja
Look at *differences* between words rather than at the words themselves? - Printable Version




RE: Look at *differences* between words rather than at the words themselves? - Koen G - 27-08-2019

Right, I think I understand. So to read this you would need to work out which operation took place and convert that operation (rather than the resulting letters) into your source text. An advantage of this is that, although each step relies on the preceding one, a single mistake doesn't ripple through your entire code.

Complex codes are not my area of expertise, but it does feel doable if there are not too many steps per letter involved. Remember that the VM is over 200k characters. I think it would quickly become too time-consuming once you add extra steps on top of simple substitution (i.e. transcription).

Another question I have is what effect such a code would have on the type-token ratio. Wouldn't you get an abnormally large number of unique words? If "bed" results in "azexiz", what are your odds of ever getting "azexiz" again? (I really don't know, I guess you'd have to test this).


RE: Look at *differences* between words rather than at the words themselves? - radapox - 27-08-2019

(27-08-2019, 07:57 PM)Koen G Wrote: So to read this you would need to work out which operation took place and convert that operation (rather than the resulting letters) into your source text.

Correct!

(27-08-2019, 07:57 PM)Koen G Wrote: Another question I have is what effect such a code would have on the type-token ratio. Wouldn't you get an abnormally large number of unique words? If "bed" results in "azexiz", what are your odds of ever getting "azexiz" again? (I really don't know, I guess you'd have to test this).

Yes, I suppose that's a possible risk; I'm not sure about that either. Then again, the VM does have an unusually high frequency of unique words, if I'm not mistaken. I'm not sure whether the numbers are consistent with a method like this; that would need to be tested, as you say.

Also, as was remarked in reply to my earlier post [link], this method would be problematic for Labelese. That's why I'm keeping as open a mind as possible regarding the multitude of different organizing principles that may be involved. Nick Pelling illustrated the possible complexities quite nicely in this post [link]. [EDIT: Hey, wow, that post is exactly 10 years old today.]


RE: Look at *differences* between words rather than at the words themselves? - Koen G - 27-08-2019

I've got plenty of data in Voynich TTR, but I'd need at least 500 words (preferably more) of your code to test this properly. 
In short, averaged over 500-word windows, Voynichese has more unique words than a typical medieval German(ic) text, but fewer than the average Latin text.

[Image: naamloos-16-kopic3abren.gif?w=616]


Of course this won't test everything, but TTR is a very accessible method for initial testing. So whenever you can whip up a code sample of at least 500 words, I can tell you what it compares to.
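
For anyone who wants to run the same kind of check, here is a minimal sketch of a windowed type-token ratio in Python. The function name `windowed_ttr` and the choice of consecutive, non-overlapping windows are assumptions; the exact procedure behind the figures quoted above is not described in this post.

```python
def windowed_ttr(words, window=500):
    """Mean type-token ratio over consecutive, non-overlapping windows.

    words  : list of word tokens, in text order
    window : tokens per window (500 here, matching the size mentioned above)
    """
    ratios = []
    # Only full windows are used; a trailing partial window is dropped,
    # because TTR depends strongly on sample length.
    for start in range(0, len(words) - window + 1, window):
        chunk = words[start:start + window]
        ratios.append(len(set(chunk)) / window)  # unique words / total words
    if not ratios:
        raise ValueError("need at least one full window of words")
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Toy input: two words repeated, so every window has exactly 2 types.
    sample = ["daiin", "chedy"] * 500
    print(windowed_ttr(sample))
```

Because TTR only asks whether two tokens are identical, a word that differs from its neighbour by a single letter counts as a new type just like a completely different word.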


RE: Look at *differences* between words rather than at the words themselves? - radapox - 27-08-2019

(27-08-2019, 08:31 PM)Koen G Wrote: So whenever you can whip up a code sample of at least 500 words, I can tell you what it compares to.

Oh dear. :) Well... I'm not making any promises, but I'll let you know if I happen to find the time (and coding skills, I suppose) for that!

EDIT: On a more serious note: I think a complicating factor is that a lot depends on how you map the mutations to the letters of your source language. If, for instance, frequent letters happen to correspond to more "drastic" mutations (ones with few zeros, in my method), adjacent words would become less similar than if the reverse were the case. So I expect that one source language may yield entirely different TTR results depending on your mapping.


RE: Look at *differences* between words rather than at the words themselves? - Koen G - 27-08-2019

(27-08-2019, 08:37 PM)radapox Wrote:
(27-08-2019, 08:31 PM)Koen G Wrote: So whenever you can whip up a code sample of at least 500 words, I can tell you what it compares to.

Oh dear. :) Well... I'm not making any promises, but I'll let you know if I happen to find the time (and coding skills, I suppose) for that!

Also time yourself while doing it, so we can answer multiple questions at once :D


RE: Look at *differences* between words rather than at the words themselves? - radapox - 27-08-2019

(27-08-2019, 08:41 PM)Koen G Wrote: Also time yourself while doing it, so we can answer multiple questions at once :D

;) In the meantime I added this edit to my previous reply, so for maximum confusion I'll repeat it here:

(27-08-2019, 08:37 PM)radapox Wrote: On a more serious note: I think a complicating factor is that a lot depends on how you map the mutations to the letters of your source language. If, for instance, frequent letters happen to correspond to more "drastic" mutations (ones with few zeros, in my method), adjacent words would become less similar than if the reverse were the case. So I expect that one source language may yield entirely different TTR results depending on your mapping.



RE: Look at *differences* between words rather than at the words themselves? - Koen G - 27-08-2019

Are you sure that's the case though? TTR doesn't look "inside" words, so even a change of one letter is a change. Since your method inevitably cycles around based on the previous letters, my intuitive expectation is that no matter how you divide your codes, TTR will be unusually high. But this is difficult to predict intuitively.


RE: Look at *differences* between words rather than at the words themselves? - radapox - 27-08-2019

(27-08-2019, 08:57 PM)Koen G Wrote: Are you sure that's the case though? TTR doesn't look "inside" words, so even a change of one letter is a change. Since your method inevitably cycles around based on the previous letters, my intuitive expectation is that no matter how you divide your codes, TTR will be unusually high. But this is difficult to predict intuitively.

Hah, good point there! Yes, of course, you're right. Different is different; it doesn't matter how different.


RE: Look at *differences* between words rather than at the words themselves? - radapox - 28-08-2019

Okay, so I couldn't resist giving it a try. Here's the first sentence of The Hobbit enciphered with my system [link].

In a hole in the ground there lived a hobbit.

With spaces (mutation 000):
ba-cez cez ce ce caz-dyz-faz-fe fe ge-hiz hiz gix-gew-giv giv gew-few-gaw-few-giv-gow gow fov-fit-fos-dos-dur dur fyr-gyr-fas-fer-fis fis fit fit ges-has-har-haq-jaq-hap hap

Without spaces (mutation 000 used for punctuation only):
ba-cez ce caz-dyz-faz-fe ge-hiz gix-gew-giv gew-few-gaw-few-giv-gow fov-fit-fos-dos-dur fyr-gyr-fas-fer-fis fit ges-has-har-haq-jaq-hap hap

I've done this largely by hand using a primitive Excel sheet, so I may have made mistakes. I think it took me about two hours to do this, but that includes setting up the sheet, so encoding the thing itself may have taken me, say, twenty minutes? I can imagine a user could develop a good pace after some practice, and by using something like a letter circle [link] to keep track of the consonants and vowels.

Below is a screenshot of the Excel sheet. The blue column contains the plaintext; the yellow column gives the mutations for each letter; the columns to the right of that contain the syllables/words of the ciphertext. Red shading indicates duplicate values in the column at hand. Obviously, those are much more frequent if you look at separate syllables, especially if you choose to include spaces with mutation 000, as this leads to doubling of each word-final syllable.

An interesting tweak could be to insert more empties (Ø) in the consonant list (e.g. ØbcdfØghjkØlmnpØqrstØvwxz), so you end up with more syllables that lack one or both of the Cs, thereby increasing the frequency of similar syllables. I'll see what this does.
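
For anyone who would rather script this than fill in a spreadsheet, here is a minimal Python sketch of a difference cipher along these lines. Everything specific in it is an assumption rather than the exact system used for the sample above: the `CONSONANTS` and `VOWELS` rings, the idea that each mutation is a triple of steps in {-1, 0, +1} (27 combinations: 000 plus one per letter), the alphabetical `LETTER_TO_MUT` assignment, and the starting syllable "a". It will therefore not reproduce the hand-made ciphertext; it only illustrates that the message sits in the change from one syllable to the next.

```python
from itertools import product

# Assumed rings; "" stands in for the empty consonant (Ø).
CONSONANTS = ["", *"bcdfghjklmnpqrstvwxz"]
VOWELS = list("aeiouy")
SIZES = (len(CONSONANTS), len(VOWELS), len(CONSONANTS))

# Hypothetical mapping: the 26 non-zero step triples assigned to a..z in order.
STEPS = [m for m in product((-1, 0, 1), repeat=3) if m != (0, 0, 0)]
LETTER_TO_MUT = dict(zip("abcdefghijklmnopqrstuvwxyz", STEPS))
MUT_TO_LETTER = {m: c for c, m in LETTER_TO_MUT.items()}
MUT_TO_LETTER[(0, 0, 0)] = " "  # mutation 000: space or punctuation


def syllable(state):
    """Render a (C1, V, C2) index triple as text."""
    c1, v, c2 = state
    return CONSONANTS[c1] + VOWELS[v] + CONSONANTS[c2]


def parse(syl):
    """Recover the (C1, V, C2) index triple from a syllable's text."""
    i = next(k for k, ch in enumerate(syl) if ch in VOWELS)
    return (CONSONANTS.index(syl[:i]), VOWELS.index(syl[i]),
            CONSONANTS.index(syl[i + 1:]))


def encode(text, start=(0, 0, 0)):
    """One syllable per character; each is the previous syllable mutated."""
    state, out = start, []
    for ch in text.lower():
        mut = LETTER_TO_MUT.get(ch, (0, 0, 0))  # non-letters repeat the syllable
        state = tuple((s + m) % n for s, m, n in zip(state, mut, SIZES))
        out.append(syllable(state))
    return out


def signed(d, n):
    """Map a cyclic difference to the nearest signed step, e.g. n-1 -> -1."""
    d %= n
    return d - n if d > n // 2 else d


def decode(syllables, start=(0, 0, 0)):
    """Read the differences between consecutive syllables back into letters."""
    prev, chars = start, []
    for syl in syllables:
        cur = parse(syl)
        mut = tuple(signed(c - p, n) for c, p, n in zip(cur, prev, SIZES))
        chars.append(MUT_TO_LETTER.get(mut, "?"))  # "?" marks an invalid step
        prev = cur
    return "".join(chars)


if __name__ == "__main__":
    cipher = encode("in a hole in the ground there lived a hobbit")
    print("-".join(cipher))
    print(decode(cipher))
```

In this sketch a copying error in one syllable can garble at most two decoded letters (the step into the bad syllable and the step out of it), which matches the point made earlier in the thread that a mistake does not ripple through the rest of the text. The tweak with extra empties would need the decoder to track ring positions rather than look them up from the syllable text, since several positions would then render identically.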


RE: Look at *differences* between words rather than at the words themselves? - -JKP- - 28-08-2019

Yikes, two hours.

My immediate reaction, of course, was, "Write a script!" but they didn't have that option in the Middle Ages.   :-)

PS, I commend your choice of The Hobbit.

Now... a classical Latin version of The Hobbit? :D


My first instinct was to look at letter position within words, and your sample does indeed have some of those idiosyncrasies.