The Voynich Ninja
The gibberish thread - Printable Version




The gibberish thread - davidjackson - 04-02-2018

Let us say that the text is assumed to be gibberish.
How could we prove this? What level and type of text analysis could we carry out?

Note: I'm not arguing that it is. I'm asking for thoughts on how we could prove that it is, were we so inclined, as this could bring up some interesting new angle of textual attack.


RE: The gibberish thread - Koen G - 04-02-2018

Just thinking about it philosophically (i.e. without much knowledge of the statistics involved) I'd say that it's impossible to prove. The reason is that in this particular case the statements "it has no meaning" and "the meaning is unknown to us" have the same effect. 

Let me think of an analogy. For example, you present me with a string of numbers, 22 15 25 14 9 3 8 0 14 9 14 10 1.

I could say "it's a random string of numbers". But I don't know whether this is true. Until I find a way to read the numbers in a way which makes sense, I have no way of proving whether the numbers are randomly chosen or not. I could only disprove it by pointing out what you encoded and how you did it.


RE: The gibberish thread - -JKP- - 04-02-2018

That's an interesting question.

I don't know if it's an answerable question (I need to think about it for a while) or a productive one (maybe, maybe not, depending on how the question is approached), but I don't think I've seen it posed before. I've seen it stated as a theory, but not as a question. When I read it, I realized I've never approached the VMS text from that angle: I've always looked at it as maybe (hopefully) having meaning, rather than assuming that it does not or might not.


RE: The gibberish thread - Emma May Smith - 04-02-2018

I'm sure that the Voynich text has been compared against other known "gibberish" texts. Specifically, texts written by those living with certain mental illnesses.


RE: The gibberish thread - Torsten - 05-02-2018

There are different kinds of gibberish. The text of the VMS is highly structured and repetitive. In this respect I find it interesting that an alternative term for gibberish is jibber-jabber, and that the German word for gibberish is blabla. These words themselves already suggest repetitive phrases like 'chol chol chol' or 'sho chol shol'.

Instead of comparing the text to something we know, I would suggest analyzing it as something we do not know. This means that we should analyze the text without any preconditions. For instance, we could ask questions like:
What is the best way to describe the VMS as it actually is?
Why does the text evolve over time?
Why do even rare features repeat in horizontal or vertical directions (see for instance 'shek', 'sheek' and 'shek' on page f103r)?
Why is 'k' rarely used in line-initial position (129 of the 10934 occurrences of 'k', or 1%, are line-initial, compared with 6% for 't', 24% for 'p' and 8% for 'f')?
Why are 'chol', 'chor', 'shol' and 'chor' more frequent than 'chal', 'char', 'shar' and 'shal'?
Why does a word starting with 'q' come after a word ending with 'y' in 65% of cases?
Why is the letter after 'q' an 'o' in 97.5% of cases?
...
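
Counts of this kind could be reproduced from any line-by-line transliteration. A minimal sketch in Python, assuming each string is one manuscript line of space-separated words and each character is one glyph (a simplification, since some transliteration alphabets write one glyph with several characters). The sample lines are made up for illustration and are not manuscript text, and the function names are hypothetical:

```python
from collections import Counter

# Made-up sample lines (NOT real manuscript text): one string per line,
# words separated by spaces, one character per glyph (an assumption).
SAMPLE_LINES = [
    "qokeey chol shol qokain",
    "daiin chey qokal chor",
    "pchedy qokeedy okaiin shey",
]

def line_initial_share(lines, glyph):
    """Fraction of all occurrences of `glyph` that stand first in a line."""
    total = sum(line.count(glyph) for line in lines)
    initial = sum(1 for line in lines if line.startswith(glyph))
    return initial / total if total else 0.0

def next_glyph_after(lines, glyph):
    """Count which glyphs directly follow `glyph` inside words."""
    counts = Counter()
    for line in lines:
        for word in line.split():
            for a, b in zip(word, word[1:]):
                if a == glyph:
                    counts[b] += 1
    return counts

def q_after_y_share(lines):
    """Of the q-initial words with a predecessor on the same line,
    the fraction whose predecessor ends in 'y'."""
    hits = total = 0
    for line in lines:
        words = line.split()
        for prev, cur in zip(words, words[1:]):
            if cur.startswith("q"):
                total += 1
                hits += prev.endswith("y")
    return hits / total if total else 0.0

print(line_initial_share(SAMPLE_LINES, "q"))
print(next_glyph_after(SAMPLE_LINES, "q"))
print(q_after_y_share(SAMPLE_LINES))
```

The numbers printed for the sample mean nothing in themselves; the point is only that each question above corresponds to a simple, repeatable count that needs no assumption about meaning.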


RE: The gibberish thread - Hubert Dale - 05-02-2018

(04-02-2018, 09:27 PM)Koen Gh. Wrote: Just thinking about it philosophically (i.e. without much knowledge of the statistics involved) I'd say that it's impossible to prove. The reason is that in this particular case the statements "it has no meaning" and "the meaning is unknown to us" have the same effect.

Let me think of an analogy. For example, you present me with a string of numbers, 22 15 25 14 9 3 8 0 14 9 14 10 1.

I could say "it's a random string of numbers". But I don't know whether this is true. Until I find a way to read the numbers in a way which makes sense, I have no way of proving whether the numbers are randomly chosen or not. I could only disprove it by pointing out what you encoded and how you did it.

Well, the human beings here will know it’s not a random string of numbers :) But I do wonder whether software like CryptoCrack can do anything with this particular message, given the brevity and vocabulary?


RE: The gibberish thread - davidjackson - 05-02-2018

Quote:But I do wonder whether software like CryptoCrack can do anything with this particular message, given the brevity and vocabulary?

Counting on your fingers would break that particular Caesar cipher :D


RE: The gibberish thread - Hubert Dale - 05-02-2018

(05-02-2018, 07:16 PM)davidjackson Wrote:
Quote:But I do wonder whether software like CryptoCrack can do anything with this particular message, given the brevity and vocabulary?

Counting on your fingers would break that particular Caesar cipher :D

Worked for me. Got tricky after 10 - Anne Boleyn would have done better - but I coped :) And I suppose it’s really a monoalphabetic substitution cipher rather than a Caesar, but that’s by the by.

I was just wondering how cipher-breaking software works. It’s obvious to a human to try A=1, B=2...when faced with a string of numbers going no higher than 26. Are programs like CryptoCrack designed to do the same?
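
For what it's worth, the "obvious" human guess takes only a few lines of code. A minimal sketch in Python, assuming 1-26 map to A-Z and that 0 marks a word break (the 0 convention is a guess from the message itself). The function name is hypothetical, and this says nothing about how CryptoCrack actually works:

```python
def decode_a1z26(numbers, space_code=0):
    """Map 1..26 to A..Z; treat `space_code` as a word break (assumption)."""
    out = []
    for n in numbers:
        if n == space_code:
            out.append(" ")
        elif 1 <= n <= 26:
            out.append(chr(ord("A") + n - 1))
        else:
            out.append("?")  # outside the assumed alphabet
    return "".join(out)

message = [22, 15, 25, 14, 9, 3, 8, 0, 14, 9, 14, 10, 1]
print(decode_a1z26(message))  # VOYNICH NINJA
```

A program that is not told the mapping would have to search for it, and a message this short probably gives a statistical search very little to work with.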


RE: The gibberish thread - Koen G - 05-02-2018

My point is not that it's easy :p It's more of a philosophical consideration about the state we are in now with the VM.
Statistics will only give us indications one way or the other; we'll only be able to prove anything when we crack it. Proving it to be nonsensical is impossible, especially since, despite everything, it is so very language-like.

Torsten: many of those questions could be answered hypothetically within a linguistic framework. For example:
  • Why does the text evolve over time? This could be because of the different people involved. Or a gradual change in approach as the encryption/transcription method matured. Or adaptation to changes in the source material. Or many other things.
  • Why do even rare features repeat in horizontal or vertical directions (see for instance 'shek', 'sheek' and 'shek' on page f103r)? Specialized vocabulary in a reduplicating language, for example.
  • Why is 'k' rarely used in line-initial position (129 of the 10934 occurrences of 'k', or 1%, are line-initial, compared with 6% for 't', 24% for 'p' and 8% for 'f')? When [k] appears line-initially it may be expressed in another way (i.e. position-sensitive spelling variation).
  • Why are 'chol', 'chor', 'shol' and 'chor' more frequent than 'chal', 'char', 'shar' and 'shal'? This could be so many things. These words appear somewhat similar. Maybe one vowel was simply more frequent in such words than the other. Or maybe it's variation in the transcription system. Arabic "a" can often be transcribed as "a" or "e". A scribe can use both but favor one over the other.
  • Why does a word starting with 'q' come after a word ending with 'y' in 65% of cases? This is something to ask Emma, but in short, it could be phonetically determined. Like a linking sound, or a sound changing under the influence of the other. Or it could have something to do with false spaces, if you want to go that way.
  • Why is the letter after 'q' an 'o' in 97.5% of cases? Could be similar to q and u in English.
These are just some things off the top of my head. Natural language is so rich and unpredictable, and even more so when people start putting it to paper without spelling standards.


RE: The gibberish thread - Torsten - 05-02-2018

(05-02-2018, 07:44 PM)Koen Gh. Wrote: Why are 'chol', 'chor', 'shol' and 'chor' more frequent than 'chal', 'char', 'shar' and 'shal'? This could be so many things. These words appear somewhat similar. Maybe one vowel was simply more frequent in such words than the other. Or maybe it's variation in the transcription system. Arabic "a" can often be transcribed as "a" or "e". A scribe can use both but favor one over the other.

Have you read the blog post by Nick Pelling about the [chol] [shol] mystery on page f42r?

Quote:Torsten: many of those questions could be answered hypothetically within a linguistic framework.

It makes a big difference whether or not you start with the precondition that the VMS contains language. In the first case you know from the beginning that it contains language, but you still can't identify the language. In the second case you can come to the conclusion that it is a language, or that it is not. If you conclude that it is a language, you also know its most characteristic features. At the very least you can say something about the linguistic typology of this language and about some basic grammar rules. With these features it should be very easy to identify the language used.