• A random language upon which to experiment
  • A random language upon which to experiment

    davidjackson > 23-03-2018, 08:28 PM

    Thought experiment in progress....

    We wish to analyse an entirely unknown language to see if our statistical models work upon it to produce meaning.
    The language should be European-based, but not a formalised mainstream lingua franca. The language must be organic, as opposed to constructed. At the same time, we need to be able to translate this language to ensure the accuracy of our models.

    For reasons of transcription, we use the European alphabet, but we are not concerned with standardised spelling.

    So, what language to use? Here's one example, Polari.

    Quote:Polari (or alternatively Parlare, Parlary, Palare, Palarie, Palari; from Italian parlare, "to talk") is a form of cant slang used in Britain by some actors, circus and fairground showmen, professional wrestlers, merchant navy sailors, criminals, prostitutes, and the gay subculture. There is some debate about its origins, but it can be traced back to at least the 19th century and possibly the 16th century.


    So, my question: what analyses can we run upon this "language" to gain a base reading for our statistical methods before applying them to the Voynich language? I have lots of half-formed ideas, but I would value input before expounding upon them.
  • RE: A random language upon which to experiment

    DonaldFisk > 24-03-2018, 12:26 AM

    Polari is a sociolect - basically English with some alternative words to make it difficult for outsiders to follow. Or am I missing something here?

    Why not Georgian? It has its own alphabet which even looks a bit like Voynichese. Can anyone here understand it or a related language? I don't speak a word of it. There are plenty of web pages in Georgian we can scrape.
  • RE: A random language upon which to experiment

    DONJCH > 24-04-2018, 10:31 AM

    (Weeps)
    Why are you going for a tiny minority European dialect (and yes, it's a good thought, but...) when you have not, as far as I can tell, ruled out (for a 1:1 substitution in natural language) the big elephants in the room, viz. (Molesworth!) Mandarin, Mongolian, Hindi?

    "The World Wonders"! - indeed.

    Think about it... I have not seen a single mention or consideration of Mongolian on this site ever.

    I would love to see what Emma May Smith says of this, because she could, probably and rationally, rule all this out a priori. And I would go with that, because she would be rational.

    Rule these 3 out - then we can all probably move on to considering code/cipher, but not before.

    These 3 languages are not my pet theory, but just:- they need to be ruled out properly IMHO.
  • RE: A random language upon which to experiment

    Koen G > 24-04-2018, 12:45 PM

    A language family which often shows up in experienced researchers' (including Emma's) lists of possibilities is the Turkic languages. These cover large parts of Asia nowadays (see map) but have also been relatively close to Europe historically.

    [Image: map of the Turkic languages (Lenguas_túrquicas.png)]

    If the VM actually contains a fairly plainly written natural language, it is not unlikely to be a Turkic language in some form.

    I don't necessarily accuse people of Eurocentrism. It's just a lot easier to test with languages one is somewhat familiar with. Latin, given its relative stability and omnipresence in medieval Europe's written culture, is the prime example of a language which is both historically likely and accessible to us.

    Problem is, time and time again it turns out that it's impossible to turn Voynichese into Latin by any viable method.

    So yes, we must look somewhere else. But it's not sufficient to take, for example, a 15th-century Mandarin text and see if it maps to Voynichese (it obviously doesn't). The VM script as a whole is unattested, which means that - if it was used to put a language to paper - someone used a non-standard way of turning that language's sounds into script.

    All this means that it's excessively hard for us, most of whom were raised in an Indo-European language, to decently test non-Indo-European theories.

    Just to give you an idea: a while ago we were talking on the forum about whether it would be possible to locate a word for "and" in Voynichese. I went to Google Translate and it gave me these possible translations for "and" in Chinese:

    和, 与, 而, 及, 而且, 并

    Errr... Clearly Voynichese does not use a writing system like Chinese, which means that if it is such a language, it must be written phonetically (with the possibility of all or some vowels being left out, imperfect transcription and so on). So testing this properly means figuring out the pronunciation of the Chinese words for "and" in the 15th century, or at least having a fair idea of what they were. And then checking the same for dozens of other dialects.

    Just to say, it's really difficult without the proper linguistic knowledge.
  • RE: A random language upon which to experiment

    DONJCH > 28-04-2018, 10:11 AM

    David: Oops, I just read the forum rules and hope that I have not derailed your thread in my enthusiasm. I do not want to be "that guy".

    Koen: Yes, which is why I was asking on the other thread if there were any native Chinese Voynicheros?

    You make excellent points. Turkic would be a good choice. I know there was that Turkish family - has that attempt been critically reviewed?

    I don't think I was proposing that the VMS was a transcription of a Chinese document or done by a Chinese scribe.

    If anything, I am imagining not Bacon but somebody like him who had some contact with the language and used it as an inspiration - not necessarily as a hoax but rather as a stunt. For instance, if he was, like Bacon, rather crusty and abrasive, got into trouble with the hierarchy, and was put in prison/house arrest/promoted sideways for a time, he might then have gone "I'll show them!" and created the VMS in his newly found down time.

    Human nature and academic rivalry remain the same over the ages! I've seen it up close and man, it's scary - no quarter is given. It makes "Shogun" look tame, though they draw the line at murder I guess.
  • RE: A random language upon which to experiment

    Koen G > 28-04-2018, 11:24 AM

    The problem with the Turkish family is that they have not released their method. So no way to review anything yet, unfortunately.
  • RE: A random language upon which to experiment

    DONJCH > 28-04-2018, 02:58 PM

    "Handwavium et anagramium" I guess.
  • RE: A random language upon which to experiment

    DonaldFisk > 28-04-2018, 06:42 PM

    I wasn't suggesting that the Voynich Manuscript might be written in Georgian. I'm suggesting that you could use Georgian (or any other language you don't know) as a testing ground for ideas. Several others and I have suggested that the Voynich Manuscript is meaningless. Can we show that it lacks a property present in Georgian text? Alternatively, what properties of Georgian show it's a real language?

    I have in fact examined Georgian. The method I used, a PCA glyph plot, is quite a good way of identifying alphabetic languages, as the position of a glyph on the plot is determined by the frequency of the letters which follow it. It's also immune to simple substitution encipherment. I have also more recently plotted Pinyin (i.e. Romanized Mandarin) text and Latin text with long and short vowels indicated, but have yet to add these. They're both closer to the Voynich Manuscript plot than those of the other languages I've examined, but nowhere near close enough to suggest a match. (In any case, the payoff is that I can now display non-ASCII characters in X11 windows.)
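
    For anyone wanting to try the idea, here is a minimal Python sketch of this kind of plot. It is not the code behind the analysis mentioned above, only an illustration under stated assumptions: every character (spaces included) counts as one glyph, each glyph is described by the relative frequencies of the glyphs that follow it, and the two principal axes are found by plain power iteration rather than a library routine. The names successor_profiles and pca_2d are made up for this example.

```python
from collections import Counter, defaultdict
import math


def successor_profiles(text):
    """Map each glyph to the relative frequencies of the glyphs that follow it.

    Assumption: every character of `text` (spaces included) is one glyph.
    """
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return {
        glyph: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for glyph, row in counts.items()
    }


def pca_2d(profiles, iterations=500):
    """Project each glyph's successor profile onto the top two principal axes."""
    rows = sorted(profiles)
    cols = sorted(set(rows) | {g for row in profiles.values() for g in row})
    data = [[profiles[r].get(c, 0.0) for c in cols] for r in rows]
    means = [sum(col) / len(data) for col in zip(*data)]
    centred = [[x - m for x, m in zip(row, means)] for row in data]
    dim = len(cols)
    # Unnormalised covariance (scatter) matrix; it has the same eigenvectors.
    cov = [[sum(r[i] * r[j] for r in centred) for j in range(dim)]
           for i in range(dim)]
    axes = []
    for _axis in range(2):
        # Start away from the all-ones vector, which the matrix maps to zero
        # because every profile sums to 1.
        v = [float(i + 1) for i in range(dim)]
        for _step in range(iterations):
            for u in axes:  # stay orthogonal to the axes already found
                dot = sum(x * y for x, y in zip(v, u))
                v = [x - dot * y for x, y in zip(v, u)]
            w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in this direction
                break
            v = [x / norm for x in w]
        axes.append(v)
    return {
        r: tuple(sum(x * y for x, y in zip(row, u)) for u in axes)
        for r, row in zip(rows, centred)
    }
```

    Calling pca_2d(successor_profiles(text)) gives one (x, y) point per glyph, ready to scatter-plot. A simple substitution cipher only renames the glyphs, which permutes the rows and columns of the frequency table, so the profiles are the same after relabelling and the plot should differ at most by a mirror flip of an axis.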

    If anyone can provide me with text (a few pages will do) in a language they think Voynichese might be related to, along with a list of glyphs (letters or groups of letters which could be considered phonetic units, like "a" or "ee", or "th" in English), I can analyse it and generate a PCA glyph plot.   Alternatively, if anyone can suggest alternative EVA glyphs, I can reanalyse the VMS.
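
    As an illustration of what such a glyph list implies, here is a small Python sketch that splits text into glyphs using greedy longest-match. This is an assumption about how multi-letter units like "ee" or "th" could be handled, not a description of the actual analysis; the function name split_into_glyphs is hypothetical, and characters missing from the list are simply kept as single-character glyphs.

```python
def split_into_glyphs(text, glyphs):
    """Split `text` into glyphs, preferring the longest glyph at each position.

    Assumption: greedy longest-match; unknown characters become their own glyph.
    """
    ordered = sorted(glyphs, key=len, reverse=True)  # try longer glyphs first
    result = []
    i = 0
    while i < len(text):
        for glyph in ordered:
            if text.startswith(glyph, i):
                result.append(glyph)
                i += len(glyph)
                break
        else:
            # No listed glyph matches here: keep the character as it is.
            result.append(text[i])
            i += 1
    return result
```

    For example, with the glyph list ["th", "ee", "e", "t", "r", " "], the text "the tree" is split into "th", "e", " ", "t", "r", "ee". A different glyph list gives a different sequence, which is why the choice of glyphs matters for any statistics computed afterwards.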
  • RE: A random language upon which to experiment

    MarcoP > 29-04-2018, 03:57 PM

    Hi Donald,
    it would be nice if your site could be added to the blogosphere reader!
    I find your PCA analysis very interesting. In particular, I love methods that generate simple 2D plots and PCA seems excellent in this respect. One thing that isn't clear to me is why most graphs have an empty area around the origin: Figure 8 (Rohonc) seems to be the only exception to this.

    The VMS graph seems to me comparable with several of the others, Italian and Rohonc (as you write) but also Georgian (Figure 6). It would be nice to have a quantitative way of measuring the similarity between these graphs, but I have no idea of how this could be done!
  • RE: A random language upon which to experiment

    DonaldFisk > 12-05-2018, 10:03 PM

    I don't have an RSS feed, so the blogosphere reader doesn't support it. I'll post this as a separate topic, as I've updated the site with a lot of new material.