Syllabification - Printable Version

+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Analysis of the text (https://www.voynich.ninja/forum-41.html)
+--- Thread: Syllabification (/thread-201.html)
RE: Syllabification - davidjackson - 09-03-2016

Quote: (Hope I did not make a typo anywhere, all this is a bit repetitive)

Can you imagine writing that all out with a quill and parchment, by candlelight?

RE: Syllabification - Anton - 09-03-2016

(09-03-2016, 02:03 PM)Sam G Wrote: This is a verbose cipher... every instance of the same plaintext word would yield the same sequence of ciphertext words

Not exactly - note that in my example "oolooooo" can be reduced to "oola" (as I did), but also to "alooooo". Those would be two different ciphertext words conveying the same underlay character (i.e., space). If we use this trick not only for "o" sequences but also for "l" sequences, introducing a fourth letter into the alphabet, the degree of variability will rise further.

***

Basically, what makes the ciphertext "unpronounceable" is the comparatively low number of letters signifying vowels in the alphabet relative to consonants. I think that if one uses an alphabet with roughly the same number of vowels and consonants, any ciphertext will be more or less "pronounceable".

Another cipher technique for making an unpronounceable ciphertext pronounceable would be simply filling in "filler vowels" according to a pre-defined pattern.

Yet another technique would be separately enciphering the consonants and vowels of the plain text. I think that is the trick that would make any ciphertext pronounceable. Consider an example which is neither substitution nor verbose. I don't know whether it is a known cipher; I just borrow it from the differential encoding used in the telecom world. Here, each subsequent character is not enciphered per se; rather, what is enciphered is the difference between it and the preceding character.

E.g., the English alphabet runs A, B, C, D, etc. Let's assume we encode each line from scratch and that the first word of the line is the word "odd". "O" is the first character of the line.
It is enciphered just by its number in the alphabet, i.e., "15". The second letter is "D", which is 16 letters forward from "O" ("O" itself being counted). Thus "D" is enciphered with the number "16". The third letter, "D", is one letter distant from the second letter "D" (because we count the letter itself), hence it is coded with "1". We could just as well measure the distance beginning with 0 rather than 1, but beginning with 1 is convenient for the subsequent mapping of numbers to ciphertext letters. In our example, 15 is "O" in the English alphabet, 16 is "P", and 1 is "A". So we get "OPA" as the ciphertext for "ODD".

Applied to the whole alphabet, such a ciphertext would be unpronounceable for long phrases. What would make it pronounceable is considering two numbered rows - one for consonants (B, C, D, F...), another for vowels (A, E, I, O...) - and enciphering the consonants of the plain text with the first row and the vowels with the second row, using the same differential encoding as explained above. This way the phrase "this is a cipher", if I made no mistake, will be represented as "TNIM AB U KINSYL", which is quite pronounceable.

RE: Syllabification - Emma May Smith - 09-03-2016

In very broad terms, the more vowels you assign to characters, the more pronounceable a text becomes. This is because vowels form the nucleus of syllables* and in most languages** vowel-only syllables are valid***. So long as you're happy with every vowel representing a syllable, even a string like eiuoaueioa can be pronounced.

The problems stack up as the ratio of consonants to vowels increases. As consonants are not typically the nuclei of syllables, any string of sounds must be split into syllables according to the vowels available. Syllable parsing is often contestable, and different people may assign consonants to the end or beginning of neighbouring syllables depending on their viewpoint.
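Anton's single-alphabet differential encoding can be sketched in a few lines. This is a minimal sketch: the function name `differential_encipher` is my own, and the 0-based modular arithmetic is chosen to reproduce the 1-based "count the previous letter itself" convention of the "ODD" -> "OPA" worked example above.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def differential_encipher(word):
    """Encode each letter by its distance from the previous letter
    (counting the previous letter itself); the first letter is encoded
    by its own position in the alphabet."""
    out, prev = [], None
    for ch in word.upper():
        idx = ALPHABET.index(ch)
        if prev is None:
            out.append(ALPHABET[idx])                # "O" -> number 15 -> "O"
        else:
            out.append(ALPHABET[(idx - prev) % 26])  # distance, wrapping round the alphabet
        prev = idx
    return "".join(out)

print(differential_encipher("ODD"))  # -> OPA
```

As in the example, "O" maps to itself, the jump from "O" to "D" (16 letters, wrapping past "Z") gives "P", and the zero jump from "D" to "D" gives "A".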
But as the ratio of consonants increases, so too does the number of sequential consonants occurring on either side of a vowel. These are consonant clusters. Some languages - such as English - deal pretty well with consonant clusters, allowing three or even four consonants in a row: strengths is a canonical example, with three initial consonants and four**** final consonants around a lone vowel. Most languages don't allow such complex syllables to exist, with a word like twin being more typical of the most complex syllables allowed.

But the important thing is that no matter how complex syllables can be, they must adhere to a rule known as the sonority sequencing principle. Broadly: all sounds have a characteristic known as sonority; vowels have the highest sonority of all sounds; a syllable should have a single peak of sonority; and sonority within a syllable should rise to and fall from that peak regularly. In short, sonority is highest nearest the vowel.

So, if I tell you that /p/ has a lower sonority than /l/, which naturally has a lower sonority than the vowel /a/, we can compose the valid syllables /pal/, /lap/, /alp/, and /pla/ from these three sounds, and the invalid /lpa/ and /apl/. For the first two, sonority goes up and then down, with the peak at the vowel; for the next two, sonority starts high and falls, or starts low and climbs, but again with the vowel as the high point; but for the last two, sonority drops then rises, with both /a/ and /l/ being peaks on either side of /p/.

The principle is not hard and fast, but it is a good guide. It is also a universal rule, governed by how humans speak rather than by any given language. We internalize it as we learn to speak and can replicate it even without knowing of its existence. Thus when Rene and Landini made EVA pronounceable, they were implicitly seeking out the sonority curve and applying sounds to it that they knew would fit.
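Emma's /p/-/l/-/a/ classification can be checked mechanically. In this toy sketch the numeric sonority values are my own illustrative assumptions (only their ordering, /p/ < /l/ < /a/, comes from the discussion above), and `obeys_ssp` is a hypothetical helper:

```python
# Toy check of the sonority sequencing principle for /p/, /l/, /a/.
# The numbers are illustrative; only their ordering matters here.
SONORITY = {"p": 1, "l": 2, "a": 3}

def obeys_ssp(syllable):
    """True if sonority rises to a single peak and then falls."""
    vals = [SONORITY[sound] for sound in syllable]
    peak = vals.index(max(vals))
    rising = all(a < b for a, b in zip(vals[:peak], vals[1:peak + 1]))
    falling = all(a > b for a, b in zip(vals[peak:], vals[peak + 1:]))
    return rising and falling

print([s for s in ("pal", "lap", "alp", "pla", "lpa", "apl") if obeys_ssp(s)])
# -> ['pal', 'lap', 'alp', 'pla']
```

The four syllables the check accepts and the two it rejects are exactly those given above: /lpa/ and /apl/ fail because sonority dips and rises again, producing two peaks.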
The assignment of EVA characters is neither random nor, importantly, meaningless. Moreover, as the number of different characters within a text increases, the only two options available to make that text pronounceable are either to make more of the characters vowels or to assign characters values that increasingly approximate the sonority hierarchy.

* Consonants can form the nucleus of syllables, but this is less common and typically marginal in any given language.
** A few languages don't permit vowel-only syllables. The Voynich candidate-darling Hawaiian is supposedly one, but I think this is wrong.
*** That is, phonologically valid. Such words can be pronounced even if meaningless.
**** Even though written as three sounds, /ng th s/, it is often pronounced with an inserted /k/: /ng k th s/.

RE: Syllabification - Sam G - 09-03-2016

(09-03-2016, 03:50 PM)Anton Wrote:
(09-03-2016, 02:03 PM)Sam G Wrote: This is a verbose cipher... every instance of the same plaintext word would yield the same sequence of ciphertext words

It would become apparent pretty quickly that "a" and sequences of "o" are equivalent, and that would allow the repeated sequences of words to be uncovered.

Quote: Basically, what makes the ciphertext "unpronounceable" is the comparatively low number of letters signifying vowels in the alphabet relative to consonants. I think that if one uses an alphabet with roughly the same number of vowels and consonants, any ciphertext will be more or less "pronounceable".

I basically agree (though I think you also need at least some regular structure to prevent occasional long runs of consonants), but the VMS text is actually pronounceable with far more consonants than vowels, which is harder to achieve.
Quote: Another cipher technique for making an unpronounceable ciphertext pronounceable would be simply filling in "filler vowels" according to a pre-defined pattern.

The problem here is that the VMS vowels aren't thrown in at random. They have certain places within words where they can go, so you would need to explain the existence of those "vowel slots", which is going to be basically the same as accounting for the vowels themselves.

Quote: Yet another technique would be separately enciphering the consonants and vowels of the plain text. I think that is the trick that would make any ciphertext pronounceable. Consider an example which is neither substitution nor verbose. I don't know whether it is a known cipher; I just borrow it from the differential encoding used in the telecom world. Here, each subsequent character is not enciphered per se; rather, what is enciphered is the difference between it and the preceding character. E.g., the English alphabet runs A, B, C, D, etc. Let's assume we encode each line from scratch and that the first word of the line is the word "odd". "O" is the first character of the line. It is enciphered just by its number in the alphabet, i.e., "15". The second letter is "D", which is 16 letters forward from "O" ("O" itself being counted). Thus "D" is enciphered with the number "16". The third letter, "D", is one letter distant from the second letter "D" (because we count the letter itself), hence it is coded with "1". We could just as well measure the distance beginning with 0 rather than 1, but beginning with 1 is convenient for the subsequent mapping of numbers to ciphertext letters. In our example, 15 is "O" in the English alphabet, 16 is "P", and 1 is "A". So we get "OPA" as the ciphertext for "ODD".

This is clever, although without thinking about it too much, it's not clear that it would be fully invertible once you split the consonants and vowels into separate rows, since you would then have more numbers than letters within each row.
In any case, though, you're basically preserving the consonant/vowel distinction of the plaintext here, so this example actually reinforces my larger point that it's hard to see the apparent consonant/vowel distinction made in the VMS text as meaningless. (And further considerations of word structure and the like would of course show that the VMS text was not actually produced by this method.)

RE: Syllabification - -JKP- - 09-03-2016

Abjads (scripts without vowels) by their very nature need to be more orderly and regimented than scripts that include vowels from their inception. The system of adding in the vowels (something that is done in the head, not on the stone) is based on grammatical rules, so that words that look identical can be distinguished by context (e.g., "book" and "writer").

RE: Syllabification - Anton - 09-03-2016

Quote: This is clever, although without thinking about it too much, it's not clear that it would be fully invertible once you split the consonants and vowels into separate rows, since you would then have more numbers than letters within each row.

No, each row is numbered separately:

B, C, D, F... -> 1, 2, 3, 4... -> ciphertext consonants
A, E, I, O... -> 1, 2, 3, 4... -> ciphertext vowels

Since the decrypter knows that consonants are encoded with consonants only, and vowels with vowels only, he uses the consonant row whenever he encounters a consonant in the ciphertext, and the vowel row whenever he encounters a vowel. This way the cipher is unambiguously reversible.

***

Of course, the examples I provided are not claimed to be close to Voynichese; they are just two worked out offhand as suitable examples of a "pronounceable ciphertext".
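The two-row scheme and its reversibility can be sketched as follows. This is my own reading of the thread: each row is enciphered independently with the distance convention carried over from the "ODD" -> "OPA" example, and the function names are mine. Note that this sketch yields a slightly different ciphertext for "this is a cipher" than the hand-worked "TNIM AB U KINSYL", which is consistent with Anton's own "if I made no mistake" caveat.

```python
VOWELS = "AEIOU"
CONSONANTS = "BCDFGHJKLMNPQRSTVWXYZ"

def _row_of(ch):
    # Pick the numbered row a character belongs to (None for spaces etc.)
    if ch in VOWELS:
        return VOWELS
    if ch in CONSONANTS:
        return CONSONANTS
    return None

def encipher(text):
    out, prev = [], {}                     # prev: last plaintext index seen, per row
    for ch in text.upper():
        row = _row_of(ch)
        if row is None:
            out.append(ch)                 # pass spaces through unchanged
            continue
        idx = row.index(ch)
        last = prev.get(row)
        # First letter of a row: its own position; afterwards: the wrapped distance.
        out.append(row[idx if last is None else (idx - last) % len(row)])
        prev[row] = idx
    return "".join(out)

def decipher(text):
    out, prev = [], {}                     # prev: last recovered index, per row
    for ch in text.upper():
        row = _row_of(ch)
        if row is None:
            out.append(ch)
            continue
        num = row.index(ch)
        last = prev.get(row)
        # Re-accumulate the differences within each row independently.
        idx = num if last is None else (last + num) % len(row)
        out.append(row[idx])
        prev[row] = idx
    return "".join(out)

print(encipher("THIS IS A CIPHER"))            # -> TPIM AB O LINTUL
print(decipher(encipher("THIS IS A CIPHER")))  # -> THIS IS A CIPHER
```

Because a ciphertext vowel can only encode a plaintext vowel (and likewise for consonants), the decrypter can re-accumulate each row's differences independently, which is exactly the point about unambiguous reversibility made above.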
Personally, I expect that the most productive approach to decrypting the Voynich is the theory-independent one, since it elegantly circumvents the possible pitfalls of "cipher" and "language" theories by postponing their application until some semantic relations are established, or at least guessed at, between some Voynichese vords and certain objects or notions. Contextual analysis may help to trace those relations.

RE: Syllabification - ReneZ - 10-03-2016

Pronounceability is quite subjective, of course. Čtvrt is a perfectly pronounceable word for Czech speakers, while many Asians cannot pronounce the English word shrimp.

RE: Syllabification - crezac - 12-03-2016

(08-03-2016, 02:39 PM)Sam G Wrote:
(07-03-2016, 12:04 AM)Anton Wrote: Even if one puts aside the cipher theory, I am afraid that no syllabification is possible without our understanding of the alphabet. There is no confirmation that any of the transcription alphabets, EVA included, accurately represents the real alphabet adopted by the author.

CR: Or the low entropy is telling you that EVA has some mistakes in it, and that iii, ii, and i may be three distinct characters, or three distinct combinations of anywhere from 1 to 4 characters - that the phonotactic structure isn't that rigid because you have a larger character set than you think you do. Maybe it's telling you that you don't know the alphabet as well as you think you do. Inconvenient if true. But believing that anything about the VMS "has been well-established for a long time, and is obvious to begin with" is hubris and limiting.

RE: Syllabification - crezac - 13-04-2016

I was doing some reading on the Cherokee syllabary recently. I found it interesting that the A, E, and I sounds are represented by the symbols D, R, and T.
So if the VMS is written in an artificial alphabet used to represent one or more languages that had no written form, there is at least one contemporary example of a similar approach. And in that example, while characters are borrowed from other writing systems, they do not represent the same phonemes.

RE: Syllabification - Diane - 14-04-2016

This is a very interesting discussion, but I'm puzzled about why my name should have been brought into it. Since Anton couples my name with that of a Professor of Linguistics, I wonder if it isn't a slip, and the person meant was another linguist, such as Anna May Smith?

Anton said:

Quote: No, I am not disputing this... but there are researchers who are. I won't speak for Bax or O'Donovan or others (let them defend their points of view themselves). What I mean is that scientific discourse should be based on criteria of scientific truth, not on assertions like "this is clear" or "this is evident" or "thus spake D'Imperio".

My research has had nothing to do with linguistics, nor with the written part of the text at all. I work within my own field, which is the provenancing of imagery in problematic artefacts. Since my conclusions are usually presented online with some of the historical and comparative iconographic evidence which led me to them, and since no other qualified person has ever disputed either the evidence or the conclusions, I do not see that I have any need to defend my conclusions at present. Vague sneers and determined avoidance have been pretty much the only reactions I've seen from Voynicheros between 2008 and the present, so for that reason too I have had no reason to defend any of my conclusions, comparative evidence, or reasoning. They have never been challenged or argued against.

I wonder if Anton has misunderstood my use of the term "Latin Europe". It is used of that region whose dominant culture was Christian and whose official common language was Latin.
I might have said "western Christendom", but that term is out of favour. Saying that the imagery is not a product of that region and its dominant culture says nothing about the written part of this text. As an example: suppose that an early copy of Aratus had been discovered, and that while its imagery was copied exactly, the text was translated into a form more congenial to its present time and owners. The imagery would then continue to speak of a non-Latin, non-Christian culture, but the text might well be in Latin.

Correctly identifying the region(s) and time(s) of first enunciation of this imagery, noting signs of later alterations and additions *to the original matter*, and finally positing a time and cultural environment for the exemplars informing our present copy was not an easy task, but one of these days it may prove useful to those interested in what the manuscript actually contains, where the material came from, and so forth.

Perhaps Professor Bax - like Anna Smith, or Rene Zandbergen, and everyone else - feels his work is also likely to be of use to others in the longer term?