The Voynich Ninja

Full Version: Has anyone ever "deciphered" a paragraph?
(13-04-2021, 01:46 AM)obelus Wrote:
(12-04-2021, 08:14 PM)ReneZ Wrote:
(12-04-2021, 05:41 PM)Mark Knowles Wrote: It is perfectly possible for a cipher to decrease entropy.



It would be of great interest to have a historical example of this.



Morse code is a fine example.  The alphabetic representation of English is highly redundant at 4 bits/character;  Shannon himself demonstrated by experiment that the information content of written English is in the range 0.6-1.3 bits/character.  By exploiting the relative frequencies of English letters, Morse code compresses text to approximately 2.5 bits/character.

There's some misunderstanding here.

First, more bits per character mean less redundancy, not more. The more information each character carries on average, the fewer characters are needed to convey your message, and hence the more efficient (or less redundant) your writing is.

So, if something is 4 bits per character, and then some procedure makes it 2.5 bits per character, then it's expansion, and not compression.

Second, it cannot be repeated too often that two H1 values are not directly comparable for alphabets of different sizes. What should be compared instead are the H1-H0 or H1/H0 values.

Third, I guess that the figure attributed to Shannon here, and compared with the English H1 of 4 bits/character, is not H1 but rather the so-called "entropy rate", which is the limit of the N-th order conditional entropy as N approaches infinity. The two parameters are not directly comparable, of course.

Fourth, I don't know where the 2.5 figure for Morse code comes from. Assuming an alphabet of 3 characters (dot, dash, and pause (space)), no character entropy (of any order) for texts so encoded can exceed 1.58, since that is H0 (the base-2 logarithm of 3), and the entropy of any order N cannot exceed H0.
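
To make the quantities in the points above concrete, here is a minimal Python sketch. It is an illustration only and not part of the original posts: the function names, the five-letter Morse table, and the toy sample string are all assumptions chosen for the example. It computes H0 (log2 of the alphabet size), H1 (single-character entropy), a second-order conditional entropy, and the ratio H1/H0 that can be compared across alphabets of different size.

```python
import math
from collections import Counter


def _entropy(counts):
    """Shannon entropy (bits) of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def h0(text):
    """Zeroth-order entropy: log2 of the number of distinct symbols.
    Assumes text is non-empty."""
    return math.log2(len(set(text)))


def h1(text):
    """First-order (single-character) entropy in bits/character."""
    return _entropy(Counter(text))


def h2(text):
    """Second-order conditional entropy: H(pair) - H(first character of pair).
    Assumes text has at least two characters."""
    pairs = Counter(zip(text, text[1:]))
    return _entropy(pairs) - _entropy(Counter(text[:-1]))


def relative_h1(text):
    """H1/H0; defined as 0.0 when the text uses a single symbol (H0 = 0)."""
    base = h0(text)
    return h1(text) / base if base > 0 else 0.0


# Hypothetical toy sample: a five-letter subset of Morse code, with a space
# standing in for the pause between letters.
MORSE = {"e": ".", "t": "-", "a": ".-", "n": "-.", "s": "..."}
plain = "tenseantsatenets"
coded = " ".join(MORSE[ch] for ch in plain)

for label, text in (("plain", plain), ("morse", coded)):
    print(label, round(h0(text), 3), round(h1(text), 3),
          round(h2(text), 3), round(relative_h1(text), 3))
```

Whatever sample is used, the Morse-coded string has only three symbols, so its H0 is log2(3), about 1.58, and its H1 cannot exceed that. The sample here is far too short to say anything about real English or real Morse traffic; it only shows how the numbers are obtained.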
(13-04-2021, 09:40 AM)MarcoP Wrote: The fact that 'þe' is equivalent to 'the' is quite obvious, since the word is so frequent. Once you understand this, it's easy to see that (like in modern English) the article appears at the start of noun phrases and is typically followed by either a noun or an adjective (þe roote, þe flour, þe ȝonge sonne, þe Ram). Like in modern English, 'his' behaves similarly to 'the' (his schowres swoote, his swete breeth, his halfe cours). Like in modern English, 'and' connects two grammatical structures of the same kind (e.g. sentences or noun phrases). You can basically start from function words, which are almost identical to modern English, and work from there.
I can have trouble with the meaning of words like 'schowres',  'holte' or 'heetħ', but identifying part-of-speech categories is almost always straightforward, so it's easy to keep track of grammar, even when some of the meaning is lost.
The true Voynich translation will allow us to do just that: start from function words, identify basic grammatical structures and finally get to word meanings and translation.

The fantasy that language structure is a modern invention cannot possibly lead to anything interesting.

Also, even the word "the" will not necessarily be so easy and obvious to identify in every Middle English manuscript. As stated [link], "ẏ" was a Middle English scribal abbreviation for the definite article "the" / "ye" (note also the two spellings). See further the information [link]: "Middle English, y- (prefix) is often confused with [link] (pronoun) or with [link] ([link]) or [link] (article, definite) and the thorn [link] due to typographic variation".

To make a speculative but logical comparison, the Voynich ms author (or one or more of its Hands) could have similarly used, let's say, EVA [ch-] prefixed to a noun to represent the definite article "the" / "ye". This word could also sometimes have been written out as EVA [chey] or [chy]. Incidentally, it so happens that this would be consistent with the idea of EVA [chedy] as English "yes".

Geoffrey
(13-04-2021, 01:32 PM)geoffreycaveney Wrote: if the transcription makes a basic parsing mistake in the 8th line of one of the most famous passages in all of English literature, it does not inspire confidence in the statistical accuracy of the rest of the transcription and a word frequency analysis based on it.


From the [link] of the site hosting the transcription:

Quote:The Norman Blake Editions is a series of online editions which present full diplomatic transcriptions of the key, surviving manuscripts of Chaucer's Canterbury Tales.

In a diplomatic transcription of Harley 7334, the word must be transcribed as "I ronne" because this is how it was written by the scribe. The goal of the site, and its relevance to some Voynich researchers, is exactly that it allows one to compare how different scribes wrote the same text. It seems you have no idea of what a diplomatic transcription is and you make the basic mistake of criticizing things that you do not understand at all.

Of course the vast majority of the occurrences of "I" in the ms correspond to the personal pronoun: this can be easily checked by anyone who cares.
Marco, thank you for providing the additional information about the nature of the transcription, which was not made clear in your previous post. 

Now we can focus on the much more interesting and important topic, I think, of the Middle English scribal abbreviation "ẏ" for the definite article "ye"/"the" and the idea of EVA [ch-] as a prefix representing this abbreviation for the article. Then EVA [chey] and/or [chy] as the article as an independent word make logical sense too. EVA [chedy] as English "yes" would then be a speculative but interesting idea to consider as well. You say "start from function words, identify basic grammatical structures": this is a hypothesis to investigate along these lines. I note by the way that EVA [ch] even looks somewhat like a form of Middle English script "y" / "ȝ" turned on its side.

Geoffrey
"I have stopped arguing with reason. Bye."
I'm going to try to wrench this back towards, if not wholly onto, the topic...

I think if the Voynich is solvable, then it will be possible to convert a paragraph in a repeatable way. I'm not sure if I'd agree that means the theory must be a one-way cipher without room for interpretation, though I can understand why. You're right that no "solution" so far has been repeatable (except in the mind of the "solver"), but I'd attribute that to how all solvers so far have unfortunately approached their solutions, rather than to a consequence of the text itself.

One thing that seems clear is that the "true" Voynich solution - assuming it exists - is not going to be simple.  It would have been found already, since a succession of brilliant people have been working on it for decades.  It's not going to be something that a "fresh pair of eyes" suddenly spots, which unfortunately seems to be the mindset most people new to the manuscript have.

I imagine that if there's a solution out there, it's going to seem messy and complex, and appear to offer unattractive degrees of freedom. But the important thing the solver will need to do - and none has bothered so far - is to translate not just a paragraph but multiple pages in order to try to establish:

  1. a clear correspondence between their translation and the images/content of the pages that is so strong and consistent that sceptics would be hard-pressed to ascribe it to coincidence or selective interpretation
  2. clear and consistent grammatical rules in the original text, to the standard and extent that the apparent initial degrees of freedom are reduced, and the solver's interpretation no longer seems selective to those without an emotional investment in the theory.  These are more likely to emerge over the course of multiple pages, rather than just from a few lines. 

That may seem a big ask when we've yet to see even a proper paragraph by a repeatable method.  But, given the likely complexity of Voynichese, I don't think it's even possible to work out the system at play without working through a large chunk of the text, let alone ensure it is repeatable.  If it was as simple as someone stumbling on the initial foothold and then identifying the language, it would have been done already.  If a foothold is found, I can imagine it won't even be clear at the time that it is a foothold, and there will be painful crawling, sometimes back as well as forward, without knowing if one is anywhere near the top of the mountain.

So I doubt it is possible to work out the system enough to provide a letter to letter correspondence list without having developed it through translations of multiple pages.  I'm intensely sceptical of systems that are developed and then proposed on the foundation of a mere phrase or paragraph. 

And unfortunately that's what everyone does.  No one bothers to work their way through a quire with their system before proposing it as the next big theory/solution.  I think the excuse is usually that it's best left up to experts/native speakers.  But that's because they've run into difficulties extending the application of their system to a wider context, beyond the initial one or two words that seemed a perfect-but-coincidental match.  Or because extending the application of the system requires too much work, which itself implies the solver's brain is overburdened with the task of interpreting, filling in the massive gap left by their system which cannot produce repeatable results without the helpful brain. 

So, I think the issue is with the approach all those who have proposed solutions take, and I'd like to think that there is a viable - if complex - solution out there eventually to be dug up. 

TL;DR: if I were world dictator, I'd ban anyone from submitting a solution unless they've done a rough translation of at least 20 pages from across the manuscript. And I'd make a form for them to fill out, explaining how their solution addresses each of the specific Voynichese mysteries.
Quote: ...if I were world dictator, I'd ban anyone from submitting a solution unless they've done a rough translation of at least 20 pages from across the manuscript....

I'm pretty sure if someone could "decrypt" (I'm using the term loosely) a full paragraph, that it would generalize in one way or another (conceptually if not specifically) to the rest of the manuscript.
"I have stopped arguing with logic. Bye."
I actually agree with tavie 100%. Now tavie and others may be surprised to see me write that, since they have seen me propose multiple ideas and theories about different languages (Judaeo-Greek a couple years ago, Old Polish more recently, now Middle English!?) and hypothesize correspondences, propose readings and interpretations of key words, phrases, lines, etc., on this forum. But my ultimate goal each time has always been to accomplish what tavie insists upon. I wouldn't be satisfied with my own solution until an established reputable professional scholar of the language that my theory proposes is convinced by my work and agrees with me. And yes, I expect it would take exactly what tavie insists upon in order to convince such a scholar.

Until that quantity and quality of work is accomplished, I view all of my own ideas, theories, correspondences, readings, interpretations of words, phrases, lines, etc., as provisional hypotheses. I admit that I do see Voynich Ninja as one big brainstorming session, and I view my posts about my theories as my contributions to that brainstorming session. There is a principle that some teams of colleagues adopt in brainstorming sessions, called "Say everything". In other words, don't keep your ideas to yourself, share them with the team to see if others have a useful or productive comment to make about them, whether that be criticism, pointing out weaknesses, or suggesting additional ideas that improve or refine the original idea or hypothesis. Since I view this Voynich Ninja forum in that spirit, I don't consider that I have ever "published" any of my theories about the Voynich manuscript text yet. I did post a draft document about my Old Polish theory to this forum recently, a draft which I had written last fall, but again that was a draft, not a final paper submitted for publication.

I found many of the critical comments about my Judaeo-Greek theory two years ago to be quite useful in pointing out the weaknesses of the hypothesis. Each character was too ambiguous, it could correspond with too many different possible Judaeo-Greek letters. Marco even wrote a text in "ambiguated English" as part of the effort to point out the weaknesses and difficulties with such a writing system. I'm glad I went through that discussion, so that I eventually realized my theory wasn't convincing. I thought the Old Polish theory was better, mainly because the idea of a verbose cipher (e.g., EVA [ok] = one letter, etc.) that I borrowed from Koen and others greatly reduces the ambiguity of the letter correspondences. But by posting my ideas to this forum, eventually multiple Polish speakers gave me negative feedback on multiple proposed interpretations of different passages on different topics. I appreciate in particular Gab19's effort to read my recently posted document and confirm that his feedback was still negative. I am stubborn, but not incorrigibly so.

On the other hand, I will still say that it is particular feedback about the weaknesses of a particular method or about the lack of quality of a resulting text in the target language that eventually persuaded me that my hypotheses were not panning out as promisingly as I had once thought they might. Just a general reference to the "four-step process" as a blanket dismissal of all such possible hypotheses, I'm sorry, I still don't find that argument persuasive. It is the critical feedback on the particular details of a particular method or of a specific passage of interpreted text that carries more weight. Some may believe I'm just making the same mistake over and over again; I disagree. I'm making one mistake, trying to learn from it and aiming to avoid making it again, and then making a different mistake, and hopefully learning from that one as well. Now the Middle English thing started as half-joking or mostly joking, but now I would say, who knows? From the general critical comments of Marco and others, I gather that just interpreting more lines of [link] or whatever may be the wrong approach. It seems to me that an idea like EVA [ch-] = English "y" abbreviation for "ye"/"the" could be a better and more systematic structural approach to such a hypothesis. Or more generically if you prefer, EVA [ch-] = a definite article in a European language written with a less common letter ("y" rather than "d", "t", "l", etc.) and thus occurring less often in other non-initial positions, or something along those lines.

Sorry, I've rambled on too long. Tavie's post is more important than mine. I agree with it completely, even if some may believe I don't follow his advice in practice. My approach to investigating the Voynich ms text is different than others' approaches. Personally, I learn more and think of more ideas when I have a particular language in mind, rather than by generic statistical study of the structure of the ms text without any specific language in mind to give it context. I hope I can learn things from others, and I hope others can learn things from me, if there is anything there worth learning.
(13-04-2021, 09:01 PM)-JKP- Wrote: I'm pretty sure if someone could "decrypt" (I'm using the term loosely) a full paragraph, that it would generalize in one way or another (conceptually if not specifically) to the rest of the manuscript.

That's my point - it's far too easy to link one paragraph alone to the concepts in the manuscript, and so it wouldn't be sufficient.  We've seen lines or even paragraphs before that could be related conceptually to the manuscript, but not enough to prove it isn't the result of coincidence or selective interpretation, conscious or unconscious.  Multiple pages would make it harder to conclude either of those. 

Anyway, my overall belief is that by the time a paragraph has been fully decrypted (in the sense of the genuine solution), the solver will have multiple paragraphs almost at that stage too because of the work required across the whole manuscript to unravel a complex system and all its rules (and will need to complete them to confirm the rules).   And if they don't, then that's a sign that either there's something wrong with their theory ... or somehow everyone has missed the signs of an incredibly simple system for decades, despite people attempting to plug in every known live and dead language under the sun.  And despite the obvious complexities in Voynichese...

If the solution were simple, a straightforward one-way cipher, then it probably would be more of a linear progression:  a foothold would be a breakthrough, translating one paragraph, with character identifications proceeding like dominos.  That one paragraph might be sufficient to establish the majority of the characters. 

But if the solution were complex, then the foothold is not a breakthrough. There are going to be contradictions to your identifications, you will get nonsense when you plug in the letters and find yourself straining to explain it away, and the rules will not be discernible from that one paragraph, or indeed easily discernible at all. And so you need a lot more material in order to try to work out what the rules are, likely from other quires and scribes. So that's why I would expect that by the time (if!) a person has ever got a genuine solution for a full paragraph, they will have many others in a nearly complete state and would want to complete them in any case to confirm the extremely complex rules of the system.