The Voynich Ninja

Full Version: How to recombine glyphs to increase character entropy?
(21-04-2022, 08:08 PM)Ruby Novacna Wrote:
(21-04-2022, 06:54 PM)Searcher Wrote: I transliterated EVA to make almost full alphabet text.
All the calculations of letter frequencies were already done 20 or 30 years ago; I don't remember whether a distinction was made between the A and B languages. Nick has spoken several times on his site about the need to treat them separately.

And I still don't understand why, at this stage, you would replace the EVA letters, since you don't know their true values anyway. Moreover, I didn't see EVA q in your alphabet.
I didn't have much time to run all the tests. I have the whole text in one file. To test the A and B languages separately, I would need to pick them out from another source, for example voynich.nu. I have never made such tests, so I can't do them quickly. My main aim wasn't just to make calculations. I am showing that the shares of many Voynich bigrams in the text are too high for normal languages, as I wrote in my post. None of the changes makes the statistics much better; I made them to show how much they influence the statistics. It seems Koen has already explained the same thing.
In my transliteration, q is a null (for some reason). You can ignore this nuance, as it doesn't affect the overall picture either.
Fortunately for me, I am not very interested in statistics, because it makes me dizzy. Especially as finding texts to compare is an impossible mission.
I should note that I don't have programming skills, so I work in more primitive ways, with the help of available tools (Word, WordPad, etc.). However, by showing these examples and comparisons, I wanted to clearly demonstrate the essence of my question. If more than ten bigrams each occur more often than is normal for ordinary languages, how can this be changed? Do you (everyone who tests with programming) mean that two (or even three) simple glyphs from a frequently repeated n-gram can stand for a single letter of plain text? What approaches could you take to reduce the frequencies of such bigrams or to increase those of the rare bigrams? Testing and numbers are good, but I would like to see a visual, logical explanation with specific examples. Or will the task of checking the data arise only after the machine gives suitable results?
I ran the same tests on the text of the two folios 85 and 86. Again, too many bigrams (about ten) exceed the normal level of frequency.
[attachment=6442]
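For anyone who would rather script this kind of count than do it by hand, here is a minimal sketch of a bigram-share calculation. It is only an illustration under stated assumptions: each character of the transliteration is treated as one glyph, bigrams are counted inside words only, and the sample string is a handful of EVA-style words typed by hand, not an excerpt from a real transcription file. The function name bigram_shares is made up for this example.

```python
from collections import Counter

def bigram_shares(text):
    """Return {bigram: share of all within-word bigrams} for space-separated text."""
    counts = Counter()
    for word in text.split():
        # Count adjacent glyph pairs inside each word only, never across spaces.
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    total = sum(counts.values())
    return {bg: n / total for bg, n in counts.items()} if total else {}

# Hand-typed EVA-style words, for illustration only (assumption, not real data).
sample = "daiin chedy qokeedy daiin shedy qokedy chedy"
shares = bigram_shares(sample)

# Print the five most frequent bigrams with their share of all bigrams.
for bg, share in sorted(shares.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{bg}: {share:.1%}")
```

Running the same function on a comparison text in an ordinary language gives the second column for a side-by-side table; which comparison texts are appropriate is a separate question that the script does not answer.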
I support anyone who is interested in "solving" the text to perform the basic frequency analyses like Searcher.  Nothing is more convincing than looking at the numbers yourself and realizing the issues that are being faced in getting Voynichese (at the glyph level) to look and act like natural language.

Maybe this explanation will help in understanding the issue better.  This approach is far, far from being unique to me, but it helps me keep on track when trying to come up with ideas to test.

Voynichese is incredibly contradictory in that, at some levels of examination, the statistical results look like something that could be a natural language.  However, these initial impressions do not hold when moving from level to level.  The place where Voynichese becomes most dysfunctional, in my opinion, is at the glyph level.  Of course, there is controversy over what constitutes "a glyph" and that is where all the various transcription approaches come into play.  But that issue aside, the fact remains that the text does not behave like any historically feasible natural language in a way that would allow particular glyphs (no matter how they are defined) to be matched to how alphabetic letters are used in such languages.

One major issue is the low character entropy exhibited by Voynichese.  Having one glyph in hand, it is simply too predictable what the next glyph is.  Searcher described it a different way, but it is just another way of stating the same issue -- certain bigrams (and trigrams) are enormously over-represented if the text is to be seen as a natural language.  In my opinion, the clearest conclusion from this result is that it is unlikely to the point of impossible for Voynichese to be a 1:1 substitution no matter how many languages/dialects are looked at.
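To make "too predictable what the next glyph is" concrete, here is a rough sketch of one common way to put a number on it: the conditional entropy of the next character given the current one, estimated as the entropy of adjacent pairs minus the entropy of the first member of each pair. The function names are made up for this example, and treating every character (including spaces) as one glyph is an assumption; a different transcription or glyph definition will give different values.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def conditional_entropy(text):
    """Estimate H(next | current) in bits from adjacent character pairs."""
    pairs = Counter(zip(text, text[1:]))   # (current, next) pairs
    firsts = Counter(text[:-1])            # the 'current' member of each pair
    return entropy(pairs) - entropy(firsts)

# Toy strings only: in the first, the next character is fully determined
# by the current one; in the second it is not.
print(conditional_entropy("abababab"))
print(conditional_entropy("aabbaabb"))
```

A value near zero means the next character is almost fully determined by the current one. The complaint in this thread is that Voynichese scores low on this kind of measure compared with ordinary language texts.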

So "something else" or, almost certainly at this point, a number of "something elses" have to be going on, if you are going to hold with the position that Voynichese is a repeatable cipher process with a clear text underneath.

Koen's work isolated one possible "something else" approach.  Can collapsing certain bigrams and/or trigrams into single entities "fix" the very low character entropy?  And in particular, which bigrams and trigrams give the best "fix" to the low character entropy?  This is a complex problem, and to answer it Koen had to act like a computer optimization program.  RobGea has subsequently (in this thread) confirmed his results using more traditional computer testing.
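As a rough illustration of what "collapsing bigrams into single entities" means in practice, the sketch below tokenises a string so that each listed n-gram counts as one unit, then recomputes the conditional entropy on the tokens. This is not Koen's or RobGea's actual code. The merge list and the sample text are hypothetical placeholders chosen for the example, and the longest-match-first rule is just one reasonable way to resolve overlaps.

```python
import math
from collections import Counter

def conditional_entropy(seq):
    """Estimate H(next | current) in bits for a string or a list of tokens."""
    def H(counts):
        total = sum(counts.values())
        return -sum(n / total * math.log2(n / total) for n in counts.values())
    return H(Counter(zip(seq, seq[1:]))) - H(Counter(seq[:-1]))

def merge_ngrams(text, ngrams):
    """Split text into tokens, treating each listed n-gram as a single token.

    Longer n-grams are tried first, so 'aiin' wins over 'ai' at the same spot.
    """
    ordered = sorted(ngrams, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(text):
        for ng in ordered:
            if text.startswith(ng, i):
                tokens.append(ng)
                i += len(ng)
                break
        else:
            # No listed n-gram starts here: keep the single character.
            tokens.append(text[i])
            i += 1
    return tokens

# Hypothetical merge list and hand-typed sample, for illustration only.
merge_list = ["ch", "sh", "qo", "aiin", "ee"]
sample = "daiin chedy qokeedy daiin shedy qokedy chedy"

print("before:", conditional_entropy(sample))
print("after: ", conditional_entropy(merge_ngrams(sample, merge_list)))
```

Whether the "after" value is higher depends entirely on the merge list and the text, which is why choosing the list is an optimization problem rather than a single calculation.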

The resulting list of bigrams and trigrams is the list I am talking about.  Because I really wanted to solve the entropy issue and I like the idea of a verbose cipher, I have used this list (in combination with other cipher approaches -- namely some historical verbose ciphers) to try to come up with a way to encode Latin and German, specifically Bavarian, into Voynichese.  I started with Latin and German because you have to make choices when you do tests and these seemed feasible to me because of other characteristics of the manuscript (e.g. likely image sources).

But it remains that to date, I (like everyone else) am unable to find a group of approaches that allows for consistent (reproducible) cipherization of either of these languages into Voynichese.  In other words, there are other aspects of Voynichese that are not solved by the combination of these approaches alone, with the assumptions I have made. Please note that the problems that remain are pretty large and just switching to a different source language isn't going to "fix" these issues.  High among these issues is the issue of the Currier A and B "languages."

That being said, I, too, am quite interested in what Rene has found and hope he will be ready to share his approach at some point -- maybe at the conference?
(22-04-2022, 07:34 PM)MichelleL11 Wrote: the clearest conclusion from this result is that it is unlikely to the point of impossible for Voynichese to be a 1:1 substitution no matter how many languages/dialects are looked at.
To say today what is possible or impossible is, in my opinion, rather premature.
I will show this soon, well before the conference. 
The problem is that I am still too busy even just with other Voynich stuff.
When combining glyphs, words become too short, so one implication is that Voynich words should no longer be seen as plain text words.
I know what you want to say. I have seen it.
(22-04-2022, 07:34 PM)MichelleL11 Wrote: I support anyone who is interested in "solving" the text to perform the basic frequency analyses like Searcher.  Nothing is more convincing than looking at the numbers yourself and realizing the issues that are being faced in getting Voynichese (at the glyph level) to look and act like natural language.

Maybe this explanation will help in understanding the issue better…

Great write up, Michelle. This is, as we’d put it over on Reddit, a good ELI5 (“Explain Like I’m Five”).  The thing is, to anybody motivated and intelligent enough to dive into cryptography or linguistics to explore a mystery, the statistical analyses that rule out Voynichese being either a natural human language plaintext or a simple substitution cipher are not rocket science. They can be dry and disappointing, but not nearly as much of a letdown as spending huge amounts of time and energy pursuing a possible solution that was doomed from the start.

I also like how diplomatic your explanation is, which is something I’ve said to/about Koen as well. What you’ve described here is nothing that Nick P, Marco P, and Rene Z haven’t been explaining to noobs for years. All three of these luminaries, however, are wont to express it in a way that hints at a weariness of having to explain it anew so many times, and a frustration that it isn’t more obvious or more common knowledge by now. And after having put so much time and labor into discovering these properties of Voynichese, I get it. But this can be a bit offputting to that minority of starry-eyed noobs who are willing to do their homework but just need to know where to start.
(23-04-2022, 04:20 PM)RenegadeHealer Wrote: ... anybody motivated and intelligent enough to dive into cryptography ...
Let's not confuse our hobbies with academic work.