The Voynich Ninja
[split] Verbose cipher? - Printable Version

+- The Voynich Ninja (https://www.voynich.ninja)
+-- Forum: Voynich Research (https://www.voynich.ninja/forum-27.html)
+--- Forum: Analysis of the text (https://www.voynich.ninja/forum-41.html)
+--- Thread: [split] Verbose cipher? (/thread-3356.html)



RE: [split] Verbose cipher? - geoffreycaveney - 20-09-2020

(19-09-2020, 09:49 PM)RenegadeHealer Wrote:
(19-09-2020, 02:42 PM)geoffreycaveney Wrote: I want VCI transcription to be a tool that all Voynich researchers can use. The idea of [f]/[p] as a variant of [d] rather than of [k]/[t] had come to feel like a "pet" theory of mine last year, so I don't want to impose such a minority opinion of mine on a transcription system designed for general use.

Thanks for clarifying. I guess I didn't realize that you custom-built the idea [f]/[p]=[d] in service to your Judaeo-Greek theory. I thought it was an idea with some merit, independent of any idea of what the language might be, or even whether Voynichese is language at all.

I mean, I tried to pursue the [f]/[p]=[d] idea independently, but since it was already part of my Judaeo-Greek theory, it was probably difficult for me to remain completely objective about it. There may still be something to it after all. 

Quote:Last winter I was actually inspired by your idea to design a preliminary test of the hypothesis that [p] is a top-line-only equivalent of some other Voynichese glyph or string of glyphs. I describe the test here:
[link]
I'm going to run it when I get over the hump of learning to work with and statistically analyze VMs transcriptions, and post the results when I do.

That sounds interesting! Let us know what results your test produces. 

Quote:On the subject of a tool for general use, what do you foresee other VMs researchers using your VCI system to do? I can tell you what I plan on using it for: cribbing. One of these days I'm going to pick out some strategically placed lines of Voynichese that have been of interest as potential crib sentences due to their context with the imagery, and try using your tool to convert them to the graphemes you suggest. I'll then see if I feel the slightest activity in my hippocampus — the smallest sensation of "this feels vaguely familiar", and try to place it. If this is an actual promising lead, and not just wishful thinking with too many degrees of freedom, it should soon lead to more Voynichese text becoming comprehensible. If nothing I transformed with your (or anyone else's similar) tool made me say "aha", or none of the "aha" moments swiftly led to bigger "aha"s, then I'd have no further use for your tool. Is that more or less how you were envisioning others using your tool?

I think it's always worth taking a shot at cribbing with any new and plausible transcription system. You never know what you will find. But I see VCI as having a more general usefulness beyond that. I think it can be the equivalent of the EVA transcription system for those who want to look at the ms text as the possible result of a verbose cipher. In the examples of Voynich lines written in VCI that I have presented here, I think some folks have recognized that this may be an interesting and worthwhile new way to "read" the text. People can do statistical analysis on VCI values, just as they have for decades on EVA values. Who knows, maybe someone will discover something new this way. This may have merit quite apart from whether the VCI values are accurate phonemic representations of the original text or not.

Quote:
Quote:seems clear that the general working assumption of most Voynich researchers has been that [f]/[p] are most likely variants of [k]/[t], so VCI is faithful to that consensus.

What's your source for this? I'm surprised to hear you say this, because it's my understanding that this is far from a consensus. It's not an uncommon opinion, but it's not supported by the occurrence of all gallows (and [d]) together on some lines. A good example is [link] line 12.

Hmm, I guess I should say it's a more common opinion than any other that I'm aware of. At least, I'm not aware of any other hypothesis that attracts a significant amount of support. We all know that a tell-tale sign of obviously bad theories and decryptions of the Voynich ms is equating a common phoneme exclusively with EVA [p] or [f], as if that phoneme could only appear in the top lines of paragraphs and nowhere else. So there has to be some other explanation. When I proposed my [f]/[p]=[d] idea last year, Rene, for example, criticized it quite harshly and dismissed the results of my analysis as statistically insignificant. If he and others don't think [p] and [f] can represent distinct phonemes, and they don't think they can be the equivalent of [d], then I can't imagine what other plausible hypothesis there could be besides [p] = [t] and [f] = [k].

Quote:
Quote:At this stage I don't want to force too many such n-gram=single unit equivalences into the system. The system already has plenty of them. As it currently stands VCI can read [dch] as <ki>, and I don't want to get rid of any more vowels than I have to. Of course if the text ends up being Arabic or Maltese, then maybe we don't need all those vowels. But if it ends up being Czech or Irish or Basque, then we should expect to see a normal amount of vowels. Also, it is easy enough to treat <ki> as <kj> at a later stage, if we want to go in that direction.

Fair enough. I'm just curious, and looking to get more insight into how you built the model. I'm sorry if I missed this, but based on Koen and Marco's work, did you define numerical parameters for the cutoff between which n-grams to treat in your model as single units and which ones to continue treating as strings of n units each? If this were my project I probably would, just to make my tool more user-friendly and transparent.

I think I mentioned before that unfortunately statistics was not my area of specialization when I studied mathematics at an advanced level. So I did not dig into Koen's work in such detail as to identify those numerical parameters that you refer to. I accepted the final best results of Koen's work as presented on his blog, and used those results to develop VCI as much as I could. I made a few other tweaks for the sake of the internal linguistic (phonological) consistency of the system, but I tried to stick to Koen's results as closely as possible. Koen left [dch] as two separate units, so I did as well. 

Quote:
Quote:I think EVA [y], which is also VCI <y>, is a very important and complicated glyph in the ms text. I would not want to rush to equate it with EVA [a] and potentially lose essential distinct information that [y] actually contains and represents. If in the end [a] and [y] do prove to be equivalent, it will still be possible to detect that at a later stage in due time: "Linguy Latiny per se Illustraty, Pars I: Familiy Romany" is not a difficult cipher step to figure out. But if we equate them now, and [y] proves to be distinct, it will be more difficult to recover that distinction if we are all using a system that treats them and presents them as identical.
 

While we're at it, since I asked you for a source about the ideas [f]=[k] and [p]=[t], I should mention that my source for the idea [a]=[y] is Emma May Smith's blog. Like the equivalences your model allows, this one is controversial and by no means a consensus. But Emma gives some tantalizing clues that it might be true. I only mention this because I'm going to be keeping this possible equivalency in mind while trying out your tool, and I know you're on the lookout for possible glyph equivalencies to make your model better.

I think Emma's [a]=[y] proposal is a really interesting idea. I'm not convinced that it will work, but it's really interesting. 

By the way, this is a more complimentary comment than anything Emma has ever said about any of my ideas on this forum, ever. 

Quote:
Quote:Yes, I think I said in the post accompanying the VCI tables that [al] was the most difficult decision. The system is more internally consistent if EVA [al] = VCI <as>. But treating the bigrams [ol], [or], [al], [ar] as each representing a single unit was part of Koen's method in generating the 3.01 conditional entropy value for the ms text, and I aimed to have the VCI system respect that method as much as possible. I couldn't find any consistent way to force the treatment of [ar] as a single unit, so I let that go as <al>. (Please note that I was not even remotely thinking of Arabic when I made this decision--if anything, I was thinking of Slavic past tense verb endings!) But out of respect for Koen's method and raising conditional entropy, I forced EVA [al] = VCI <a> in the spirit of his verbose cipher analysis.

Thanks for taking the time to indulge my "couple" of questions, Geoffrey. I think the most challenging part of attempting to reverse-engineer an unknown system for encoding human phonemes is determining the right amount of detail and complexity to add before trying it. Add too little, and the system has too much ambiguity and too many degrees of freedom. Add too much, and chances are a lot of the details are wrong and will lead you astray. Also, spend too much time and energy building a model, and it's easy to get too emotionally invested in your masterpiece to accept it not working.

Yes, I agree, these are three very serious challenges confronting any research and investigation of the Voynich ms text!


RE: [split] Verbose cipher? - Aga Tentakulus - 20-09-2020

Now an alphabet is slowly emerging.
For some characters it is questionable whether they should be evaluated as single letters. The characters 4 and 9 do not occupy positions that would suggest treating them as such.
Except "Q".


RE: [split] Verbose cipher? - Aga Tentakulus - 20-09-2020

There is still the question of whether such variants should fill out the alphabet or be regarded as combinations.


RE: [split] Verbose cipher? - Koen G - 20-09-2020

Mr Tentakulus, can you please make an effort to stay on topic in threads? If there is something else you'd like to talk about, please make a new thread.


RE: [split] Verbose cipher? - Emma May Smith - 20-09-2020

[Nevermind.]


RE: [split] Verbose cipher? - ReneZ - 20-09-2020

Back on the topic of this thread, the work by Koen and Marco was of course intended to see how one can bring the entropy of the MS text to a level compatible with known plain texts by combining characters, i.e. trying to identify components of a verbose cipher. In the course of that, they had many dozens of different 'alphabets', as the plots on Koen's blog show.

As far as I recall, this iterative process did not include the option to 'undo' earlier combinations as a next step. If that were allowed, the number of different 'alphabets' would grow even further. Another thing they did not do was to consider certain characters as equivalent. Doing that really explodes the number of possibilities.

If one were to allow all that, a genetic algorithm indeed seems like the most suitable approach.
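
(As a concrete illustration of that suggestion, here is a minimal genetic-algorithm sketch in Python. Everything in it is an assumption for illustration only: the fitness function just measures distance from an arbitrary target conditional entropy, and the candidate n-grams, population size, and mutation rate are placeholders. It is not a reconstruction of Koen and Marco's actual procedure.)

Code:
import random
from collections import Counter
from math import log2

TARGET_H2 = 3.0  # illustrative target near known plain-text values; an assumption

def cond_entropy(text):
    """Conditional character entropy H(next | current), in bits."""
    pairs = Counter(zip(text, text[1:]))
    ctx = Counter(text[:-1])
    n = sum(pairs.values())
    return -sum(c / n * log2(c / ctx[a]) for (a, b), c in pairs.items())

def apply_merges(text, merges):
    """Replace each chosen n-gram with a single private-use character."""
    for i, ngram in enumerate(merges):
        text = text.replace(ngram, chr(0xE000 + i))
    return text

def fitness(text, merges):
    """Closer to the target conditional entropy is better."""
    return -abs(cond_entropy(apply_merges(text, merges)) - TARGET_H2)

def evolve(text, candidates, pop_size=30, gens=50, mut=0.3):
    """Each individual is an unordered set of n-grams to merge, so
    'undoing' an earlier combination is just dropping it from the set."""
    pop = [random.sample(candidates, random.randint(1, min(8, len(candidates))))
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda m: fitness(text, m), reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            child = list(dict.fromkeys(a[:len(a) // 2] + b[len(b) // 2:]))
            if random.random() < mut:  # mutation: toggle one candidate in or out
                g = random.choice(candidates)
                child.remove(g) if g in child else child.append(g)
            children.append(child or [random.choice(candidates)])
        pop = parents + children
    return max(pop, key=lambda m: fitness(text, m))

For example, evolve(text, ["ol", "or", "al", "ar", "qo", "ch", "ee"]) would search over subsets of those bigrams; treating certain characters as equivalent could be added as another gene type.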

Now coming to the VCI 'alphabet', this is not a transcription / transliteration, but an interpretation, and it is based on just one of the above-mentioned myriad of possible 'alphabets', all of which are equally (im-)probable.
It is based completely on speculation.


RE: [split] Verbose cipher? - nickpelling - 20-09-2020

(19-09-2020, 10:17 PM)Emma May Smith Wrote: By "presence of common groups" do you mean the bigrams themselves or that such bigrams are meaningful units?

I mean that these bigrams are well-known features with very high instance counts. It's not hard to notice that these groups are rarely found reversed: e.g. beside the 5507 instances of ol there are only 175 instances of lo. Similarly, qo = 5186, oq = 2.
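
(For anyone who wants to reproduce counts like these, here is a minimal Python sketch. The file name eva.txt is hypothetical, and it assumes a plain EVA transliteration with '.' as the word separator; real transcription files need more cleanup than this.)

Code:
from collections import Counter

# 'eva.txt' is a hypothetical plain EVA transliteration; '.' separates words.
with open("eva.txt") as f:
    words = f.read().replace("\n", ".").split(".")

# Count bigrams within words only, so no pair spans a word break.
counts = Counter()
for w in words:
    counts.update(a + b for a, b in zip(w, w[1:]))

for pair in ("ol", "lo", "qo", "oq"):
    print(pair, counts[pair])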


RE: [split] Verbose cipher? - Koen G - 20-09-2020

Rene, that's basically it, especially the way I had intended the experiment initially. With Marco's input, I ended up with a smarter approach, checking for those replacements that increased h2 without ruining h1. This ended up selecting for replacements including [o, a, i] but excluding [e, y], something I don't have an explanation for.
So in short, the transformations that are included are based on:

- Marco's frequency list of n-grams
- Whether or not h2 increases more than h1

All of this was done with lots of trial and error. It is almost certain that a more optimal path exists. It is definitely certain that Voynichese was not meant to be transformed in this exact way: there are too many inconsistencies in the system, and too many factors of Voynichese remain unaccounted for. I just picked a metric and arbitrary limits, so yes, there are countless ways to do this.
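
(One way to read that criterion in code, as a sketch rather than Koen and Marco's actual implementation: compute the single-character entropy h1 and the conditional entropy h2 before and after merging an n-gram into one placeholder symbol, and keep the replacement only if h2 gains more than h1 does. The placeholder token is arbitrary.)

Code:
from collections import Counter
from math import log2

def h1(text):
    """Single-character entropy H(X), in bits."""
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * log2(c / n) for c in counts.values())

def h2(text):
    """Conditional entropy H(next char | current char), in bits."""
    pairs = Counter(zip(text, text[1:]))
    ctx = Counter(text[:-1])
    n = sum(pairs.values())
    return -sum(c / n * log2(c / ctx[a]) for (a, b), c in pairs.items())

def keep_replacement(text, ngram, token="\ue000"):
    """Accept merging `ngram` into one symbol if h2 gains more than h1."""
    merged = text.replace(ngram, token)
    return h2(merged) - h2(text) > h1(merged) - h1(text)

For example, keep_replacement(text, "qo") tests the qo merge on a transliteration string (word breaks would need to be handled first).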

For further research I would really like to find out why the "system" selected against certain glyphs. This is probably well beyond my capabilities though.


RE: [split] Verbose cipher? - ReneZ - 20-09-2020

(20-09-2020, 01:46 PM)Koen G Wrote: It is definitely certain that Voynichese was not meant to be transformed in this exact way

I am not so sure of that. The source text, if there was any, was a handwritten one, and if the verbose 'encoding' was based on the handwriting details, there would be all sorts of special effects.


RE: [split] Verbose cipher? - geoffreycaveney - 20-09-2020

(20-09-2020, 01:02 PM)ReneZ Wrote: Back on the topic of this thread, the work by Koen and Marco was of course intended to see how one can bring the entropy of the MS text to a level compatible with known plain texts by combining characters, i.e. trying to identify components of a verbose cipher. In the course of that, they had many dozens of different 'alphabets', as the plots on Koen's blog show.

As far as I recall, this iterative process did not include the option to 'undo' earlier combinations as a next step. If that were allowed, the number of different 'alphabets' would grow even further. Another thing they did not do was to consider certain characters as equivalent. Doing that really explodes the number of possibilities.

If one were to allow all that, a genetic algorithm indeed seems like the most suitable approach.

Now coming to the VCI 'alphabet', this is not a transcription / transliteration, but an interpretation, and it is based on just one of the above-mentioned myriad of possible 'alphabets', all of which are equally (im-)probable.
It is based completely on speculation.

The most significant divergence of VCI from Koen's final (best) verbose cipher analysis that I can see is treating EVA [ar] as two units rather than as a single unit. Yes, that was an element of interpretation. Also, equating EVA [qo] and [qok] was a significant adjustment. Other differences appear to me to be mainly cosmetic, and can be transparently adjusted as necessary: It is a simple matter to express EVA [lch] as <sj> rather than <š>, EVA [rch] as <lj> rather than <l'>, etc. I made it very clear that I had an objective reason for treating all [ch] ligatures and glyph sequences in VCI in the manner that I did, motivated by the very low [och] / [ch] ratio in comparison with [ockh] / [ckh] and [octh] / [cth]. But again, it is simple enough to express EVA [lch] as VCI <sj>, [rch] as <lj>, etc., and they are exactly in line with Koen's best version. 

In many cases one can add the lower case vs. capital distinction if desired where VCI equates two glyphs or sequences: For example, one could express EVA [r] as VCI <l> but EVA [or] as VCI <L>, and EVA [a] as VCI <a> but EVA [al] as VCI <A>, etc. This makes it very convenient to treat the two glyphs/sequences as either identical or distinct, simply by toggling case-sensitive vs. not case-sensitive, as I have already aimed to do with EVA [p] = VCI <P> and EVA [f] = VCI <T>, etc. 
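
(A minimal sketch of this case-toggling idea in Python, using only the glyph pairs named in this post; the full VCI tables have many more entries, and the lowercase partners of <P> and <T> are omitted here.)

Code:
import re

# Fragment of a VCI-style mapping, limited to pairs named in the post.
# Capitals mark the sequence or top-line variants; lowercase the plain glyphs.
VCI_MAP = {
    "or": "L",
    "r":  "l",
    "al": "A",
    "a":  "a",
    "p":  "P",
    "f":  "T",
}

# Try longer EVA keys first, so "or" matches before bare "r".
_pattern = re.compile("|".join(sorted(VCI_MAP, key=len, reverse=True)))

def to_vci(eva, case_sensitive=True):
    """Convert EVA to this VCI fragment; unmapped glyphs pass through.
    With case_sensitive=False, each paired reading collapses onto its
    lowercase partner, treating the two as identical."""
    out = _pattern.sub(lambda m: VCI_MAP[m.group()], eva)
    return out if case_sensitive else out.lower()

print(to_vci("oral"))                        # -> "LA"
print(to_vci("oral", case_sensitive=False))  # -> "la"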

Of course others are welcome to propose their own transcriptions based on Koen's or others' verbose cipher analysis. Whether it's VCI or another version, I think it provides an interesting, different way to look at and think about the ms text.