The Voynich Ninja

Full Version: glyph [d] as a substitute for [p] and [f]
(05-04-2019, 04:31 PM)geoffreycaveney Wrote: However, I still believe [d] may be a substitute for [p] and [f]. I believe the explanation may lie in the author's use of alliteration and assonance. The sounds that these letters represent may simply tend to appear very frequently clustered together on the same lines, as an expression of this alliteration and assonance. In this case, [d] may be a substitute for [p] and [f], and still [d] may appear just as frequently or even perhaps more frequently on lines with [p] or [f] than on lines without [p] or [f].

There is no theory that cannot be saved by an additional ad hoc hypothesis. I see Voynich research as a battle against confirmation bias - my own - because it is very easy to get stuck in a cognitive trap. I have my own ideas on what the encryption may be, and now that I have a credible cipher that generates good-looking Voynichese without a dictionary, I try to look for a refutation, not a confirmation, because I'd rather not spend years working on it if I can reject the idea in two weeks. This is why falsifiability is a good thing to have for any theory.

A theory that makes no quantitative prediction that can be checked, and that can account for almost anything, including randomly generated pseudo-Voynichese, necessarily has a very low probability. It may be worth investigating anyway, but one should be aware of the low value of success, as evidence, in this case. I don't remember how to calculate it with Bayes' formula and of course it is impossible to calculate anything when the test includes subjectivity, but you get the idea - I hope. :)
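As a rough illustration of the Bayesian point above, here is a minimal sketch in Python. The numbers are invented purely for illustration and `posterior` is a hypothetical helper; nothing here is computed from the manuscript. It only shows that when a theory would report "success" about as readily on random pseudo-Voynichese as on the real text, observing a success barely changes its probability.

```python
def posterior(prior, p_success_if_true, p_success_if_false):
    """Bayes' formula: probability of the theory given an observed success."""
    evidence = prior * p_success_if_true + (1.0 - prior) * p_success_if_false
    return prior * p_success_if_true / evidence


# Hypothetical numbers, chosen only to illustrate the argument.
prior = 0.01

# A flexible theory "succeeds" almost as often when it is false.
flexible = posterior(prior, 0.95, 0.90)

# A theory with a sharp, checkable prediction rarely succeeds when false.
sharp = posterior(prior, 0.95, 0.01)

print(f"flexible theory after a success: {flexible:.4f}")
print(f"sharp theory after a success:    {sharp:.4f}")
```

With these made-up inputs the flexible theory stays close to its prior, while the sharp one gains a lot from the same success.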
(05-04-2019, 05:53 PM)nablator Wrote:
(05-04-2019, 04:31 PM)geoffreycaveney Wrote: However, I still believe [d] may be a substitute for [p] and [f]. I believe the explanation may lie in the author's use of alliteration and assonance. The sounds that these letters represent may simply tend to appear very frequently clustered together on the same lines, as an expression of this alliteration and assonance. In this case, [d] may be a substitute for [p] and [f], and still [d] may appear just as frequently or even perhaps more frequently on lines with [p] or [f] than on lines without [p] or [f].

There is no theory that cannot be saved by an additional ad hoc hypothesis. I see Voynich research as a battle against confirmation bias - my own - because it is very easy to get stuck in a cognitive trap. I have my own ideas on what the encryption may be, and now that I have a good cipher that generates good-looking Voynichese without a dictionary, I try to look for a refutation, not a confirmation, because I'd rather not spend years working on it if I can reject the idea in two weeks. This is why falsifiability is a good thing to have for any theory.

A theory that makes no quantitative prediction that can be checked, and that can account for almost anything, including randomly generated pseudo-Voynichese, necessarily has a very low probability. It may be worth investigating anyway, but one should be aware of the low value of success, as evidence, in this case. I don't remember how to calculate it with Bayes' formula and of course it is impossible to calculate anything when the test includes subjectivity, but you get the idea - I hope. :)

I get the idea and I understand your point. But a philological theory by its very nature tends to be qualitative, not quantitative. And yes, such a theory is very difficult to evaluate by standard simple quantitative tests and measures. Philology is very different from the hard data-based sciences. Scholars like Chomsky attempted for many decades to apply more strictly quantitative "rules-based" programmatic theories such as "generative grammar" to explain the construction of grammatical language in general, but I don't think they succeeded. Human language has historically defied all attempts at complete quantitative "rules-based" programmatic categorization and explanation.

In the case of this thread, I think the process has still been quite useful: I have been required to add an additional hypothesis about the nature of the text.

Regarding the "additional ad hoc hypothesis": I think it is plausible to suggest that the author composed the MS by making a series of ad hoc decisions. This is normal in the composition of written literary texts. So I think it is also plausible to suggest that we may need to make a series of ad hoc deductions in order to decipher the MS text and figure out how the author composed it.

Here is an example of the qualitative evidence I have in mind: the distinctly different functioning of [d] in one very frequent word, [chedy], and in another very frequent word of a very different form, [daiin]. Furthermore we have the very strange fact of the extreme frequency of [chedy] in Currier B vs. its total absence in Currier A. This suggests to me qualitatively that [d] may have a very different function in these two very frequent but distinctly different words. I believe this qualitative evidence suggests that in [chedy], the [d] is almost optional: the author chose to include it in the Currier B portions of the text, but the author chose to omit it in the Currier A portions of the text. It seems highly plausible to me to suggest that in this context, [d] may well represent a vowel before a final consonant in a grammatical suffix. It all looks very logical for example if [y] is "s": in Currier A the author did not bother to write the vowel before this "s", but in Currier B the author did include this vowel as [d].

In [daiin], however, [d] does not make so much sense as a vowel in this place in such a frequent word. Here it is much more plausible to treat [d] as a consonant.

This distinction between the apparent plausible functioning of [d] in [chedy], in particular its apparently "optional" nature in the word, vs. the apparent plausible functioning of [d] in [daiin], was in fact my original motivation for the hypothesis about the nature of [d] as sometimes a vowel and sometimes a consonant.

"u/v" is a very natural candidate for such a letter. This is language-neutral, but still very natural on a qualitative philological linguistic basis.

Then one considers more language-specific philological and phonological linguistic information, such as the alternation of "v" and "b" in many languages. All of this is qualitative, but that does not mean that it is worth less as evidence than purely quantitative measures.

Returning to [daiin], its frequency is much more plausible if [d] can sometimes represent a series of consonants, rather than if [d] is restricted to only represent "u/v".

I am not aware of any other entirely satisfactory explanation of [d] in these two frequent words: optional in penultimate position in [chedy], but prominent in initial position in [daiin]. If you or anyone else has a better or equally satisfactory explanation of [d] in these two words, I would love to hear it. But I want to evaluate it on qualitative plausible linguistic grounds, not merely on quantitative statistical text analysis.

Geoffrey
I just posted this in my Judaeo-Greek theory thread, but in fact this particular set of evidence is language-neutral and theory-neutral: it is purely about the substitution of the glyph [d] for the glyphs [p] or [f]. Thus I re-post it in this thread as well:

See (link unavailable) for this post in the context of the other thread. Also, the previous post there, with statistics about each significant Voynich character in medial position after, but not immediately after, a gallows glyph in the same word, provides very relevant context for the evidence presented here: (link unavailable).

=======

Following up my observation in the previous post about the extreme frequency of medial occurrences of [d] after, but not immediately after, a gallows character in the same word, and the rarity of such occurrences of [p] and [f], I have now researched and can present here much more substantial specific evidence in support of the hypothesis that [d] is a substitute for [p] or [f] in such positions:

I have researched each of the 86 such occurrences of [p] and [f] in the MS, in medial position and somewhere after another gallows character in the same word. In a substantial number of cases, which I present below, such words are apparent doublets of the exact same word with [d] in place of [p] or [f], and the form with [d] is usually the more frequent one.

There are three categories of such words: 
  • [d] as a substitute for [p] or [f]
  • [d] as a substitute for [ep] or [ef]
  • [d] as a substitute for [cph] or [cfh]
Many of these words are quite long, and it is striking that they appear in both the form with [p] / [f], and in the form with [d].
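To make the three categories concrete, here is a minimal sketch of how such doublets could be searched for mechanically. It is only an illustration under stated assumptions, not the procedure actually used for this post: `d_variants` and `find_doublets` are hypothetical names, words are plain EVA strings, and a "gallows character" is taken to be any of the letters k, t, p, f (so bench gallows such as [ckh] count as well). The counts 50 and 25 are the ones quoted in this post; the counts of 1 for the [p]/[f] forms are placeholders.

```python
import re
from collections import Counter

GALLOWS = "ktpf"

# The three substitution categories listed above; all are tried at every position.
PATTERNS = [r"c[pf]h", r"e[pf]", r"[pf]"]


def d_variants(word):
    """Return the words obtained by replacing one [p]/[f] group with [d].

    Only groups that come somewhere after another gallows character
    in the same word are considered.
    """
    variants = set()
    for pattern in PATTERNS:
        for m in re.finditer(pattern, word):
            if any(ch in GALLOWS for ch in word[:m.start()]):
                variants.add(word[:m.start()] + "d" + word[m.end():])
    return variants


def find_doublets(counts):
    """List (p/f form, d form, count of d form) for variants that are attested."""
    found = []
    for word in counts:
        for variant in d_variants(word):
            if variant in counts:
                found.append((word, variant, counts[variant]))
    return sorted(found)


# Toy word counts: 50 and 25 are quoted from the post, the 1s are placeholders.
counts = Counter({"opchepy": 1, "opchedy": 50, "okcheefy": 1, "okchedy": 25})
for pf_form, d_form, n in find_doublets(counts):
    print(f"[{pf_form}] -> [{d_form}] occurs {n} times")
```

A check like this only finds candidate pairs; whether a pair is a meaningful doublet or a coincidence still has to be judged separately.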

The most striking example of all actually has an extraneous [c] after the [f] in that form of the word, but this does not detract from the example:

[qokeefcy] : (link unavailable) .
[qokeedy] occurs 305 times

This is not an isolated example. In fact, there are almost two dozen more examples of such words, with a more precise substitution of [d] for [p]/[f], [ep]/[ef], or [cph]/[cfh]. (In two cases only there is [cfhh] for [cfh], or [cphh] for [cph].) Many of the words in question are quite long, and unlikely to occur in such pairs by coincidence or chance. 

Here are the numerous examples of this substitution phenomenon in Voynich MS words:

[d] for [p] or [f] :

[opchepy] : (link unavailable) .
[opchedy] occurs 50 times, and 35 more times as part of a longer word

[oteofy] : (link unavailable) 1
[oteody] occurs 39 times, and 14 more times as part of a longer word

[topaiin] : (link unavailable) .
[todaiin] occurs 9 times, and 11 more times as part of a longer word

[qokopy] : (link unavailable) .
[qokody] occurs 9 times

[kolpy] : (link unavailable) .
[koldy] occurs 7 times, and 16 more times as part of a longer word

[shckhefy] : (link unavailable) .
[shckhedy] occurs 6 times

[chepchefy] : (link unavailable) .
[chepchedy] occurs 2 times

[pchofar] : (link unavailable) .
[pchodar] occurs 2 times

[cphhofy] : (link unavailable) .
[cphody] occurs 2 times, and 1 more time as part of a longer word

[ykofar] : (link unavailable) .
[ykodar] : (link unavailable) 1

=======

[d] for [ep] or [ef] :

[okcheefy] : (link unavailable) .
[okchedy] occurs 25 times, and 42 more times as part of a longer word

[qokeoefy] : (link unavailable) 2
[qokeody] occurs 32 times

[qofcheepy] : (link unavailable) .
[qofchedy] occurs 8 times

[cthoepain] : (link unavailable) 1
[cthodaiin] : (link unavailable) .

=======

[d] for [cph] or [cfh] :

[qopchcfhy] : (link unavailable) .
[qopchdy] occurs 15 times

[fchecfhy] : (link unavailable) .
[fchedy] occurs 11 times, and 28 more times as part of a longer word

[fchcfhy] : (link unavailable) 5
[fchdy] occurs 4 times, and 17 more times as part of a longer word

[pcheocphy] : (link unavailable) .
[pcheody] occurs 7 times, and 5 more times as part of a longer word

[ckhcfhhy] : (link unavailable) .
[ckhdy] occurs 4 times, and 24 more times as part of a longer word

[ykocfhy] : (link unavailable) 1
[ykody] occurs 2 times

[pchecfhey] : (link unavailable) 6
[pchedey] : (link unavailable) .


=======

Geoffrey
Bless you for citing cross references!
I wish this were standard operating procedure
This is of great interest.

There is really no doubt that the text of the Voynich MS was generated according to some more or less well-defined system. The character set used is the same from beginning to end, and the statistics are similar from beginning to end. The variation between Currier-A and Currier-B is significant, but at a lower level.

It may or may not be a cipher, but the analysis of the text benefits from methods used to crack ciphers.
One of the fundamental principles is that anything that 'stands out' is a potential weakness of the cipher. It is a possible way to better understand the system. To break into it. The fact that the characters Eva-f and Eva-p tend to occur on the first lines of paragraphs is one of these things that stand out. However, this is still not properly understood.

If it were true that:

"Eva f and p on first lines of paragraphs, when used after another gallows character in the same word, are equivalent with Eva-d on other lines"

then we would really have a small crack into the system.

For me, this was very much worth a closer look. This is not yet completed, but I can show a few numbers already. The following is based on the ZL transcription file, (link unavailable).

As already mentioned elsewhere, word spaces in the MS are often uncertain, and in the ZL transcription, uncertain spaces are indicated. One has the choice to treat uncertain spaces either as if they are real spaces, or as if they aren't. These two choices lead to different statistics. For example:

There are 35,901 words in the MS if one discards uncertain spaces ...
and there are 38,631 words if one takes them as real spaces.

Among these, there are 2107 words that include at least one Eva-f or Eva-p if one discards uncertain spaces, and there are 2127 if one takes them as real spaces. That is roughly 5.9% and 5.5% of all words, respectively.
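The two ways of treating uncertain spaces can be sketched as follows. This is a minimal illustration that assumes "." marks a certain word space and "," an uncertain one in the transliterated text; `split_words` is a hypothetical helper and the example line is made up. A real transliteration file also contains locus tags and comments, which this sketch does not handle.

```python
import re


def split_words(line, uncertain_as_space):
    """Split one transliterated line into words.

    Assumes "." is a certain word space and "," an uncertain one.
    """
    if uncertain_as_space:
        # Both separators end a word.
        parts = re.split(r"[.,]", line)
    else:
        # Uncertain spaces are dropped, so the neighbouring words are joined.
        parts = line.replace(",", "").split(".")
    return [w for w in parts if w]


line = "qokeedy,dal.shedy.qokain"  # made-up example line
print(split_words(line, uncertain_as_space=True))
print(split_words(line, uncertain_as_space=False))
```

Every uncertain space that is dropped merges two words into one longer word, which is why both the total word count and the counts of words with particular glyphs differ between the two choices.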

The number of words that have at least one of Eva: f, p and in addition at least one of Eva: k, t, f, p is very much smaller: 191 if one discards uncertain spaces and 151 if one takes them as word spaces.

The thesis to be tested therefore concerns a rather small set of words in the MS.

Among the words that have at least one "f/p" and at least one more "f/p/k/t", the case that f/p is the second or later instance is again a subset.

Looking at all of them, I am afraid that I cannot confirm the hypothesis that this second 'f/p' is equivalent with Eva-d.

I can post details later.
Hello Geoffrey,

I see two problems with the equivalence of EVA-p/f and d, and I am not convinced.

First, the successful replacement of one or several EVA-p/f with d (or the opposite) is not necessarily evidence that they are the same word spelled differently, any more than the French words pain, pin, fin, faim, daim are the same. (The final ain/aim/in sound is the same but the words have nothing else in common.) One must be careful about "tunnel vision" (optimistically interpreting evidence as a success and failing to see other possibilities) that goes hand in hand with confirmation bias. We all fall for it, I am not immune and I am not blaming you.

Looking at the big picture (I imagine it is what RenéZ has done) replacing a glyph (or group of glyphs) by another and getting a valid Voynichese word is a more general phenomenon than just EVA-p/f = d, one that is so widespread that it does seem highly significant. For quantitative assessments, here are counts of how many times a replacement in the set of words from main text paragraphs, of glyph(s) (in the first line of the table), one-by-one or all together, by other glyph(s) (in the first column) produces a different valid Voynichese word. (This was done with my transcription.)

[attachment=2879]

With the TT transcription:

[attachment=2880]

As you can see in the p/f/d lines/columns there are many possibilities. No reason to cherry-pick p/f for d over other candidates, even if the hypothesis of the equivalence of some glyphs is right, which I doubt very much.
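The kind of count behind these tables can be sketched roughly as below. This is one possible reading of the description, not the code that produced the attachments: `count_substitutions` is a hypothetical name, each successful replacement (one occurrence at a time, or all occurrences together) is counted once per distinct resulting word, and the vocabulary is a made-up toy list rather than real word types.

```python
def count_substitutions(word_types, old, new):
    """Count replacements of `old` by `new` that turn one word type into a
    different word type that also exists in the vocabulary."""
    vocab = set(word_types)
    hits = 0
    for word in vocab:
        results = set()
        # Replace each occurrence of `old` on its own ...
        start = word.find(old)
        while start != -1:
            results.add(word[:start] + new + word[start + len(old):])
            start = word.find(old, start + 1)
        # ... and all occurrences together.
        results.add(word.replace(old, new))
        results.discard(word)
        hits += len(results & vocab)
    return hits


# Made-up toy vocabulary, for illustration only.
toy = ["chedy", "chepy", "shedy", "daiin", "paiin", "okaiin"]
for old in "pfd":
    for new in "pfd":
        if old != new:
            print(old, "->", new, count_substitutions(toy, old, new))
```

Running the same count over many glyph pairs gives a baseline: a pair such as p/d is only remarkable if its count stands out against the other pairs.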

In my opinion the set of common glyphs is already quite small: there are probably no homophones at all. I suspect something more devious than homophones and nulls is at play, allowing the substitution of some glyphs and combinations of glyphs only in some contexts, not generally.

EDIT: oops, wrong order in the first line, it's cKh, cTh, cPh, cFh.
Rene and nablator, thank you very much for the data analysis that you present here. I appreciate it very much.

Out of curiosity, let me raise a different question: Just today, I was looking at a set of data that pointed in a somewhat different direction from my own current hypothesis, but I found it interesting nonetheless. Of course I do not want to ignore or reject data that fails to confirm my hypothesis, nor do I want to work only with particular pieces of data that happen to support my hypothesis. I am happy to make revisions and alterations to my hypothesis, if that is the direction that the data point toward.

I was studying word types that begin with [fch]. I only considered word types with no other [p] or [f] in them, so that we may consider only one possible substitution rather than dealing with the complication of another possible substitution in the same word. Also, I did not include words that begin a paragraph, because a paragraph-initial [f] may be merely a pilcrow, not an actual letter representing a sound.

I find 30 such word types with initial [fch]. Now I did find 19 of them with doublets with initial [dch], which I do consider a fairly high number, considering that [dch] is not a very common sequence as a proportion of all occurrences of [d] in the MS. I found more doublets of initial [fch] with [dch], than I did of initial [fch] with [kch] or [lch]. Since in general [kch] and [lch] are more frequent than [dch], this seems significant to me.

However, I must admit I found an even better match too: 26 of the 30 word types with initial [fch], have a doublet with initial [sh]. So it seems to me that this is another hypothesis worth investigating further, although it does not confirm my own theory in its current form. Of course [sh] is a complicated character, since it has multiple forms (the shapes of its top loop) that may or may not represent the same or different characters, letters, or sounds. This complication makes the analysis of data with the character [sh] particularly difficult, it seems to me.
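The comparison of initial [fch] with [dch], [sh], [kch] and [lch] can be sketched like this. It is a minimal illustration with a made-up vocabulary, not the real word types; `prefix_doublets` is a hypothetical helper, and the exclusion of paragraph-initial words is assumed to have been done beforehand, since it needs position information that a plain word list does not carry.

```python
def prefix_doublets(word_types, prefix, alternatives):
    """For each alternative prefix, count how many word types starting with
    `prefix` also exist with that alternative prefix instead."""
    vocab = set(word_types)
    # Keep only word types with no further [p] or [f] after the prefix.
    base = [
        w for w in vocab
        if w.startswith(prefix) and not any(c in "pf" for c in w[len(prefix):])
    ]
    result = {}
    for alt in alternatives:
        result[alt] = sum(1 for w in base if alt + w[len(prefix):] in vocab)
    return len(base), result


# Made-up toy vocabulary, for illustration only.
toy = ["fchedy", "fchey", "fchol", "dchedy", "dchey",
       "shedy", "shey", "shol", "kchedy"]
total, matches = prefix_doublets(toy, "fch", ["dch", "sh", "kch", "lch"])
print(total, matches)
```

To judge whether a count such as 19 or 26 out of 30 is significant, the raw match counts would still need to be compared with how common each alternative prefix is overall.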

Geoffrey
So I looked in more detail. First of all I should make a correction to my previous post. The numbers 191 and 151 I presented are for the words that have at least one of 'f/p' plus at least one of 'k/t'.

To take it from there, it was a manual effort. Perhaps someone knows a Unix command to 'grep' for words that have at least one 'f/p' plus at least one 'f/p/k/t' (awk??) but I suspect it requires Perl, which I have mostly forgotten.
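For the filtering question above, a single regular expression seems to be enough. The following is a minimal sketch in Python, assuming one EVA word per string and counting every letter f, p, k, t as a gallows (so bench gallows such as [cfh] are included); `has_fp_plus_gallows` is a hypothetical name.

```python
import re

# Either an f/p followed later by any gallows, or a k/t followed later by an f/p.
FP_PLUS_GALLOWS = re.compile(r"[fp].*[fpkt]|[kt].*[fp]")


def has_fp_plus_gallows(word):
    """True if the word has at least one f/p plus at least one more f/p/k/t."""
    return FP_PLUS_GALLOWS.search(word) is not None


words = ["qopchcfhy", "okcheefy", "tdokchcfhy", "opchedy", "qokeedy", "daiin"]
print([w for w in words if has_fp_plus_gallows(w)])
```

The same pattern should also work with `grep -E '[fp].*[fpkt]|[kt].*[fp]'` on a file with one word per line, since it uses only basic alternation and character classes.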

Anyway, looking at the case where comma spaces are interpreted as word spaces, I find 179 entries for:
at least one 'f/p' plus at least one 'f/k/p/t'.
These can be split into four groups:
1. f/p occurs before k/t.    (There are 90 of them)
2. f/p occurs before f/p.    (There are 28 of them)
3. k/t occurs before f/p.    (There are 58 of them)
4. There are 3 or more gallows  (There are 3 of them).
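The four groups can be assigned mechanically from the order of the gallows letters in a word. Here is a minimal sketch under the same assumption as before, namely that every letter f, p, k, t counts as one gallows; `classify` is a hypothetical helper, and it returns None for words that do not qualify at all.

```python
def classify(word):
    """Return the group number 1-4 from the list above, or None."""
    gallows = [c for c in word if c in "fpkt"]
    if len(gallows) < 2 or not any(c in "fp" for c in gallows):
        return None  # fewer than two gallows, or no f/p among them
    if len(gallows) >= 3:
        return 4     # three or more gallows
    first, second = gallows
    if first in "fp" and second in "kt":
        return 1     # f/p occurs before k/t
    if first in "fp":
        return 2     # f/p occurs before f/p
    return 3         # k/t occurs before f/p


for word in ["tdokchcfhy", "potchokor", "psholpchcfhdy", "ypchedpy", "tshodpy"]:
    print(word, classify(word))
```

Groups 2 and 3 are the ones where an f/p comes after another gallows, which is the set of words relevant to the [d] hypothesis.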

Just for fun, here are the three words in the fourth category:
  tdokchcfhy (f52r)
  potchokor  (f76r1)
  psholpchcfhdy (f84r)
All three are at a paragraph start, so the first character might not be a text character.

In general it can be said that all these words are rather atypical Voynich words, even though they usually have some recognisable pattern elements.

The words analysed by Geoffrey are types 2 and 3, and rather surprisingly I arrive at exactly the same number: 86 of them.

However, the vast majority of them do not lead to a recognisable Voynich word if the f/p or cfh/cph are replaced by a d.

Just to cite a few examples:
ypchedpy (f41r)  would result in two consecutive 'd's. This word is a case where the 'p's could be nulls.
tshodpy (f44r)
(2 more words with dp)
Two consecutive d's is such an unusual sight. But to my surprise voynichese.com gives 23 instances of (link unavailable) of which 13 (link unavailable). For example choddy and koddy.
However, if out of your 86 words four result in double d, this still seems suspicious.
These double d's are just a fairly clear example. In the original post, out of 86 words, 21 are shown, and these 21 already require 3 different rules.