Could it be that most schemes of digitization are not creating a meaningful way to study consonants and vowels in Voynichese? I attached what seems to me the best way to arrange the symbols in a concise manner to identify what the building blocks of the script are. Can you see where the Table leads? My idea is that the italic 'I' strokes that appear in most of the characters are a way to count the n-th vowel for a basic consonant that is carefully chosen for the glyph. What are your thoughts on this?
Hi,
the low character entropy of Voynichese contradicts the expansion approach that you propose for decoding the frequent glyphs y, l and d. For instance, dy occurs in ~18% of tokens; there is no language in which "sama" or any other quadgram occurs with such a high frequency.
See also:
Rene Zandbergen Wrote:Compression of a string of characters means an increase in entropy, while the inverted process (de-compression of the compressed file) a decrease in entropy. Thus, a process converting a plain text with higher entropy to the Voynich MS text is equivalent with some kind of expansion. Now, replacing Voynich MS characters with further expansion will not increase the entropy, but rather reduce it (8). Because of what has just been described here, the Voynich MS text has occasionally been called (or compared with) a verbose cipher (9).
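To make the entropy argument concrete, here is a minimal Python sketch. It assumes that "character entropy" means the conditional (second-order) entropy h2, i.e. how uncertain the next character is once the previous one is known. The `expand` function is a made-up toy verbose cipher, and the sample sentence is ordinary English, not Voynich data; none of this is taken from the quoted page.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def conditional_entropy(symbols):
    """h2 = H(previous, next) - H(previous), estimated from adjacent pairs."""
    pairs = list(zip(symbols, symbols[1:]))
    return entropy(Counter(pairs)) - entropy(Counter(prev for prev, _ in pairs))


def expand(symbols):
    """Toy verbose cipher (hypothetical): each symbol s becomes the pair s+'1', s+'2'."""
    return [s + half for s in symbols for half in ("1", "2")]


# Illustrative English sample, not Voynich data.
plain = list("the low character entropy of the text contradicts the expansion")
verbose = expand(plain)

h2_plain = conditional_entropy(plain)
h2_verbose = conditional_entropy(verbose)
print(f"h2 plain:   {h2_plain:.3f} bits")
print(f"h2 verbose: {h2_verbose:.3f} bits")
```

In this toy every second symbol is fully determined by the one before it, so h2 of the expanded text comes out at roughly half of the plain value. Note that the toy doubles the alphabet, so the plain single-character entropy goes up by one bit; the drop only shows in the conditional measure.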
(19-02-2023, 08:04 AM)MarcoP Wrote: For instance, dy occurs in ~18% of tokens; there is no language in which "sama" or any other quadgram occurs with such a high frequency.
With all due respect, my proposed scheme is not a way to render information; it is a scheme to render sound. What can you say about the repeating 'I' characters with regard to entropy? There are languages like Arabic where vowel sounds are mostly left unwritten, so you only see long consonant chains, and if you transliterated Arabic with only the consonants into the target language, you couldn't pronounce any of it.
And in case Voynichese is a conlang, which can't be dismissed easily, the rules of natural languages wouldn't apply, and the composer of the language would have had to use a subset of all that's possible in the envisioned language, which would favor some digrams, trigrams or quadgrams far more than statistically expected.
Just look at a recipe book, and statistically study how many times the word spoon appears. The word spoon would never appear with the same frequency in Wikipedia. The Voynich MS can't be considered for Big Data Analysis in my opinion. Rather, forensics would best suit it.
In the attached file, I studied the word-ending quadgrams according to how likely the combinations C1VC2Y are to appear. The nodes of the graph are the prevalent consonants C1 and C2. The links in the graph are labeled by the vowel V, with Y considered to be of potential grammatical use, so not necessarily part of a meaningful root of the language.
Any need for clarification concerning the attached graph?
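For readers without the attachment, a graph of this kind could be tallied roughly as follows. This is only a sketch under my own assumptions, not the poster's actual method: the toy words are invented, the vowel set is a placeholder, and each character is treated as one glyph or sound.

```python
from collections import Counter


def ending_graph(tokens, vowels, final="y"):
    """Count (C1, V, C2) triples over tokens whose last four symbols are C1 V C2 + final."""
    edges = Counter()
    for token in tokens:
        if len(token) < 4 or not token.endswith(final):
            continue
        c1, v, c2 = token[-4], token[-3], token[-2]
        # Keep only consonant-vowel-consonant endings under the assumed vowel set.
        if v in vowels and c1 not in vowels and c2 not in vowels:
            edges[(c1, v, c2)] += 1
    return edges


# Invented toy words and a placeholder vowel set, for illustration only.
toy_tokens = ["kotary", "tary", "sokiry", "mitaly", "tolar", "datary", "oaky"]
toy_vowels = set("aeiou")

edges = ending_graph(toy_tokens, toy_vowels)
for (c1, v, c2), count in edges.most_common():
    print(f"{c1} --{v}--> {c2}: {count}")
```

Each key is a link from node C1 to node C2 labeled with V, and the count is how often that ending occurs. Which glyphs count as consonants or vowels is exactly the open question, so the `vowels` argument has to be supplied by whoever runs it.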
(19-02-2023, 08:04 AM)MarcoP Wrote: Hi,
the low character entropy of Voynichese contradicts the expansion approach that you propose for decoding the frequent glyphs y, l and d. For instance, dy occurs in ~18% of tokens; there is no language in which "sama" or any other quadgram occurs with such a high frequency.
You come to conclusions with a sample size of three glyphs compared to the 35-ish potential string patterns that appear in FSG. Is it worthwhile to jump to conclusions on the value of the entropy without actually calculating it?
By the way, if my approach is right, it might shed light on potential errors in the Manuscript, so I wouldn't jump to conclusions like you did.
And more on the -sama ending: its high frequency of usage is mainly due to one quire, so it's potentially playing a very specific functional role and is not representative of Voynichese as a whole.
Hi Arichichi,
Transcription designs are usually aimed primarily at convenience of machine analysis; for a machine it makes virtually no difference which glyphs represent vowels or consonants. Next, there is the chicken & egg problem: how does one detect vowels and consonants without first running some kind of machine analysis? But then how does one run a machine analysis without first designing a transcription alphabet?
(22-02-2023, 06:18 PM)Arichichi Wrote: And more on the -sama ending: its high frequency of usage is mainly due to one quire, so it's potentially playing a very specific functional role and is not representative of Voynichese as a whole
This is not true. The balneological quire does go to town on dy, but we see high percentages of it elsewhere, particularly by Scribe 3 (under Lisa Fagin Davis's differentiation of the scribes). Any solution needs to explain this, and why it varies according to scribe and position in the paragraph.
Quote: You come to conclusions with a sample size of three glyphs compared to the 35-ish potential string patterns that appear in FSG. Is it worthwhile to jump to conclusions on the value of the entropy without actually calculating it?
By the way, if my approach is right, it might shed light on potential errors in the Manuscript, so I wouldn't jump to conclusions like you did.
Being defensive isn't going to make people keener to engage with your theory. Your theory needs to try to account for how predictable those three glyphs (and others) are, and account for the other distinctive Voynichese behaviours that have contradicted all attempts to assign sound values to glyphs. Otherwise, there's no meaningful difference between it and the hundred other theories.
[attachment=7212]
The point is to find an explanation.
Again.
If one hand were to write the text mainly in the second person, the number of endings used would inevitably increase. These remain constant.
If narratives were written in the first person, the picture would be completely different.
If I start from the frequency of endings in the VM, the 2 obverse endings are "xx-iin" and "xx-g" (xxtis + xxus/um).
Now the application of the PC programmes should shed more light on the picture.
Only endings should be used here, therefore single "8g" combinations should not be considered.
If a hand had many more of the endings, it can be assumed that he is writing in the second person. But this also gives an indication that it is not a thrown-together text, since he is already working according to a rule.
The dialect form may allow some leeway, but rules apply here too.
(23-02-2023, 02:22 AM)tavie Wrote: The balneological quire does go to town on dy but we see high percentages of it elsewhere, particularly by Scribe 3 (under Lisa Fagin Davis's differentiation of the scribes).
Scribe 3? Not true, unless your percentages are calculated differently than mine.
% of words? Or % of coverage of the page by the pattern dy?
(23-02-2023, 10:47 AM)nablator Wrote: (23-02-2023, 02:22 AM)tavie Wrote: The balneological quire does go to town on dy but we see high percentages of it elsewhere, particularly by Scribe 3 (under Lisa Fagin Davis's differentiation of the scribes).
Scribe 3? Not true, unless your percentages are calculated differently than mine.
% of words? Or % of coverage of the page by the pattern dy?
As a word final, rather than as a word in itself (which Scribe 3 does dislike).
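Since the disagreement here comes down to which percentage is being counted, a small Python sketch may help keep the two measures apart. The tokens below are invented EVA-style strings and the group labels are placeholders, not real scribe or quire data; `dy_shares` is a hypothetical helper name.

```python
def dy_shares(tokens_by_group, pattern="dy"):
    """Per group: (share of tokens equal to pattern, share of longer tokens ending in it)."""
    shares = {}
    for group, tokens in tokens_by_group.items():
        if not tokens:
            continue  # skip empty groups to avoid dividing by zero
        total = len(tokens)
        whole = sum(1 for t in tokens if t == pattern)
        final = sum(1 for t in tokens if t != pattern and t.endswith(pattern))
        shares[group] = (whole / total, final / total)
    return shares


# Invented tokens; the group labels are placeholders, not real scribe data.
sample = {
    "group A": ["dy", "qokedy", "dy", "daiin", "chedy", "ol", "dy", "shedy"],
    "group B": ["chedy", "qokeedy", "aiin", "shedy", "okedy", "ol", "lchedy", "dar"],
}

shares = dy_shares(sample)
for group, (whole, final) in shares.items():
    print(f"{group}: dy as a word {whole:.1%}, dy as a word ending {final:.1%}")
```

With a real transliteration, the dictionary would be keyed by scribe or quire, and the two columns would show directly whether a hand avoids dy as a standalone word while still using it freely as an ending.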