(20-05-2022, 02:22 PM)Bernd Wrote: If I read the graph correctly, that's a plus of over 20%. That's massive, and much more than in any other document in the text file you shared on the previous page.
This varies from text to text. In the normal corpus, I get anywhere from a 3% to a 16% increase.
It is true that the modified VM text has much bigger h2 gains than anything else. EVA gains something around 10% when removing spaces. As Marco would say (I think), it helps here to think in terms of bigrams. For normal EVA, novel bigrams will be created when spaces are removed. But this is somewhat limited because, for example, so many vords end in -n, so the combinatory potential is not unlimited. In my transformed file there is more potential for variety, because, for example, [in] and [iin] are now diversified.
For example, take "daiin dal" and "dain dal". In EVA, removing the space creates the same novel bigram, [nd], in both cases, and it will be frequent. But in a transformed file, the novel bigrams will be two different ones, [1d] and [2d].
Additionally, while I was replacing n-grams by single characters, I felt more and more like the potential for further gains was shifting towards the space. Since I preserved word boundaries, this prevented common glyphs at words' edges from being addressed. So removing spaces suddenly had a major impact.
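To make the bigram argument a bit more tangible, here is a minimal Python sketch. It assumes h2 means the conditional entropy of a character given the previous one (bigram entropy minus single-character entropy), which may not be exactly how the figures in this thread were computed. The function names and the mapping of [iin] and [in] to "1" and "2" are placeholders for illustration only.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def h1(text):
    """First-order (single character) entropy."""
    return entropy(Counter(text))


def h2(text):
    """Conditional entropy, assumed here to be H(bigrams) - H(first characters)."""
    bigrams = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return entropy(bigrams) - entropy(firsts)


def novel_bigrams(text):
    """Bigrams that only appear once the spaces are removed."""
    spaced = set(zip(text, text[1:]))
    joined_text = text.replace(" ", "")
    joined = set(zip(joined_text, joined_text[1:]))
    return joined - spaced


# Plain EVA: both word breaks collapse into the same new bigram [nd].
eva = "daiin dal dain dal"
# Hypothetical transformed text: "1" and "2" stand in for [iin] and [in].
transformed = "da1 dal da2 dal"

print(sorted(novel_bigrams(eva)))
print(sorted(novel_bigrams(transformed)))
```

On this toy input the EVA string gains two new bigrams ([nd] and [ld]), while the transformed string gains three ([1d], [2d] and [ld]). Running h2 on a full transcription with and without spaces would give the kind of percentage gain discussed above.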
Could it be done like this below?
A Psalm of David (fragment). The heavens declare the glory of God, and the sky above proclaims his handiwork. Day to day pours out speech, and night to night reveals knowledge.
Transformed:
The9 he9 avens dec lar e the 9g9 lory9 of9 God9, and9 9the9 sky above proc laim shis handi work. Day today pours ou tspe9 ech, an9 d night9 tonight9 reve 9als9 know ledge.
Ciphered:
shey chey afedl tem par e she ycfhy poriiny ocphy cfhoty adty yshey lkiin ackhofe cthrom paeeg lcheel chadtee sorm taiin qotaiin cthoeeerl oeee qlcthey emch ady t deeiiin qodeeiiin refe yaply mdos petcfhh
For the sake of comparison between [texts in] different languages, it would be methodologically correct to compare differences in h2/h1 figures (instead of those in absolute h2 figures), since different languages would have different h0, and h0 itself also changes slightly when you remove spaces.
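A rough sketch of that comparison in Python, under my own assumptions: h0 is taken as log2 of the number of distinct characters, h1 as single-character entropy, and h2 as conditional entropy (bigram entropy minus the entropy of the first characters). The names entropy_profile and ratio_shift are made up for this example.

```python
from collections import Counter
from math import log2


def _entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def entropy_profile(text):
    """Return h0, h1, h2 and the h2/h1 ratio for one text.

    Assumes the text has at least two distinct characters, so h1 > 0.
    """
    h0 = log2(len(set(text)))
    h1 = _entropy(Counter(text))
    h2 = _entropy(Counter(zip(text, text[1:]))) - _entropy(Counter(text[:-1]))
    return {"h0": h0, "h1": h1, "h2": h2, "h2/h1": h2 / h1}


def ratio_shift(text):
    """Change in h2/h1 when spaces are removed (positive means a gain)."""
    with_spaces = entropy_profile(text)
    without_spaces = entropy_profile(text.replace(" ", ""))
    return without_spaces["h2/h1"] - with_spaces["h2/h1"]


sample = "the heavens declare the glory of god and the sky above proclaims his handiwork"
print(entropy_profile(sample))
print(entropy_profile(sample.replace(" ", "")))
print(ratio_shift(sample))
```

The two printed profiles also show the h0 effect: dropping the space removes one symbol from the alphabet, so h0 goes down a little by itself. A single sentence is of course far too short for meaningful figures; the sample is only there to show the calls.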
(20-05-2022, 10:45 AM)Koen G Wrote: If we assume a verbose cipher and omit spaces as a variable, it is possible to reach normal entropy values for Voynichese.
My idea has been that there is "verbosity", not in the sense of using a verbose cipher known at the time, but rather as a side effect of the adopted procedure. I even worked out a fancy term, "not-so-verbose-cipher", for this kind, meaning that verbosity is not intended as such.
What I think is worth much attention is that spaces are oftentimes very carelessly placed, as if it did not matter whether a space is present in a given place or not. If spaces are just a side effect of a certain procedure, then by removing spaces you lose nothing. As I illustrated with one artificial but practical example in my January posting,
Quote:Thirdly, we suddenly obtain the freedom to be as careless with our spacings as the Voynich scribes are. Indeed, we know that sub-alphabet character sequences will always alternate, so wherever there is a switch from the second sub-alphabet to the first one, we know that this is the plaintext word boundary.
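As a toy illustration of the point in that quote (not the actual scheme from the January posting, whose details are not repeated here), the sketch below uses two invented sub-alphabets, FIRST and SECOND, and assumes each plaintext word is written with FIRST characters followed by SECOND characters. Under that assumption the spaces carry no information: boundaries can be recovered from the switches alone.

```python
# Hypothetical sub-alphabets, chosen only for this illustration.
FIRST = set("abc")
SECOND = set("xyz")


def recover_boundaries(ciphertext):
    """Ignore all spaces and split wherever SECOND is followed by FIRST."""
    stream = "".join(ciphertext.split())
    if not stream:
        return []
    words = [stream[0]]
    for prev, ch in zip(stream, stream[1:]):
        if prev in SECOND and ch in FIRST:
            words.append(ch)   # switch from second to first sub-alphabet: new word
        else:
            words[-1] += ch    # still inside the same word
    return words


# Carelessly spaced and cleanly spaced versions give the same result.
print(recover_boundaries("ab xyc a z"))
print(recover_boundaries("abxy caz"))
```

Both calls return the same two words, "abxy" and "caz", no matter where the spaces were placed in the input.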
But then there remains the obstacle of the labels... any idea of removing spaces (or adding more spaces), mine included, should somehow be reconciled with the existence of labels, and that is somewhat challenging.