The Voynich Ninja

About word length distribution
Hi,

VM word length famously follows a shifted binomial distribution, f(n, 9, 0.5) + 1, i.e. the number of heads you get when you toss a coin 9 times, plus 1.
I suspect that whatever encoding the scribes used for the text, the word length is an independent variable and was produced with dice.
The method is pretty simple. Throw 3 dice, sum the pips, divide by two and round up. Use the result as the word length and you get basically the same distribution as the VM has.
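
As a rough check of the claim above, the dice method can be enumerated exactly over all 216 outcomes of three six-sided dice and set beside the shifted binomial. This is only a minimal sketch: the function names are made up for illustration, and it compares the two theoretical distributions with each other, not with actual manuscript counts.

```python
from fractions import Fraction
from itertools import product
from math import comb


def dice_length_distribution():
    """Exact distribution of ceil(sum of 3 six-sided dice / 2)."""
    counts = {}
    for roll in product(range(1, 7), repeat=3):
        length = (sum(roll) + 1) // 2  # integer form of "divide by two, round up"
        counts[length] = counts.get(length, 0) + 1
    return {n: Fraction(c, 6 ** 3) for n, c in sorted(counts.items())}


def shifted_binomial_distribution():
    """Exact distribution of (heads in 9 fair coin tosses) + 1."""
    return {k + 1: Fraction(comb(9, k), 2 ** 9) for k in range(10)}


dice = dice_length_distribution()
binom = shifted_binomial_distribution()

# Print both probabilities side by side for each word length.
for n in range(1, 11):
    print(n, round(float(dice.get(n, 0)), 4), round(float(binom[n]), 4))
```

With exact fractions, both distributions come out with mean 5.5 and variance 2.25, though the individual probabilities differ slightly.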

Dice were well known at the time, and this is an easy and accessible method to get a random number for the word length while maintaining the desired average of 5 to 6 glyphs.

Pictured: my dice.
Those are some cool dice!

My question is: why would they go through this effort?
(04-05-2025, 09:36 PM)tikonen Wrote: The method is pretty simple. Throw 3 dice, sum the pips, divide by two and round up. Use the result as the word length and you get basically the same distribution as the VM has.

Not exactly, but close enough I guess.
I could see a function for it if, say, part of the deception was word length, so that:

"Icouldseeafunctionforitifsaypartofthedeceptionwaswordlengthso;" was then ciphered and split up randomly.

But at the same time, I think the original post is stating the average word length is 5.5, in simple terms?
Academic and technical modern English documents are around this, so I'm not sure what would make VM word length more likely to have been decided via dice (OP?). Though fairly rare, there are a fair few single glyphs dotted about, whereas the suggested system has a minimum of 2 (the lowest possible roll is 3 x 1 = 3, and 3 / 2 = 1.5 rounds up to 2), so it would not account for single glyphs.
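
The arithmetic behind that minimum can be written out as a tiny sketch. The helper name `length_range` is hypothetical, and it assumes ordinary six-sided dice numbered 1 to 6.

```python
from math import ceil


def length_range(num_dice=3, sides=6):
    """Smallest and largest word length from ceil(sum of dice / 2)."""
    lowest_sum = num_dice * 1       # every die shows 1
    highest_sum = num_dice * sides  # every die shows its top face
    return ceil(lowest_sum / 2), ceil(highest_sum / 2)


print(length_range())  # (2, 9)
```

So the dice method as described can never produce a one-glyph word, whereas the shifted binomial f(n, 9, 0.5) + 1 assigns some probability to lengths 1 and 10 as well.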
Hi tikonen,

It would be interesting to know to what extent the distribution of VM word lengths follows the binomial distribution, and how this compares with distributions in known languages.

Do you have more data on this subject?

If VM clearly stands out on this point, whether it's the use of dice or anything else, it does indeed tell us something about the production of the text.
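
One possible way to put a number on "to what extent" is to compare an observed word-length histogram with the shifted binomial using total variation distance (0 means identical, 1 means no overlap). The sketch below assumes a plain list of word strings as input. The function names are illustrative, and `sample_words` is placeholder data, not a real transliteration.

```python
from collections import Counter
from math import comb


def shifted_binomial(n_trials=9, p=0.5):
    """P(length = k + 1), where k is the number of successes in n_trials."""
    return {k + 1: comb(n_trials, k) * p ** k * (1 - p) ** (n_trials - k)
            for k in range(n_trials + 1)}


def length_distribution(words):
    """Relative frequency of each word length in a list of words."""
    if not words:
        raise ValueError("need at least one word")
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {n: c / total for n, c in counts.items()}


def total_variation(p, q):
    """Half the sum of absolute differences between two distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Placeholder tokens only, NOT real manuscript or language data.
sample_words = ["abcde", "abcdef", "abcd", "abcdefg", "abc", "abcdef", "abcde"]
tvd = total_variation(length_distribution(sample_words), shifted_binomial())
print(round(tvd, 3))
```

Running the same function over word lists from known languages would give comparable figures, though the result depends on how words and characters are counted in each source.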
There's just no way they made the entire thing by rolling dice for every word. It does match the binomial distribution quite well, but you can arrive at that distribution through deterministic (non-random) encodings; see René Zandbergen's trigram encoding system.
This seems very unlikely. If you examine the writing in the manuscript you will notice that there is a fluency to it. There doesn't appear to be much stop-start in the text. The words in each line broadly keep to the same baseline. The writer isn't putting the pen down after each word to roll dice (nor, for that matter, to consult any book cipher or perform any mathematical computation to determine what the next word should be). In my opinion the writing of each page was done in one sitting, in one uninterrupted rush of effort.
Why should you roll the dice at all?
You would still have to write the words down, as they are repeated over and over again.
Then you could also take a text and simply subtract a letter.
Rolling the dice is not an option.
(05-05-2025, 08:06 AM)dashstofsk Wrote: This seems very unlikely. If you examine the writing in the manuscript you will notice that there is a fluency to it. There doesn't appear to be much stop-start in the text. The words in each line broadly keep to the same baseline. The writer isn't putting the pen down after each word to roll dice (nor, for that matter, to consult any book cipher or perform any mathematical computation to determine what the next word should be). In my opinion the writing of each page was done in one sitting, in one uninterrupted rush of effort.

I'm not sure how you determined that. 

How about an experiment? I left my quills behind when I moved, but when I get a new one, I will create a few inscriptions: some of them in one go, some with a pause after each word, and some even writing the words non-sequentially. My expectation is that no-one will be able to tell them apart after they dry out. I'm a complete amateur with only a little experience with quills, but as far as my experience goes, writing with a quill causes somewhat unpredictable ink density. This depends on the writing style and on how often and how deep you dip the quill into the ink, which in turn depends on the viscosity of the ink, the quality and size of the quill, and one's technique. All other possible effects can be masked by this basic irregularity.

As for keeping the baseline, I think this is a basic skill of a professional scribe. And, frankly, the baselines in the Voynich MS do get wobbly quite often. On the other hand, the writing itself is so tiny, this is not something unexpected. I did an experiment a few weeks ago, when I tried writing Voynichese to scale (just with a pencil). I really struggled initially, because most strokes are 1-2 mm long. I think the majority of people studying the manuscript imagine the writing about 2-3 times larger than it actually is.
VM word length famously follows binomial distribution of f(n, 9, 0.5) + 1

The details of the binomial distribution of course depend on the transliteration system. We don't know what makes a Voynich character (EVA:a, iin, aiin?). They also depend on how the many uncertain spaces are treated.
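
To illustrate how much the counting convention matters, here is a minimal sketch that measures the length of an EVA-style string under different assumptions about which letter sequences form a single character. The function name and the candidate units passed to it are assumptions for illustration only, not a proposed glyph inventory.

```python
def glyph_length(word, units):
    """Count glyphs in an EVA-style string.

    Each non-empty entry of `units` is treated as one glyph (greedy,
    longest match first); any other character counts as one glyph.
    """
    ordered = sorted((u for u in units if u), key=len, reverse=True)
    count, i = 0, 0
    while i < len(word):
        for unit in ordered:
            if word.startswith(unit, i):
                i += len(unit)  # consume the whole multi-letter unit
                break
        else:
            i += 1              # no unit matched: consume one letter
        count += 1
    return count


word = "daiin"  # example word written in EVA
print(glyph_length(word, []))        # every EVA letter is a glyph: 5
print(glyph_length(word, ["iin"]))   # "iin" counted as one glyph: 3
print(glyph_length(word, ["aiin"]))  # "aiin" counted as one glyph: 2
```

The same word therefore lands in a different bin of the length histogram depending on the convention, which shifts the whole distribution.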

Stolfi observed that there are natural languages whose word length follows a binomial distribution.

Gaskell and Bowern found that spontaneous written gibberish can have a word length distribution with low skewness.
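
Low skewness means the length histogram is roughly symmetric around its mean, as a binomial with p = 0.5 is (its skewness is exactly zero). A minimal sketch of the population skewness of a list of word lengths follows; the function name is illustrative, and the two input lists are made-up numbers, not measurements from any text.

```python
def skewness(lengths):
    """Population skewness: third central moment / (variance ** 1.5)."""
    n = len(lengths)
    if n == 0:
        raise ValueError("need at least one length")
    mean = sum(lengths) / n
    m2 = sum((x - mean) ** 2 for x in lengths) / n  # variance
    m3 = sum((x - mean) ** 3 for x in lengths) / n  # third central moment
    if m2 == 0:
        raise ValueError("all lengths are equal, skewness is undefined")
    return m3 / m2 ** 1.5


# Made-up lengths for illustration only.
symmetric = [4, 5, 5, 6, 6, 7]
long_tailed = [2, 2, 3, 3, 4, 9]
print(skewness(symmetric))
print(skewness(long_tailed))
```

A symmetric list gives a value at or near zero, while a list with a long tail of long words gives a positive value.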