Quote: Method is pretty simple. Throw 3 dice, sum the pips and divide by two, rounding up.
This method would never produce 1-letter words, and yet they exist in the VM.
The minimum total of 3 dice is 3; divided by 2 that is 1.5, which rounds up to 2.
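To make the arithmetic concrete, here is a minimal sketch that enumerates all 216 equally likely rolls of three fair six-sided dice and tallies the resulting length when the sum is halved and rounded up. The function name is my own, and fair dice are an assumption.

```python
from collections import Counter
from itertools import product

def dice_lengths_rounded_up(n_dice=3, sides=6):
    """Tally ceil(sum / 2) over every equally likely roll of fair dice."""
    counts = Counter()
    for roll in product(range(1, sides + 1), repeat=n_dice):
        total = sum(roll)
        counts[(total + 1) // 2] += 1  # integer form of rounding total / 2 up
    return dict(sorted(counts.items()))

lengths_up = dice_lengths_rounded_up()
for length, count in lengths_up.items():
    print(length, count)
```

The shortest length that comes out is 2, so this variant cannot generate single-glyph words.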
So you're saying something like this?
1) encode source text verbosely (I'll say, any process that makes the cyphertext longer than the plaintext)
2) write this down separately
3) take dice or coins and do stuff with them to determine where spaces are inserted
4) write the line with these spaces, generated by chance, in the MS
How many spaces does the MS have? Like 20,000?
This implies no correlation between characters and spaces, so (since this correlation is quite strong in the actual manuscript) there must be a further step to alter characters to next-to-space shapes.
E.g., y.q must correspond to something entirely different when the dice roll no space.
This dice rolling idea is as high entropy as it gets, not really close to what is observed.
(05-05-2025, 09:05 AM)MarcoP Wrote: VM word length famously follows binomial distribution of f(n, 9, 0.5) + 1
The details of the binomial distribution of course depend on the transliteration system. We don't know what makes a Voynich character (EVA:a, iin, aiin?). They also depend on how the many uncertain spaces are treated.
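For reference, a small sketch of what such a distribution looks like, assuming "f(n, 9, 0.5) + 1" is read as word length = 1 + Binomial(9, 0.5). That reading and the function name are my assumptions, not something stated in the quote.

```python
from math import comb

def shifted_binomial_pmf(n=9, p=0.5, shift=1):
    """Probability of each length when length = shift + Binomial(n, p)."""
    return {
        k + shift: comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    }

pmf = shifted_binomial_pmf()
for length, prob in pmf.items():
    print(f"{length:2d}  {prob:.4f}")
```

Under this reading the lengths run from 1 to 10 and the distribution is symmetric around 5.5. Which lengths actually occur in a given transliteration still depends on the choices mentioned above.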
Stolfi observed that there are natural languages whose word length follows a binomial distribution.
Gaskell and Bowern found that spontaneous written gibberish can have a word length distribution with low skewness.
Lots of the glyphs appear as singlets in lists ([link]) or in the circles ([link]). I think we can reasonably treat those glyphs as single letters. How many glyphs are missing if we combine all those lists? Could that perhaps help to discover capital letters?
(05-05-2025, 10:09 AM)ReneZ Wrote: (05-05-2025, 08:06 AM)dashstofsk Wrote: This seems very unlikely. If you examine the writing in the manuscript you will notice that there is a fluency to it. There doesn't appear to be much stop-start in the text. The words in each line broadly keep to the same baseline. The writer isn't putting the pen down after each word to roll dice ( nor for that matter to consult any book cypher or perform any mathematical computation to determine what the next word should be ). In my opinion the writing of each page was done in one sitting in one uninterrupted rush of effort.
The script in principle allows this fluent type of writing. Now this is crucially important, because in that respect it differs from essentially all old ciphers based on invented alphabets, be they mono-alphabetic or poly-alphabetic.
However, in practice it is rarely written in a fluent manner. The baseline jumps up and down irregularly, and the direction of the baseline of individual words is not always very straight either. I am sure that the quality varies throughout the MS.
To see what I mean, just look closely at the words in the first two lines here: [link]
Many other examples (good and bad) can be found.
If the VMS was copied from a previous MS, this would remove weird starting and stopping. Also, they could have prepared the text on a chalkboard, in charcoal, or by writing in the sand before committing it to the page.
Some nice graphs about vord length distribution are in this old thread: [link]
(06-05-2025, 11:10 AM)Koen G Wrote: So you're saying something like this?
1) encode source text verbosely (I'll say, any process that makes the cyphertext longer than the plaintext)
2) write this down separately
3) take dice or coins and do stuff with them to determine where spaces are inserted
4) write the line with these spaces, generated by chance, in the MS
How many spaces does the MS have? Like 20,000?
I don't want to drift too far off the underlying topic, so I'll just briefly say
* In response to Koen: I agree, but consider labor intensiveness a 2nd order issue compared to determining if such a method can replicate key properties of the Voynich vocabulary. If so, then it's time to worry about labor intensiveness; if not, then it's irrelevant. Your mileage may vary.
* In response to Marco: I was unclear -- I still think some of the spaces (i.e., before 'q', after 'ain/aiin/etc.' -- EVA 'y' is trickier) are mechanically inserted, but it's unclear that adding them on top of the underlying word breaks works to sufficiently shorten the vords and reproduce other properties of the vord vocabulary.
* Given the number of folks who seem inclined to like the verbose cipher idea in a vague, hand-wavy way I would really like to see (in a separate thread, it goes without saying) a vigorous discussion of ideas for addressing specific problems with it such as the word length issue and the "if 'ol' and 'l' (say) are both elements of the cipher, then why don't we see 'oll' in the text?" issue.
Unless someone is inclined to spin off a new thread, I'll leave it at that.
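To illustrate the 'oll' point with a toy example: the cipher table below is entirely hypothetical and invented for this sketch, and it makes no claim about the real text. It only shows that if 'ol' and 'l' are both independent units, 'oll' falls out as soon as the two corresponding plaintext symbols are adjacent.

```python
# Hypothetical verbose cipher table, made up purely for illustration.
TABLE = {"a": "ol", "b": "l", "c": "ch", "d": "dy"}

def encode(plaintext):
    """Replace every plaintext symbol by its (possibly longer) cipher unit."""
    return "".join(TABLE[symbol] for symbol in plaintext)

def adjacent_pairs(units):
    """Every string obtained by writing two cipher units next to each other."""
    return {first + second for first in units for second in units}

print(encode("ab"))                             # 'ol' followed by 'l'
print("oll" in adjacent_pairs(TABLE.values()))
```

So a verbose cipher proposal has to explain why such a sequence is absent: either the plaintext never puts those two symbols next to each other, or some extra rule rewrites the result.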
People pointed out some problems and mistakes with my approach. Thank you for the feedback!
I started from basics and made a graph: I extracted the words from the file [link] and computed the length histogram (the relative number of occurrences of each length). On top of that I plotted the distribution of the sum of 3 dice divided by 2, this time rounded down, so a sum of 3 gives 3/2 = 1.5 => 1, sums of 4 and 5 give 2, and so on.
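A rough sketch of the two ingredients, for anyone who wants to reproduce the comparison. The function names are mine, fair six-sided dice are assumed, and `sample_words` is only a tiny placeholder list, not the actual word list.

```python
from collections import Counter
from itertools import product

def length_histogram(words):
    """Relative frequency of each word length."""
    counts = Counter(len(word) for word in words)
    total = sum(counts.values())
    return {n: counts[n] / total for n in sorted(counts)}

def dice_floor_distribution(n_dice=3, sides=6):
    """Distribution of (sum of fair dice) // 2, i.e. halved and rounded down."""
    rolls = product(range(1, sides + 1), repeat=n_dice)
    counts = Counter(sum(roll) // 2 for roll in rolls)
    total = sides ** n_dice
    return {n: counts[n] / total for n in sorted(counts)}

# Placeholder input; swap in the real word list to redo the graph.
sample_words = ["daiin", "ol", "chedy", "qokeedy", "s", "shol"]
hist = length_histogram(sample_words)
dice = dice_floor_distribution()

for n in sorted(set(hist) | set(dice)):
    print(n, round(hist.get(n, 0.0), 3), round(dice.get(n, 0.0), 3))
```

With rounding down the dice give lengths 1 to 9, so single-glyph words become possible, though only for a roll of three ones (1 in 216).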
Assuming I did not make any major mistake, it could indicate that word lengths were decided by dice or a similar method. Please note that medieval dice were not fully balanced, so if this was done with real dice one would expect some deviation from a perfect binomial distribution.
Attached is the word list I used, generated from the linked VT0e-n.txt.
(08-05-2025, 05:59 PM)tikonen Wrote: Assuming I did not make any major mistake, it could indicate that word lengths were decided by dice or a similar method.
It just indicates that the word lengths still closely follow a binomial distribution. I don't think it gives any specific hints about the underlying method, be it 9 coin tosses, 3 dice, three wheels, roman numerals nomenclator, or many other methods largely compatible with this distribution that have been suggested in the past 20+ years.
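As a quick check of how similar two of these candidates are, here is a sketch comparing the exact mean and variance of "1 + 9 fair coin tosses" with "3 fair dice, sum halved and rounded down". Both models are simplified readings of the proposals in this thread, and the names are mine.

```python
from fractions import Fraction
from itertools import product
from math import comb

def mean_and_variance(dist):
    """Exact mean and variance of a {value: probability} mapping."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, var

# Word length = 1 + number of heads in 9 fair coin tosses.
coins = {k + 1: Fraction(comb(9, k), 2**9) for k in range(10)}

# Word length = (sum of 3 fair dice) // 2.
dice = {}
for roll in product(range(1, 7), repeat=3):
    length = sum(roll) // 2
    dice[length] = dice.get(length, 0) + Fraction(1, 216)

coin_mean, coin_var = mean_and_variance(coins)
dice_mean, dice_var = mean_and_variance(dice)
print("coins:", coin_mean, coin_var)
print("dice: ", dice_mean, dice_var)
```

Both come out with a variance of exactly 9/4 and differ only by half a unit in the mean (11/2 versus 5), so a histogram alone is unlikely to tell them apart.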
If words starting "q" and/or ending "n"/"y" are discounted as mechanically inserted spaces rather than dice rolls, shouldn't they be discounted from the statistics?
Either way, I think that if you view the text purely as a sequence of lengths like "2 5 7 3 4 8 2 3", this obviously could work. However, I can't think of a way to hold it together once you account for the preferences of "words", even just at the start-middle-end level.