The Voynich Ninja

Full Version: Voynichese is a numeric cipher?
(04-06-2026, 05:57 PM)RobGea Wrote: Your observations about words could well do with their own thread.
A quick check in Notepad++ on RF1b-er.txt for "ol.daiin" gives credence to this ngram-word unit appearing more often in the Herbal sections
with a second peak in the Pharma section and also a strong absence in the Balneo & Stars sections. Quite interesting, very well observed.

As for this thread, if you could give a clear walk-through example of how this numeric cipher would work, that would be most helpful.(to me at least)
Well, first of all, I apologize that I can't give an exact example of how the numeric cipher works (I simply don't have one). In this answer I will try, both for you and for myself, to explain how the cipher used in the manuscript might operate. I should say right away that this example is inaccurate and arbitrary, and it does not prove my theory; it only shows a similar algorithm.
Here's the text to start with:
Hepatica in silvis diffusa est. Brevis rhizoma, flores parvi, caerulei vel purpurei. Folia vide, quia hepatis similes sunt et huic plantae tantum pertinent. Infusio iecoris morbos sanat. (yes, it's Latin, just for example).
First, let's try to shorten the text a bit, as medieval scribes did:
Hepatica in silv’ diffusa est. Brev’ rhizō, flor’ parvi, cerulei vel purpurei. Folia vide, q’ hepat’ simil’ sūt et huic plātē tantū ptinēt. Infusio iecor’ morb’ sanat.
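Let's move on to numerical substitution. Our alphabet is: A = 10, B = 11, C = 12, D = 13, E = 14, and so on. (The numbers below follow a reduced Latin alphabet with no J or K and with U written as V, so I = 18, L = 19, ... T = 27, V = 28.) Let's denote the abbreviations separately: is/us = 91, et = 92, ō/ū/ā = 93 (the bar on top is considered a separate character).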
In encrypted form, it will look like this:
1714231027181210 1821 2618192891 13181515282610 142627. 1125142891 251718232293 1519222591 2310252818 12142528191418 281419 2328252328251418. 1522191810 28181314 2491 171423102791 261820181991 26289327 92 17281812 23191093271493 271021272893 23271821142793. 18211528261822 181412222591 2022251191 2610211027. (We'll leave the dots for convenience.)
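The substitution step can be sketched in Python. This is only an illustration under assumptions: the reduced alphabet (no J or K, U written as V) is inferred from the numbers in the line above rather than stated in the post, and the name `encode_word` is made up for this sketch. Letters outside that alphabet are silently skipped.

```python
import unicodedata

# Assumed reduced Latin alphabet, inferred from the numbers in the example:
# no J or K, and U shares its value with V. Values start at 10.
LETTERS = "ABCDEFGHILMNOPQRSTV"
VALUES = {ch: str(10 + i) for i, ch in enumerate(LETTERS)}
VALUES["U"] = VALUES["V"]

ABBREVIATION = "91"  # the is/us mark, written ’ in the abbreviated text
ET = "92"            # the word "et"
MACRON = "93"        # the bar over ō/ū/ā, treated as a separate character


def encode_word(word: str) -> str:
    """Turn one abbreviated Latin word into a string of two-digit codes."""
    if word.lower() == "et":
        return ET
    out = []
    # NFD splits "ū" into "u" + U+0304, so the bar can be encoded on its own.
    for ch in unicodedata.normalize("NFD", word.upper()):
        if ch == "\u0304":
            out.append(MACRON)
        elif ch in ("’", "'"):
            out.append(ABBREVIATION)
        elif ch in VALUES:
            out.append(VALUES[ch])
        # Anything else (letters outside the assumed alphabet) is skipped.
    return "".join(out)


print(encode_word("Hepatica"))  # 1714231027181210
print(encode_word("silv’"))     # 2618192891
```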
Now, let's use the "letters" of Voynichese and assign them arbitrary numeric values, as we did for the Latin letters: ar = 10, o = 11, d = 12, e = 13, y = 14, s = 15, r = 16, l = 17, oi = 18, ai = 19, or = 20, ol = 21, al = 22, in = 23, iin = 24, ch = 25, p = 26, t = 27, f = 28, k = 29.
Now, let's take our text and sort the two-digit numbers within each word in ascending order:
1010121417182327 1821 1819262891 10131515182628 142627. 1114252891 171822232593 1519222591 1018232528 12141418192528 141928 1418232325252828. 1015181922 13141828 2491 101417232791 181819202691 26272893 92 12171828 10141923279393 102127272893 14182123272793. 15181821222628 121418222591 1120222591 1010212627.
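A minimal sketch of this sorting step, assuming every code is exactly two digits long (which holds for the values 10-35 and 91-93 used here). The function name `sort_pairs` is hypothetical.

```python
def sort_pairs(number_word: str) -> str:
    """Split a number-word into two-digit codes and sort them ascending."""
    pairs = [number_word[i:i + 2] for i in range(0, len(number_word), 2)]
    # All codes have two digits, so string order equals numeric order.
    return "".join(sorted(pairs))


print(sort_pairs("1714231027181210"))  # 1010121417182327
```

Sorting throws away the original letter order, which is why the result behaves like an anagram of the plaintext word.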
Now we substitute the Voynich letters:
arardyloiint oiol oiaipf aressoipf ypt oychf loialinch saialch aroinchf dyyoiaichf yaif yoiininchchff arsoiaial eyoif iin arylint oioiaiorp ptf dloif aryaiint arolttf yoiolintt soioiolalpf dyoialch ooralch ararolpt.
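The glyph substitution can be sketched the same way. The table follows the values listed in the post (ar = 10 through k = 29); codes with no glyph assigned, such as the abbreviation marks 91-93, are simply dropped, which matches the output above. The names `GLYPHS` and `to_glyphs` are made up for this sketch.

```python
# Glyph groups in the order given in the post: ar = 10, o = 11, ... k = 29.
GLYPHS = ["ar", "o", "d", "e", "y", "s", "r", "l", "oi", "ai",
          "or", "ol", "al", "in", "iin", "ch", "p", "t", "f", "k"]
TABLE = {str(10 + i): glyph for i, glyph in enumerate(GLYPHS)}


def to_glyphs(sorted_word: str) -> str:
    """Replace each two-digit code with its glyph group; unknown codes vanish."""
    pairs = [sorted_word[i:i + 2] for i in range(0, len(sorted_word), 2)]
    return "".join(TABLE.get(pair, "") for pair in pairs)


print(to_glyphs("1010121417182327"))  # arardyloiint
print(to_glyphs("1819262891"))        # oiaipf
```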
(It doesn't look like Voynichese yet. Note that the abbreviation marks 91-93 have been dropped here.)
Now, cut the text into chunks of roughly 2-4 letters each, ignoring the old word boundaries:
arar dy loiin t oiol oia pfar ar ess oip f ypt oych floi alin chsa ial charo inchf dyyoi aichf yaif yoiin inchch ff arsoi aial eyoi fiin aryl int oioiai orpptf dloif aryaiin arolt tfyoi olintt soioi olalpf dy oialch ooral charar olpt.
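The re-chunking above was done by hand, so any code can only imitate it. A rough sketch, assuming the chunk lengths are picked at random between 2 and 4 and that old word boundaries are ignored; `rechunk` and its `seed` parameter are hypothetical. It cuts at the character level, so it may split groups such as "ch" or "iin".

```python
import random


def rechunk(text: str, min_len: int = 2, max_len: int = 4, seed: int = 0) -> str:
    """Drop the old spaces and cut the letters into chunks of min_len..max_len."""
    rng = random.Random(seed)  # seeded so the same input gives the same cut
    letters = text.replace(" ", "")
    chunks = []
    i = 0
    while i < len(letters):
        step = rng.randint(min_len, max_len)
        chunks.append(letters[i:i + step])  # the last chunk may be shorter
        i += step
    return " ".join(chunks)


print(rechunk("arardyloiint oiol oiaipf aressoipf ypt"))
```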
If we change the words a little, some words similar to those found in the manuscript will begin to appear.
The result is gibberish, but sometimes you can see standard combinations like arar, aiin, oiin, etc.
What did I mean by that? Well, the example wasn't exactly successful, but I didn't set out to recreate the Voynich cipher; I just wanted to demonstrate how it could be done. By deciphering the resulting set of letters, we would have a "rebus" consisting of anagrams divided into several words. I assume that the encryption in the manuscript was done in a similar way to this example. If you noticed, some of the words in the last text can be changed again, and they will become similar to the Voynich script, such as turning the ugly "oych" into "choy".
Why is the example inaccurate? First, it is arbitrary and does not seek to reproduce the Voynich. Second, I have placed the gallows on a par with the other letters, which makes them appear much more frequently, and in the manuscript they are clearly not ordinary letters. Third, I have used Arabic numerals instead of Roman numerals (it was easier to number from 10, but perhaps I could have achieved greater consistency with Roman numerals).
In short, the manuscript's cipher should look something like this (only approximately, because there may be unaccounted-for aspects that can't be seen immediately). The code is based on Roman numerals, though, and is likely to differ from the example: it may start with a different number, and some letters may have special numbers.
Not sure if any of this is any use to you, but never know.. 

I played with something a while back, looking now I think I have deleted the files apart from a tiny notepad file. 
But essentially the idea was that if you use 1, 2 and 3 you have 27 variants of triplets (enough for 26 letters plus a full stop).
If we assign Voynich letters into 3 groups (123), and a group of nulls (0), we can start writing stuff. 

In theory it works, it's probably a better "solution" than 99% of Tiktok "solutions" (sadly). 
It also naturally creates some things we see in the text because it's so annoying to do, you end up reusing things where you can with slight variance.
But most of the problem is "why and how", the below looks voynich(ish) because I made it do that, it could look really "un-voynich". The system really is 99.9% of it imo rather than just the idea that it could come from numbers. Not that I'm against the idea, it just needs a lot of "why and how". 

So, anyway, if I wanted to say something to you in our secret squirrel code (English Variant), we could use 

111 = a
112 = b
113 = c
121 = d
122 = e
123 = f
131 = g
132 = h
133 = i
211 = j
212 = k
213 = l
221 = m
222 = n
223 = o
231 = p
232 = q
233 = r
311 = s
312 = t
313 = u
321 = v
322 = w
323 = x
331 = y
332 = z
333 = .
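The table above is just the 27 three-digit strings over 1, 2, 3 in counting order, so it can be generated rather than typed. A small sketch (the names `TRIPLET_TO_SYMBOL` and `to_digits` are made up here):

```python
from itertools import product

SYMBOLS = "abcdefghijklmnopqrstuvwxyz."  # 26 letters plus the full stop

# product("123", repeat=3) yields 111, 112, 113, 121, ... 333 in that order.
TRIPLET_TO_SYMBOL = {"".join(t): s for t, s in zip(product("123", repeat=3), SYMBOLS)}
SYMBOL_TO_TRIPLET = {s: t for t, s in TRIPLET_TO_SYMBOL.items()}


def to_digits(message: str) -> str:
    """Spell a message as a stream of 1/2/3 digits; unknown characters are skipped."""
    return "".join(SYMBOL_TO_TRIPLET.get(ch, "") for ch in message.lower())


print(len(TRIPLET_TO_SYMBOL))  # 27
print(to_digits("hi"))         # 132133
```

Turning the digit stream into Voynich-looking words is the manual part: each digit can be written with any glyph of its group, with nulls sprinkled in freely.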

(ch/sh considered 1 input)
    '1': set("echgbshxu"),
    '2': set("nrjmlv"),
    '3': set("tkpdfqz"),
    '0': set("oayi"),
 *'13': set ("cXh") - X = tkpf (Benched Gallows)


cTheol oloky olorody shor shol choky
daiin ol cThody otoley cThory
qokey oloroky dy cThy olokody
oteoly cThol shor orol oloky orokody qokey 
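For anyone who wants to try reading text written this way, here is a rough decoder sketch. It rests on assumptions about the notes above: ch and sh count as one input of group 1, lone c, h and s also fall in group 1 (they are in the set as written), a benched gallows such as cTh counts as a 1 followed by a 3, and o, a, y, i are nulls. Leftover digits that do not complete a triplet are ignored. The name `decode` is hypothetical.

```python
import re
from itertools import product

SYMBOLS = "abcdefghijklmnopqrstuvwxyz."
TRIPLETS = {"".join(t): s for t, s in zip(product("123", repeat=3), SYMBOLS)}

# Glyph groups from the notes; o, a, y, i are nulls and carry no digit.
GROUPS = {"1": "echgbsxu", "2": "nrjmlv", "3": "tkpdfqz"}
DIGIT = {ch: d for d, chars in GROUPS.items() for ch in chars}
DIGIT["ch"] = DIGIT["sh"] = "1"

# Longest match first: benched gallows, then ch/sh, then single letters.
TOKEN = re.compile(r"c[tkpf]h|ch|sh|[a-z]")


def decode(text: str) -> str:
    digits = []
    for tok in TOKEN.findall(text.lower()):
        if len(tok) == 3:
            digits.append("13")                # benched gallows: 1 then 3
        else:
            digits.append(DIGIT.get(tok, ""))  # nulls contribute nothing
    stream = "".join(digits)
    usable = len(stream) - len(stream) % 3     # drop an incomplete final triplet
    return "".join(TRIPLETS[stream[i:i + 3]] for i in range(0, usable, 3))


print(decode("qokey oloroky dy cThy olokody"))  # your
```

Under these assumptions the third line of the sample reads "your". Word breaks play no role in the decoding, only the order of the non-null glyphs.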

Hopefully that was some use, if only food for thought
Thanks to you both.

As to the 'why' question, personally I don't think it's even a problem. I mean, unless the VMS is deciphered and the reason is written there, we will never know 'why'.
Sorry, yeah you are right, what I meant was "why" as in "why they made the choices they did". Why was "qokedy" that and not "koqedy" for example, so really it's just "how".
(05-06-2026, 11:21 PM)Bluetoes101 Wrote: Not sure if any of this is any use to you, but never know.. [...]
Well, you've clearly done a better job than I have. Thank you very much!
(05-06-2026, 11:21 PM)Bluetoes101 Wrote: Not sure if any of this is any use to you, but never know.. [...]
Well, the algorithm you proposed turned out to be quite accurate, so I think it's worth considering in the context of the numerical version.
A possible next step is to apply this algorithm to Latin and adjust it to get more accurate results.
I was still impressed by that.
(05-06-2026, 11:21 PM)Bluetoes101 Wrote: Not sure if any of this is any use to you, but never know.. [...]
I decided to test your algorithm on Latin (yes, it may sound silly, as it was created for English, but it will give us an idea of the algorithm's stability). I tried to encrypt "Hepatica sanat iecur" using your alphabet. Here's what I got:
cThol chor ol or key sheey tochol cThyd cheksheey qokody tocheey cheeor olor sheety ochol qokody cThyp shor om chet okey drokod.
Bold letters indicate nulls; I added them to make the words look like Voynichese. qokody stands for whitespace.
What can I say? Of course, the output contained words that do not follow the rules of Voynichese, such as chekshee, sheet, cThyp and the ugly drokod (if you get rid of the initial dr, it becomes okod, which is better). But the algorithm also gave us repetitions in the words cheeor and sheey, and this matters because repetitions are an integral aspect of the manuscript's text.
Also, if we try to "correct" the text we received by swapping letters and adding nulls, we get the following:
cThol chor ol or key sheey tochol cThy dy cheksheey qokody otcheey cheeor olor sheety ochol qokody cThy ypar shor om chetey okey dor kody.
I added a couple more nulls and changed the words to make them more appropriate: I split cthyd into cthy dy, cthyp into cthy ypar, and drokod into dor and kody. However, this replacement can lead to information loss, making the decoded text less clear. To avoid this, you can split cthyp into cthy and py, and drokod into dr o kody. By decoding these words, you can easily reconstruct and read them.
In general, the model you proposed gives quite good results...
Thanks! 

A fun thing you can do with it also is to add nonsense to the ends of lines
A Bacon cipher also does this.
You just don't complete a valid 3 letter string, e.g. oiin ody y

On word length / information loss, it doesn't matter as the person deciphering would just count up to 3 valid letters and look up the number ref. A solid line with no spaces is as meaningful as anything else. There's quite a few nice points about a Bacon-like cipher idea that correlate with what we see in the text imo. Also a lot of problems..

On a side note, the Friedmans were big fans of the Bacon cipher (and also put considerable effort into the VMS text). Their headstone carries a Bacon cipher that wasn't noticed for quite some time. It shows how you can adapt these sorts of ideas. It was added by Elizebeth, who passed after William.

[attachment=15966]

Different fonts are used in the pattern 21211 11212 11212 r(incomplete string at the end)
In the Bacon cipher this is "WFF" - William F Friedman.
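The headstone pattern can be checked with a few lines of Python. This sketch assumes 1 stands for the "a" typeface and 2 for the "b" typeface, and uses Bacon's original 24-letter alphabet, where I/J and U/V share a code; `decode_bacon` is a made-up name. Groups shorter than five digits are treated as the incomplete tail and skipped.

```python
# Bacon's 24-letter alphabet: I/J share one code, U/V share another.
BACON_24 = "ABCDEFGHIKLMNOPQRSTUWXYZ"


def decode_bacon(pattern: str) -> str:
    """Decode space-separated groups of five 1/2 digits (1 = a-form, 2 = b-form)."""
    out = []
    for group in pattern.split():
        if len(group) != 5 or set(group) - {"1", "2"}:
            continue  # incomplete or non-digit group, e.g. the tail of a line
        index = int(group.replace("1", "0").replace("2", "1"), 2)
        if index < len(BACON_24):
            out.append(BACON_24[index])
    return "".join(out)


print(decode_bacon("21211 11212 11212"))  # WFF
```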
(07-06-2026, 10:15 PM)Bluetoes101 Wrote: Thanks! [...]
There is reason to believe that ody may be a garbage element in the manuscript itself. Examples include the words dainod and qotomody, where the combination follows the letters n and m, which are otherwise found only at the end of a word.
I also heard from Nick Pelling that the f and p symbols may be garbage symbols in some way (as I understand it, Pelling believed that the gallows were markers for "alphabets" or tables, and that f and p were false markers).
One of his arguments is that these symbols may not have been part of the original cipher alphabet (I don't think I can find this article on his website right now, so I'm recalling it from memory).
In general, the symbols k (10845) and t (6872) are statistically more common than f (499) and p (1620).
Also, the symbols f and p do not have much "connection" with other stable combinations of symbols, unlike k and t (there are combinations such as kedy, tedy, okeody, oteody, okeor, oteor, okear, otear, etc., but no combinations such as pedy, fedy, peody, feody). Most often they appear next to one-syllable combinations or attached to another word (Pshy, opchey, pchocthy).
It is worth noting that their role often seems to be service-related (they most often appear at the beginning of a paragraph), a striking example being the botanical section. Interestingly, if we remove the gallows from the supposed name of the plant, it quite often turns into a common Voynich word: paiin, kshody, fchodaiin, pchodar, pchey. This sometimes happens with k and t as well: tarar, kcheodaiin.
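The "remove the gallows" check is easy to automate for a word list. A tiny sketch, assuming only a word-initial plain gallows (k, t, f, p) is stripped; `strip_initial_gallows` is a hypothetical helper, and whether the remainder really is a common word still has to be checked against a transliteration.

```python
GALLOWS = set("ktfp")


def strip_initial_gallows(word: str) -> str:
    """Drop a single leading gallows glyph, if there is one."""
    if len(word) > 1 and word[0].lower() in GALLOWS:
        return word[1:]
    return word


for w in ["paiin", "kshody", "fchodaiin", "pchodar", "pchey", "tarar", "kcheodaiin"]:
    print(w, "->", strip_initial_gallows(w))
```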
However, they are quite often used in the astro-, cosmo- and zodiacal sections, so I believe that f and p are most likely non-numeric signs that change the meaning of a word or provide a hint for the decipherer. But I'm not sure...
This is an interesting thread! You might enjoy looking at keys to diplomatic ciphers. This folio, for example, is a ciphered letter of Francisco Despats which features some familiar-looking glyphs like 4 and the H gallows (e.g., [link]). It's a Vatican archive document (A.A. Arm. I-XVIII 5026 f.105). Related writings and some keys can be found here: [link], and if you spend much time looking at it you may find yourself down the rabbit hole of Occitan cipher discussion over here: [link].



[Image: PA2IROk.jpeg]