The Voynich Ninja

This system was inspired by Brian Cham's Curve-Line System (CLS).
A key component of CLS is that EVA: "a" is used to transition Voynich text from curves to lines. I thought of it as a switch from curves to lines.
I believe there are more "switches" and that their function can be defined.

Firstly, I map glyphs to one of 3 groups depending on whether they are constructed from a backslash \, a line | or a curve c.
If a glyph is a modified version of one of these shapes, I call the modification an addition of a curve.
I find it helpful to think of these assignments as blocks, or jigsaw pieces. Some fit, some do not.
(As adding a curve to a curve results in the same shape, I do not modify them further)

Backslash. Backslash-Curve. Curve. Line-Curve.
[attachment=10530]

# Curve
"e": "c", "g": "c", "b": "c", "s": "c", "h": "c",
# Backslash
"i": "\\",
# Backslash-Curve
"n": "X", "r": "X", "j": "X", "m": "X", "l": "X",
# Line-Curve
"t": "K", "k": "K", "p": "K", "f": "K", "d": "K", "q": "K",

The outlier in the mapping is "d": it belongs in "Line-Curve" because of how it functions in the text, but it is not constructed from a line |.
A double \ is used because the backslash is an escape character for the software; the eventual processing treats it as a single character.
I also had to compromise with | and call it "l" due to similar restrictions, which you will see below. This "l" is a shape code and is separate from EVA "l", which is mapped to "X" above.
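A minimal sketch of how the table above could be applied to a word, written in Python because the fragments look like Python dictionary entries. The names `GLYPH_SHAPES` and `to_shapes` are hypothetical, and turning unmapped glyphs into "?" is an assumption made only for illustration.

```python
# Glyph-to-shape table copied from the post.
# "\\" in Python source is a single backslash character at run time.
GLYPH_SHAPES = {
    # Curve
    "e": "c", "g": "c", "b": "c", "s": "c", "h": "c",
    # Backslash
    "i": "\\",
    # Backslash-Curve
    "n": "X", "r": "X", "j": "X", "m": "X", "l": "X",
    # Line-Curve
    "t": "K", "k": "K", "p": "K", "f": "K", "d": "K", "q": "K",
}


def to_shapes(word: str) -> str:
    """Replace each EVA glyph with its shape code ('?' if unmapped: an assumption)."""
    return "".join(GLYPH_SHAPES.get(glyph, "?") for glyph in word)


print(to_shapes("kedl"))        # KcKX
print(len(GLYPH_SHAPES["i"]))   # 1, the doubled backslash is one character
```

The switches ("o", "a", "y", "c") are not in this table yet, so they would come out as "?" at this stage.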

"switch"
If you prefer to think of this as a "transition" or "modifier", that is fine; it does not matter.
The core principle is that my "switch" will change what was on the left to a new thing on the right.

# Switch
"o": ">",
# Switch-Backslash
"a": ">\\",
# Switch-Line
"y": ">l", "c": ">l",

In the image below we have a hole in our jigsaw. Filling it is where "switches" come in. From the shape of the hole, you may be able to guess which switch we need.
"a" ends with the backslash shape, so that would be a perfect fit. "o" would also work as it has no defined end shape. "y" and "c" would not work as they end in a line.

[attachment=10531]
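The jigsaw reasoning can be sketched in Python as follows. This is only an illustration: it assumes the hole is followed by a backslash-shaped glyph, it checks the right-hand side of the switch only, and the names `SWITCHES`, `BAD_PAIRS` and `switches_that_fit` are hypothetical. The pairs used are the subset of the non-conforming pairings that start with a backslash or a line.

```python
# Switch table copied from the post ("l" stands for the line |).
SWITCHES = {
    "o": ">",     # Switch, no defined end shape
    "a": ">\\",   # Switch-Backslash
    "y": ">l",    # Switch-Line
    "c": ">l",    # Switch-Line
}

# Non-conforming pairs whose left side is a backslash or a line.
BAD_PAIRS = {
    ("\\", "K"), ("\\", "l"), ("\\", "c"),
    ("l", "\\"), ("l", "X"), ("l", "c"),
}


def switches_that_fit(next_shape: str) -> list[str]:
    """Return the switch glyphs whose end shape can sit before next_shape."""
    fits = []
    for glyph, code in SWITCHES.items():
        end_shape = code[1:]  # empty string for a bare switch such as "o"
        if end_shape == "" or (end_shape, next_shape) not in BAD_PAIRS:
            fits.append(glyph)
    return fits


print(switches_that_fit("\\"))  # ['o', 'a']
```

With a backslash on the right, "a" and "o" fit and "y" and "c" do not, which matches the description above.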

To make the code understand how this works, "non-conformances" are defined so that mismatched building blocks are flagged.

Pairs of letters which do not fit together are defined as follows.

# Pairings
r"c\\", r"cX", r"cl", r"cK",
r"K\\", r"KX", r"Kl", r"KK",
r"X\\", r"XX", r"Xl", r"XK",
r"\\K", r"\\l", r"\\c",
r"l\\", r"lX", r"lc",
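A hedged sketch of how these pairings might be checked. The post does not show how the patterns are applied, so treating them as regular expressions and testing every adjacent pair of shape codes is an assumption; `PAIR_PATTERNS`, `PAIR_RE` and `bad_pairs` are hypothetical names.

```python
import re

# Pairings copied from the post. In a raw string, \\ is a regex for one backslash.
PAIR_PATTERNS = [
    r"c\\", r"cX", r"cl", r"cK",
    r"K\\", r"KX", r"Kl", r"KK",
    r"X\\", r"XX", r"Xl", r"XK",
    r"\\K", r"\\l", r"\\c",
    r"l\\", r"lX", r"lc",
]
PAIR_RE = re.compile("|".join(PAIR_PATTERNS))


def bad_pairs(shapes: str) -> list[str]:
    """Return every non-conforming adjacent pair, including overlapping ones."""
    pairs = (shapes[i:i + 2] for i in range(len(shapes) - 1))
    return [pair for pair in pairs if PAIR_RE.fullmatch(pair)]


print(bad_pairs("KcKX"))  # ['cK', 'KX']
print(bad_pairs("cc"))    # []
```

Checking each two-character window separately keeps overlapping hits such as "cK" and "KX" in "cKX", which a single left-to-right regex scan would skip.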

Glyph sequences where a "switch" is used but the text goes from a shape back to the same shape are also flagged as non-conforming: no transition/switch actually took place.

# Sequences
r"c>c", r"c>>c", r"c>>>c", r"c>>>>c",
r"K>c", r"K>>c", r"K>>>c", r"K>>>>c",
r"X>c", r"X>>c", r"X>>>c", r"X>>>>c",
r"\\>\\", r"\\>>\\", r"\\>>>\\", r"\\>>>>\\",
r"\\>X", r"\\>>X", r"\\>>>X", r"\\>>>>X",
r"l>l", r"l>>l", r"l>>>l", r"l>>>>l",
r"l>K", r"l>>K", r"l>>>K", r"l>>>>K",
r">>>>>",

The pair and sequence lists complete our "non-conformances".
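The sequence list can be written more compactly. The sketch below is one possible formulation, not the code actually used: it assumes the patterns are searched for anywhere in a word's shape string, and `SEQUENCE_RE` and `has_bad_sequence` are hypothetical names. The quantifier `>{1,4}` stands in for the one to four switches spelled out in the list.

```python
import re

SEQUENCE_RE = re.compile(
    r"[cKX]>{1,4}c"      # ends in a curve, switches, then a curve again
    r"|\\>{1,4}[\\X]"    # backslash, switches, then a backslash-based shape
    r"|l>{1,4}[lK]"      # line, switches, then a line-based shape
    r"|>{5}"             # five switches in a row
)


def has_bad_sequence(shapes: str) -> bool:
    """True if the shape string contains any flagged switch sequence."""
    return SEQUENCE_RE.search(shapes) is not None


print(has_bad_sequence("c>c"))   # True
print(has_bad_sequence("c>K"))   # False
print(has_bad_sequence("l>>K"))  # True
```

Under the search assumption, this single pattern flags the same strings as the 29 explicit entries.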

Some glyphs/pairs require additional mapping.
"ch" and "sh" have been mapped as "c" plus a curve. Since "c" is ">l", adding a curve to the line gives ">K".
"ih" and "ish" have been mapped as "i" (backslash) plus a curve, which is Backslash-Curve "X".
"e" + Line-Curve is allowed in a separate process, so that outside of this pairing "e" functions normally. It is an interesting combination, and I have previously said that I think it might be some sort of "half-benched" gallows; since I group "d" in with the gallows, that explains all the "ed" in Currier B. As the text does it, I incorporate it as part of the system. I'm not trying to "win"; if score padding were the aim I could use much better tricks, which leads me to EVA "l".
"ld" is allowed as a pairing. This was a compromise. Other work I have seen has described "l" as the "joker" or "wildcard", the glyph that can be any shape. I found that for my work allowing just "ld" was good enough. Allowing "l" to "be anything" does bump scores up, but I feel like for the most part it is fairly rigid and performs a normal Backslash-Curve function.

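One way the additional mappings and the two allowed pairings could be handled is sketched below. The longest-match tokenizer and the way exceptions are looked up are assumptions, since the post only says they are handled "in a separate process"; `SHAPES`, `ALLOWED_PAIRS`, `tokenize` and `allowed_exceptions` are hypothetical names.

```python
# All mappings from the post in one table, including the multi-glyph units.
SHAPES = {
    "ch": ">K", "sh": ">K", "ih": "X", "ish": "X",
    "o": ">", "a": ">\\", "y": ">l", "c": ">l",
    "e": "c", "g": "c", "b": "c", "s": "c", "h": "c",
    "i": "\\",
    "n": "X", "r": "X", "j": "X", "m": "X", "l": "X",
    "t": "K", "k": "K", "p": "K", "f": "K", "d": "K", "q": "K",
}

# EVA pairs that are allowed even though their shapes would be flagged:
# "ld", and "e" followed by any Line-Curve glyph.
ALLOWED_PAIRS = {"ld"} | {"e" + glyph for glyph in "tkpfdq"}


def tokenize(word: str) -> list[tuple[str, str]]:
    """Split an EVA word into (unit, shape) tokens, preferring the longest unit."""
    tokens, i = [], 0
    while i < len(word):
        for size in (3, 2, 1):
            chunk = word[i:i + size]
            if chunk in SHAPES:
                break
        # If nothing matched, chunk is the single glyph at i and maps to "?".
        tokens.append((chunk, SHAPES.get(chunk, "?")))
        i += len(chunk)
    return tokens


def allowed_exceptions(word: str) -> list[str]:
    """List adjacent EVA units in the word that fall under an allowed pairing."""
    units = [unit for unit, _ in tokenize(word)]
    return [a + b for a, b in zip(units, units[1:]) if a + b in ALLOWED_PAIRS]


print(tokenize("chedy"))            # [('ch', '>K'), ('e', 'c'), ('d', 'K'), ('y', '>l')]
print(allowed_exceptions("chedy"))  # ['ed']
```

A pair checker could skip any adjacent units reported by `allowed_exceptions` and flag everything else as usual.
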
For testing I used "ZL - The "Zandbergen" part of the LZ transliteration effort. v. 3a"
I decided ambiguous mappings by eye and logged the results. In total my choices gained 3 conforming words which might otherwise have been non-conforming.
For the test I used the whole of Q1 excluding f1r. This is so that all of the text was Q1, Language A, Hand 1, Herbal.

In my work I have had one eye on Currier B ("ed" being an example), but this system will not work for it. It is aimed at Q1 currently; however, I plan on tackling Currier B if this seems at all useful or of any worth to anyone.

I would just like to touch on two pairings which I feel may get brought up.
"yc" and "ys". In my opinion these common pairings (amongst others) are not part of normal Voynich language. My system does not touch on the "Line as a Functional Unit" phenomenon or the like, but I do believe this is most likely at play in these cases. I think this is very clear.

[Image: ycs.jpg]

The results of my test using the above rules are here.

[Image: conf.jpg]

The full list of non-conforming words was linked here (the link is visible only to logged-in members).

Finally.
I used AI. This was for writing code only; all mapping was done by myself with no input or analysis by AI. This is around 6 months of "work", mostly just thinking and reading others' work. Code just isn't a skill I have.

I also asked it to produce its interpretation (+ my system name) from my favourite drawing in the manuscript for something to post here.

[attachment=10529]

Thank you for reading. If you have any input I would love to hear it.

If you have written anything even roughly in line with my work, I have probably read it and stolen something. So, thank you! Thanks also specifically to Rene for his transliteration and for commenting on my other post where I was trying to put this together, along with the other ninja members who helped me greatly.

Is it 7% of non-conforming word types or word tokens? Could you give the percentage for both?

This is good stuff. Essentially the observation is that repeating strokes appear very frequently in the words of the manuscript. For instance, in the word daiin there are four consecutive i strokes. In cheey there are five consecutive e strokes. If you list all words in the ZL transliteration that are longer than 10 characters you will see a lot of repeat e stroke words. Words like psheessheeor, choctheeey, tcheychedy, toeeedchy.

How to explain all this?

I feel that the author is having a bit of fun, playing with the letters. It is easy and satisfying for anyone to write in consecutive regularity. Try it yourself. You will find writing daiiiiin, cheeeeey easy. It is psychological stroking to do this. The author seems to be choosing to write in a style that is not too taxing. This feels logical. The author is doodling with words.

This can explain why i and e are the only strokes that repeat. kk, dd, oo, ll, requiring more movement of the hand, occur very infrequently. One significant occasion when oo is repeating is in the improbable and probably meaningless circular word ooooooooolar in f70r1. Here the author must surely be doing it for fun.

Unfortunately, this does rather lengthen the odds of the manuscript being meaningful. Meaningful is no longer the bookies' favourite.

There's also a system to the parts that aren't strokes though. And even without that (pace Timm et al), this whole thing is much too structured to call it meaningless. Are Roman numerals meaningless because they repeat strokes?

(20-04-2025, 09:27 AM)dashstofsk Wrote: Try it yourself. You will find writing daiiiiin,cheeeeey easy.

Then why don't these occur in the VM? There are never more than 4 successive e or i. If you look at possible sequences of minims in Latin or u-like letters in lowercase cursive Cyrillic, longer sequences than that, difficult to interpret, are not unlikely to occur in such a long text.
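The claim about run lengths is easy to check against any word list. The sketch below is a Python illustration; `longest_run` is a hypothetical helper, and the sample words are the ones quoted earlier in the thread, not the full transliteration. It counts runs of a glyph as transliterated, not individual pen strokes.

```python
import re


def longest_run(word: str, glyph: str) -> int:
    """Length of the longest unbroken run of glyph in word (0 if absent)."""
    runs = re.findall(f"(?:{re.escape(glyph)})+", word)
    return max((len(run) // len(glyph) for run in runs), default=0)


for word in ["daiin", "cheey", "psheessheeor", "toeeedchy"]:
    print(word, longest_run(word, "e"), longest_run(word, "i"))
# daiin 0 2
# cheey 2 0
# psheessheeor 2 0
# toeeedchy 3 0
```

Taking the maximum of `longest_run` over every word in a transliteration file would give the longest e or i run in the text.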

(20-04-2025, 06:33 AM)oshfdk Wrote: Is it 7% of non-conforming word types or word tokens? Could you give the percentage for both?

There's nothing in place beyond pass/fail on a word by word basis. So if the text contained a line with "lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr lr" for some reason, it would flag each of them as a negative score against the overall score, rather than ignoring it after the first instance.

I could check to see how many non-conforming words repeat, but I don't think it is many at this stage.

EDIT:

ycheor – appears 3 times
ychear – appears 2 times
ychey – appears 2 times

All other words appear just once.

I'd have to figure out how many conforming words repeat to come up with a new %
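Both percentages can come out of one pass over the word list. The following is a sketch only: `nonconformance_rates` is a hypothetical helper, `is_conforming` stands for whatever pass/fail check is in use, and the word list and the toy rule in the example are made up for illustration, not results from the test.

```python
from collections import Counter
from typing import Callable, Iterable


def nonconformance_rates(
    words: Iterable[str], is_conforming: Callable[[str], bool]
) -> tuple[float, float]:
    """Return (token rate, type rate) of non-conforming words."""
    counts = Counter(words)
    if not counts:
        raise ValueError("empty word list")
    bad = {word for word in counts if not is_conforming(word)}
    total_tokens = sum(counts.values())
    bad_tokens = sum(counts[word] for word in bad)
    return bad_tokens / total_tokens, len(bad) / len(counts)


# Toy data: "ycheor" is treated as the only failing word (illustration only).
toy_words = ["daiin", "daiin", "chol", "ycheor", "ycheor", "ycheor"]
token_rate, type_rate = nonconformance_rates(toy_words, lambda w: w != "ycheor")
print(f"tokens: {token_rate:.1%}, types: {type_rate:.1%}")  # tokens: 50.0%, types: 33.3%
```

The token rate counts every occurrence, as the current scoring does; the type rate counts each distinct word once.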

The text does like to repeat stuff at times, but that is more CLS than my system.
The core part of my system is that I think there are "switches", which Stolfi called "modifiers"; he listed these as "a, o and y".
Using this, and further defining what each of these does, it is possible to follow the patterns of words beyond repeating shapes.

In the word below there are not many repeating shapes and it certainly would not pass CLS scoring.
The key point of the word is "cho": this "o" tells me nothing curve-based can come next; the curve cannot repeat (this is the opposite of CLS).
The next glyph used is "k", which is made from a line and finishes in a curve. In "choko" the next "o" tells me the same thing: we can't have a curve next.
"i" ("chokoi") is next. I probably need to tweak my system a bit, as "sh" is next and is valid, but what I would actually expect to see is "ish" to match the shape of "i". It's interesting that this has a correction of some sort; I wonder if they wrote "sh" then realised it should be "ish". Anyway, that ends in a curve, so the next glyph must be a switch/modifier or a curve. They chose "e": "chokoishe". If the last letter were not "e" but "o", "a" or "y", we would know the next thing can't be a curve, and using the preferences of each we could make a good guess at what letter it might be.

So while CLS, and in parts my system too, incorporates this "eeee" "iiiiiii" stuff, I look at some words more like "aligator", "al-ig-at-or": each sound on the right has a modifier on the left. It's just that Voynich is very rigid with what it does, and "aligator" would end up something like "olgetar".

[Image: is.jpg]

(20-04-2025, 02:16 PM)Bluetoes101 Wrote: In the word below there are not many repeating shapes and it certainly would not pass CLS scoring.
The key point of the word is "cho": this "o" tells me nothing curve-based can come next; the curve cannot repeat (this is the opposite of CLS).
The next glyph used is "k", which is made from a line and finishes in a curve. In "choko" the next "o" tells me the same thing: we can't have a curve next.
"i" ("chokoi") is next. I probably need to tweak my system a bit, as "sh" is next and is valid, but what I would actually expect to see is "ish" to match the shape of "i". It's interesting that this has a correction of some sort; I wonder if they wrote "sh" then realised it should be "ish". Anyway, that ends in a curve, so the next glyph must be a switch/modifier or a curve. They chose "e": "chokoishe". If the last letter were not "e" but "o", "a" or "y", we would know the next thing can't be a curve, and using the preferences of each we could make a good guess at what letter it might be.

This appears to me to be too complicated. When I look at the manuscript I get the impression that the writing generally has an effortless flow to it, that the author doesn't pause mid-word to think about what should come next. Unique words like chokoiShe are always going to be something of a problem for us to explain, and especially this one given that it also contains the rare character pair iSh.

It is my belief that people are thinking too hard about the manuscript. A simple solution is probably more likely. The author isn't an automaton. The writing isn't always going to follow a fixed formal standard. Mistakes, irregularities, oddities are to be expected. The writing probably does exhibit a lot of the author's writing style, and personal preferences will show. We just need to look for them.

I think we may have our wires crossed slightly. It's just a way of documenting what Voynich words do, in general: the "preferences".

I'm not sure I follow "The writing isn't always going to follow a fixed formal standard."; entropy readings show that it does.

(20-04-2025, 11:17 PM)Bluetoes101 Wrote: "The writing isn't always going to follow a fixed formal standard."

By absence of formal standard I mean this, that the alphabet is most probably an invented one of the author, and is probably not used in any other document. There is probably no written guide to say how words must be spelt. And so there will be inconsistency. Moreover because the author clearly intended that no-one else should ever be able to read the manuscript perhaps he was not too bothered if there was some occasional carelessness in the text.

Perhaps also he just did not know how certain words had to be written. As an example, consider the English word 'berserk'. If you heard this word for the first time and did not know how to spell it, you might be tempted to write 'buzzurk', 'birsirk', 'berzurk', 'bazirk' or some other variation. Isn't this the sort of problem any author is going to face with an invented script?

Posters: Bluetoes101, oshfdk, dashstofsk, Koen G, nablator, Bluetoes101, Bluetoes101, dashstofsk, Bluetoes101, dashstofsk