The Voynich Ninja

Full Version: What is the current status of the transcription?
Pages: 1 2
I’m having trouble with the Bavarian cipher—I’m finding interesting and coherent words that, in terms of how frequently they appear, also match other Middle High German recipe books; actually, a lot of it fits together. But there seem to be certain letter combinations that sometimes work and sometimes don’t.

And that’s when I noticed this example: 9v, line 5. 

[attachment=14890]

I’m concerned with the “cc” and “ee.” If I’m seeing this correctly, I could read the first “cc” as a “tc,” while I could read the second “cc” as a “cz,” and the “ee” as a “ce”... or maybe none of that is correct, but if it is, that would be discouraging—because we have the “a/o” problem, problems with the “Gallows”... R problems... and that would completely turn all the statistical and other analyses upside down.

And so I’m asking myself:

What is the most current transcription, or the most accurate one?

Does it even make sense to ask that?

Is Eva a dead end that thousands of people have walked into, when Stolfi is already doubting it (if I’ve understood correctly)?

I’m a bit at a loss right now...

-------------
In this context, I should also mention this writing exercise: [link hidden] 0.6.5.jsp?folder_id=0&dvs=1774358703990~945&pid=19073483&locale=pl&usePid1=true&usePid2=true
The transliteration itself is not the main problem IMO. It's the choices you make when parsing and processing the text. Do you group glyphs together? Do you count every single glyph or discard some? Whether you transpose glyphs, split and shuffle words around, skip words, all those choices have greater impact on your end results.
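To make the point about grouping concrete, here is a minimal Python sketch. It assumes EVA-style transliterated words, and both the list of grouped units (`GROUPED_UNITS`) and the sample words are purely illustrative choices, not an established glyph inventory. It only shows how the same text gives different unit counts depending on whether sequences such as "ch" or "ee" are treated as one unit or two.

```python
from collections import Counter

# Hypothetical grouping choice: which sequences count as a single unit
# is exactly the decision under discussion, not a settled fact.
GROUPED_UNITS = ("ch", "sh", "ee")


def tokenize(word, units=()):
    """Split a word into units, greedily matching the longest listed unit first."""
    ordered = sorted(units, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(word):
        for unit in ordered:
            if word.startswith(unit, i):
                tokens.append(unit)
                i += len(unit)
                break
        else:
            # No listed unit starts here, so fall back to a single character.
            tokens.append(word[i])
            i += 1
    return tokens


def unit_counts(words, units=()):
    """Count units over a list of words under one tokenization choice."""
    counts = Counter()
    for w in words:
        counts.update(tokenize(w, units))
    return counts


sample = ["chedy", "shedy", "cheey", "okeey"]  # illustrative sample only
single = unit_counts(sample)                   # every character counted separately
grouped = unit_counts(sample, GROUPED_UNITS)   # "ch", "sh", "ee" counted as units

print(single["e"], grouped["e"], grouped["ee"])
```

With single characters, "e" is counted six times in this sample; with grouping, it is counted twice as "e" and twice as "ee". Any frequency or entropy figure computed afterwards inherits that choice.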
substitution is dead
(26-03-2026, 03:57 AM)RadioFM Wrote: The transliteration itself is not the main problem IMO. It's the choices you make when parsing and processing the text. Do you group glyphs together? Do you count every single glyph or discard some? Whether you transpose glyphs, split and shuffle words around, skip words, all those choices have greater impact on your end results.

I completely disagree with that. It is an extremely important question whether “cc” is read as one letter, two letters, two identical letters, or two different letters. It becomes even more important when “cc” can be identified not only as “ch,” but also as “ct” or “cz,” that is, as three letter clusters with different meanings. Are different glyphs the same or not? That changes everything, including the statistics.

Unless one assumes it’s a hoax, but as we know, there is statistical evidence for and against that... so we can’t assume that yet.

@oeesordy Yes, but not other ciphers, such as an absorption cipher. I have at least been able to show that it could be such a cipher. But for that, I would need clear transcriptions. That is why I am asking for the latest ones...
@JoJo_Jost

This link discusses the grammar and a study of which vords appear before and after a given vord. After some runs, the participants discussed the results and agreed that Latin scored 280, English 300, a random text 20, and the Voynich text 100. See post #74 in this link.


(26-03-2026, 06:54 AM)oeesordy Wrote: @JoJo_Jost
This link discusses the grammar and a study of which vords appear before and after a given vord. After some runs, the participants discussed the results and agreed that Latin scored 280, English 300, a random text 20, and the Voynich text 100. See post #74 in this link.

And?
@JoJo_Jost, to answer your question: the transliterations are definitely full of errors.
These errors are also of several different types. Going from the bottom up:

1) There are bound to be plenty of errors in the actual handwritten text on the pages of the MS. In this, I fully agree with Stolfi. We have no way of quantifying this now. If a solution is ever found, we will know then.

2) The transliterations include an interpretation of which characters are the same and which are different. Since different transcribers used different assumptions, the various files differ in that respect. The STA versions of the transliteration files tend to include all differences that have been noted (though not the RF file, which eliminates a few of the v101-based differences).

3) Finally, people have 'read' the MS differently, so there are transliteration mistakes as well.

The impact of errors 2 and 3 can be avoided by using the original images of the MS, but of course this only allows you to do manual work, no statistics.
It can also be partly quantified by doing statistics or experiments with different transliteration files in parallel.

An error of type 2 that will not be easy to catch is, for example, if in reality f and p are one and the same. Since these characters are relatively rare, this will not be a major problem, but worse cases could easily be imagined, and may well exist.
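As a rough illustration of how a type 2 question can be tested in parallel, the Python sketch below applies a merge hypothesis to a word list and compares the number of distinct word types before and after. The merge table `MERGE_F_INTO_P` and the sample words are assumptions made up for this example; nothing here claims that f and p really are the same character.

```python
from collections import Counter


def apply_merges(word, merges):
    """Rewrite a word so that characters assumed identical share one symbol."""
    return "".join(merges.get(ch, ch) for ch in word)


def compare_vocab(words, merges):
    """Return (distinct word types as transliterated, distinct types after merging)."""
    raw = Counter(words)
    merged = Counter(apply_merges(w, merges) for w in words)
    return len(raw), len(merged)


# Hypothesis only: treat EVA f as a variant of EVA p.
MERGE_F_INTO_P = {"f": "p"}

sample = ["opchedy", "ofchedy", "qopchy", "qofchy", "daiin"]  # illustrative sample
raw_types, merged_types = compare_vocab(sample, MERGE_F_INTO_P)
print(raw_types, merged_types)
```

Running the same statistic on the unmerged and merged versions side by side gives a feel for how sensitive a result is to that one decision. For rare characters the difference should be small; for a frequent pair it could be large.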

Finally, a word of caution. If a theory is not working, it may be tempting to blame the transliterations. One should be realistic about which types of transliteration errors it is reasonable to accept, and which are not acceptable.
I have found quite a few errors in voynichese.com. I don't know if that is because voynichese.com uses an old or flawed version of the transcription, or if these errors reflect errors in the latest version of the transcription. I am surprised that there has not been an effort to improve the transcription. It would be nice to have a transcription that incorporates different possible readings, such as when it is uncertain whether a space is present.
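For the uncertain-space case, a small Python sketch of what "incorporating different possible readings" could look like in practice. It assumes a transliteration line in which "." marks a definite word break and "," marks an uncertain one; treat that convention, the function name `readings`, and the sample line as assumptions for this example.

```python
from itertools import product


def readings(line, sure=".", unsure=","):
    """Yield every word segmentation of a line, treating each uncertain
    space as either present or absent."""
    parts = line.split(unsure)
    n_uncertain = len(parts) - 1
    for choice in product((True, False), repeat=n_uncertain):
        text = parts[0]
        for keep_space, part in zip(choice, parts[1:]):
            # Either turn the uncertain space into a real one or drop it.
            text += (sure if keep_space else "") + part
        yield text.split(sure)


line = "daiin.chol,dy.shedy"  # illustrative line with one uncertain space
for words in readings(line):
    print(words)
```

The number of readings doubles with every uncertain space, so this is only practical line by line or with a cap, but it would let word statistics be reported as a range rather than a single figure.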
(25-03-2026, 10:02 PM)JoJo_Jost Wrote: I’m concerned with the “cc” and “ee.” If I’m seeing this correctly, I could read the first “cc” as a “tc,” while I could read the second “cc” as a “cz,” and the “ee” as a “ce”... or maybe none of that is correct
Counting with [link], there are 10,674 ch and 4,705 ee.
Do you think that all of them need interpretation?
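A side note on counts like these: the figure for a repeated character such as "ee" depends on whether overlapping matches are counted, which matters wherever three or more e occur in a row. A minimal Python sketch, with a made-up sample word, shows the difference. The helper `count_overlapping` is written for this example and is not part of any existing tool.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing matches to overlap."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Move one character forward so an overlapping match can be found.
        start = text.find(pattern, start + 1)
    return count


word = "cheeey"  # illustrative word with three e in a row
non_overlapping = word.count("ee")            # str.count skips overlapping matches
overlapping = count_overlapping(word, "ee")
print(non_overlapping, overlapping)
```

Neither number is wrong, but two tools that differ on this will report different totals for the same transliteration file.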
(26-03-2026, 12:31 PM)Apycalops Wrote: Hello Mark,

This is a very interesting point.

We agree that transcription uncertainty (especially regarding spaces and ambiguous glyphs) can strongly affect statistical analysis.

However, in our observations, some structural patterns seem to persist even when allowing for a degree of transcription variability.

For example:
- certain token families remain contextually separated rather than merging into a single interchangeable group
- suffixes such as -dy / -ody continue to show clustering and position-dependent behavior
- these effects appear across different sections, even when considering possible segmentation ambiguity

This makes us wonder whether some of the perceived "errors" or inconsistencies might not come only from imperfect transcription, but also from underlying structural constraints that are not immediately visible.

In other words, even with a more flexible transcription, some regularities might remain, suggesting that part of the system is independent of exact glyph interpretation.

We would be very interested in your view on whether such structural stability could coexist with transcription ambiguity.

My research is focused on the rarer and more unusual words, as I believe that most of the Voynich text is composed of filler words, maybe 80%, and the remaining 20% is real text. This is what some might term a steganographic interpretation. So I am much more interested in the vords with more distinctive spellings. I published a first draft document on this, but I will publish a more complete document as soon as possible.