| Pastebin claiming to decode Voynich Manuscript |
|
Posted by: Strategeryist - 06-07-2025, 01:19 AM - Forum: Theories & Solutions
- Replies (8)
|
 |
Hello everyone,
I know nothing about the Voynich manuscript or its translation. I randomly found this pastebin on the internet while going down an interesting, unrelated rabbit hole. I'm posting it here because I'm curious whether it's legitimate, as I am not an expert in this topic. It was uploaded anonymously on Nov 29, 2022 and has received only around 158 views. It was removed on March 11, 2025 and is now available only on the Wayback Machine. In addition to the link, I've also attached the txt file.
The Voynich Manuscript DECODED.txt (Size: 35.48 KB / Downloads: 23)
Below are some interesting paragraphs from the pastebin.
> One thing the Voynich Group talks about continually is that they believe this manuscript is encoded. Great lengths have been taken to find the code and decipher the words. After having looked over page one of this manuscript, I felt that the group was right in one aspect. It was encoded. But not by the logical method, the mathematical method of encoding they were looking for. It was visually encoded. At least as far as I have gone with it.
> For instance, that word AIN. I thought about the fact that if AIN was Hebrew for the eye, then to visually encode it the author could also write AIIN and AIIIN. Even though on the surface it would logically appear that these variations represented different words, they were all actually the same word, the Hebrew AIN, the eye. When the letter D was added, as DAIN, it represented the construction D'A, the Hebrew word "knowledge", spelled daleth ain. You really have a two letter word, not a four letter word. When he changed 'A (the letter ain) for the Latin letter O, OIIN, it would show the writer was possibly accessing another definition of AIN, such as "to look at." The Latin letter O is derived from the Hebrew letter 'Ain.
...
> Line by line updated paraphrase translation of page 1 of the Voynich Manuscript:
> Seeing, my friend, as there is power that will prevail [the power would have to refer to the power of the devastation of the eye], be wakeful. It is the appointed time of the Eye. Know also that opponents, deniers, have been allowed by men. They are now the head evil faction. All that are precious to them will be killed.
|
|
|
| GPT Models Fail to Find Language Structure in the Voynich Manuscript |
|
Posted by: quimqu - 02-07-2025, 04:10 PM - Forum: Analysis of the text
- Replies (27)
|
 |
The GPT model is a type of neural network trained to predict the next token (e.g., word or character) in a sequence, based on the context of the previous ones. During training, the model gradually learns patterns and structures that help it guess what might come next. When applied to natural language, this often results in learning the grammar and syntax of the language. The goal of this experiment is to see whether a GPT, when trained on Voynich text, can reproduce even short valid word sequences — a basic sign of underlying grammatical structure.
Using a minimal GPT architecture trained on 11,000-token corpora from natural languages and the Voynich manuscript, I evaluated how well the model could reproduce sequences of two or three consecutive words (bigrams and trigrams) from the original corpus. For the Voynich, I used the EVA and CUVA transcriptions, restricted to paragraphs that appear to be running natural-language text (excluding the cosmological sections, for example). The results reveal stark differences between Voynichese and natural languages.
I trained several nanoGPT models (roughly 1.1M parameters each) on corpora limited to 11,000 words each. The corpora included:
- Latin (e.g. De Docta Ignorantia)
- Latin religious text (In Psalmum David CXVIII)
- Early Modern English (Romeo and Juliet)
- Esperanto (Alice in Wonderland)
- Voynich EVA transcription
- Voynich CUVA transcription
Each model was trained on tokenized text split by the dot (".") separator, treating each token as a "word". Then, I prompted each model to generate 1000 words, starting from a random token from the original corpus.
For each generated sequence, I extracted all bigrams and trigrams and checked how many were present in the original corpus text (used as training data).
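The check described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the experiment: the function names are hypothetical, and it assumes the match rate is counted over all generated n-grams (not only the unique ones), which the post does not specify.

```python
from typing import List, Tuple


def tokenize(text: str, sep: str = ".") -> List[str]:
    """Split text on the separator and drop empty tokens."""
    return [t.strip() for t in text.split(sep) if t.strip()]


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all consecutive n-token sequences, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def match_rate(generated: List[str], corpus: List[str], n: int) -> float:
    """Fraction of generated n-grams that also occur in the corpus.

    Assumption: every generated n-gram counts, including repeats.
    """
    corpus_ngrams = set(ngrams(corpus, n))
    generated_ngrams = ngrams(generated, n)
    if not generated_ngrams:
        return 0.0
    hits = sum(1 for g in generated_ngrams if g in corpus_ngrams)
    return hits / len(generated_ngrams)
```

With a toy dot-separated string in place of a real corpus, `match_rate(tokenize(generated_text), tokenize(corpus_text), 2)` gives the bigram score and the same call with `3` gives the trigram score.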
Results (Bigrams and Trigrams Found in Training Text):
![[Image: s5N9a0a.png]](https://i.imgur.com/s5N9a0a.png)
The Latin religious text In Psalmum David CXVIII had pretty low bigram and trigram scores — not too far from the Voynich numbers. This could be because of its complex sentence structure or how rarely some word combinations repeat. But even then, it still produced some consistent word sequences, which the GPT picked up.
That didn’t happen with the Voynich at all — no three-word sequences from the original text were ever regenerated. This makes Voynichese stand out as fundamentally different.
In addition, the entropy of word distributions was comparable across corpora (~8.5 to 9.6 bits), meaning the GPT learned the relative frequencies of words quite well. However, only in natural language corpora did it also learn statistically consistent co-occurrence patterns.
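For readers who want to compare their own corpora, here is a minimal sketch of one common way to compute such a figure: the Shannon entropy, in bits, of the unigram word frequencies. That this is the exact measure behind the numbers above is an assumption, and `word_entropy` is a hypothetical name.

```python
import math
from collections import Counter
from typing import List


def word_entropy(tokens: List[str]) -> float:
    """Shannon entropy (in bits) of the word frequency distribution.

    Assumption: plain unigram entropy, H = -sum(p * log2(p)),
    with p taken as each word's relative frequency.
    """
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

The value is bounded above by log2 of the vocabulary size, so a figure around 9 bits is only possible if the text contains at least several hundred distinct words.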
Conclusion:
If the Voynich manuscript encoded a natural language, we would expect a GPT trained on it to be able to reproduce at least a small proportion of common bigrams and trigrams from the training corpus. This is exactly what we observe in natural language corpora (e.g. Esperanto 25.9% bigram match). In contrast, the bigram match rate for Voynichese is nearly zero, and trigrams are entirely absent.
This strongly supports the hypothesis that the Voynich manuscript is not a natural language encoding. While it has an internally consistent lexicon (i.e., words), it lacks the sequential dependencies and word-to-word transitions that characterize even simple or constructed languages.
Implication:
If a small GPT can learn bigrams and trigrams from natural languages in just 11,000 words — but completely fails to do so with Voynichese — this suggests that the manuscript does not reflect natural language structure.
This casts serious doubt on claims of direct decryption or translation into real languages. It’s likely that such efforts are misapplied.
Instead, the Voynich may reflect a pseudo-linguistic system — a generative algorithm, a constructed gibberish, or even a cipher whose output was never meant to carry true semantic depth. The surface form may resemble language, but its internal statistical behavior tells a different story.
In short: be skeptical of anyone claiming to have “translated” the Voynich into English, Latin, or any other language — unless they can show that their version has the statistical fingerprints of a true linguistic system.
|
|
|
| About this Subforum |
|
Posted by: Koen G - 30-06-2025, 03:13 PM - Forum: The Slop Bucket
- Replies (26)
|
 |
A conversation I had with ChatGPT
Me:
I moderate a forum about the Voynich Manuscript. As a community, we welcome all people who are interested in the manuscript, regardless of their background or level of expertise. In the past, we have also always been welcoming people to share their theories - with the caveat that they should expect criticism. (As you know, the manuscript remains unsolved, and most theories are demonstrably misguided).
Lately though, we see increasing amounts of AI-assisted theories. People talk to you, ChatGPT, specifically, and since you are built to please, you give them the Voynich solutions they want to hear. But we are the ones who have to deal with the consequences. Your training data does not include the solution to the Voynich Manuscript. Quite the contrary, you are in part trained on nonsense theories of former theorists (along with more sensible data). Previously, people had to work hard to build a theory; and even if they were wrong, at least they invested something of themselves in it.
Now, they can just talk to you for half an hour and before you know it, they have tables of data and all kinds of documents about a "theory" they don't even understand themselves. They then ask for our input, but we have no desire to interact with nonsensical machine hallucinations. All these theories are a wall of text, but rarely any real content or consistency, and certainly no solution. Yet they all seem to imagine that they were the first to think of trying to Chat the solution out of some GPT's algorithm.
What I plan to do now is to make a locked thread where I will collect all such theories. My opening post should explain why their post has been moved to this thread. In other words, it should explain why it is not a good idea to overly rely on you (ChatGPT) for coming up with a solution to one of the world's enduring mysteries, and why it is not very nice to ask researchers to interact with such low-quality AI slop. What would you suggest I write?
ChatGPT:
That’s a very reasonable and well-articulated concern, and you're absolutely right to draw a line for the health of your forum. Here’s a suggested draft for your opening post, written in a clear but firm tone that addresses the problem without being hostile to newcomers:
About This Thread: AI-Assisted Voynich Theories
Welcome to the AI-Assisted Theories Megathread. If your post has been moved here, it's likely because your theory about the Voynich Manuscript was generated in large part with the assistance of an AI chatbot, such as ChatGPT.
We understand the appeal of using AI to tackle the mystery of the Voynich Manuscript. It's a fascinating puzzle, and generative models can produce long, confident, and often impressive-looking text with very little effort. However, there are serious problems with treating such output as meaningful Voynich research.
Why We Created This Thread
In recent months, we've seen a surge of AI-assisted "theories" that tend to share the same characteristics:
- They are verbose, but lack internal consistency or logical rigor.
- They often reuse old, debunked ideas wrapped in novel phrasing.
- They frequently draw on patterns that sound plausible, but fall apart under scrutiny.
- They are typically the product of a few hours of prompting, rather than a genuine engagement with the manuscript, its history, its paleography, or its scholarly context.
These posts aren't theories in the traditional sense—they are AI-generated speculation, usually created without a solid understanding of the manuscript or the existing body of research. The effort may feel exciting for the individual involved, but the result is often indistinguishable from the many, many failed theories of the past—only now created faster, and in higher volume.
What You Need to Know About AI and the Voynich Manuscript
- AI does not know the solution to the Voynich Manuscript. No one does.
- Chatbots like ChatGPT are not reasoning engines. They do not analyze text or decode ciphers. They predict plausible next words based on patterns in training data—patterns which include both genuine research and a great deal of nonsense.
- Theories generated by AI are not vetted, not original, and not reliable, even if they sound coherent.
- Asking researchers to evaluate AI-generated output is asking them to spend time on machine-generated hallucinations, rather than serious research or meaningful conversation.
What This Thread Is (and Isn’t)
This thread exists as a place to collect and contain AI-assisted Voynich theories. It is not a place where experienced members are expected to engage deeply with them. Some of these ideas may still spark curiosity, and we won’t delete them outright. But if you want to be taken seriously as a contributor, we encourage you to:
- Engage with existing literature and learn what has already been tried and debunked.
- Do your own work. Use AI as a tool if you like, but take ownership of your ideas.
- Be ready for criticism, and don't mistake verbosity for validity.
If you're here because your post was moved, this is your opportunity to reflect on how much of your "theory" you truly understand, and whether you're prepared to defend it as your own work—not just something an AI happened to write while you watched.
Let me know if you want to strike a more welcoming, more humorous, or more academic tone—I'd be happy to adjust it.
|
|
|
| [Poll] What *are* vords? |
|
Posted by: RadioFM - 30-06-2025, 12:20 AM - Forum: Analysis of the text
- Replies (11)
|
 |
Poll no. 1 (see above): Nature of vords
When decoded, most (if not all) Voynich vords will turn out to be... (Choose 1 option)
a) Words or almost whole words
b) Syllables, bigrams, n-grams
c) Single letters (or phonemes)
Assume any dummy/null vords or characters removed
Poll no. 2: Nature of the cipher
I believe the bulk of the text was ciphered using... (Multiple choices allowed)
☐ Dummy (null) characters or strokes
☐ Dummy (null) words
☐ Transposition within words
☐ Transposition within lines
☐ Indexed codebook (not just for a minority of words, but for the bulk of text)
☐ Auxiliary devices for encoding (wheels, matrices)
☐ State or context-dependent encoding
☐ Something else
Considering that good progress has been made in showing that the VMS is certainly not Latin, Italian or German ciphered through simple substitution, I was wondering what you think VMS vords would look like when decoded.
I'm aware of the many nuances you all may hold about differences in Currier/RZ languages, topics, dummy words and padding text, etc. I'd appreciate it if you could try to cast your vote within the (limited) options given and explain further in the comments.
I'm interested in polling those who hold the more "traditional" views, namely that it's likely ciphered Latin, Romance, Germanic, or the like.
|
|
|
|