10-04-2023, 11:04 PM
I wrote an article relating to the Voynich Manuscript for LessWrong, a website dedicated to rationality and AI Safety research.
The article is meant partly as an introduction to the topic of the VMS for that crowd, many of whom may not be familiar with the VMS. So, you might skip the first few paragraphs.
The meat of the article is the question of whether Bing AI (or, I suppose, one of the other AI large language models, or LLMs, like ChatGPT) can generate plausible continuations of Voynichese text.
Why might we in the Voynich community care? Well, if an LLM can generate high-fidelity continuations of Voynichese text that reproduce all of the statistical regularities of the original VMS, that suggests that, somewhere inside its "Giant Inscrutable Matrices," the LLM has a model of how Voynichese is produced, which might eventually give us clues as to how Voynichese was actually produced. I say "eventually" because, even if we could prove that the current generation of LLMs can generate high-fidelity continuations of Voynichese, these models don't have the "insight" to tell us how they are doing it. We would need a later, more powerful generation of LLMs to inspect the "Giant Inscrutable Matrices" of these earlier models and make the rules explicit to us.
Why might we suspect that Bing AI could generate high-fidelity continuations of Voynichese text, i.e., "speak Voynichese"? Because that is basically the task these LLMs are tailor-made for: predicting the next token across essentially all of the languages and domains of the Internet. We also know that the most recent generation of LLMs can essentially invent their own compressed languages and decompress them again across sessions. Even though the VMS corpus is tiny compared to the Internet text these models were trained on, if the rules for predicting the next Voynichese token turn out to be simpler than the general rule of predicting the next token across any Internet domain, then there is reason to believe these LLMs might already know Voynichese.
I explain the basic idea here in the article:
Quote: Humanity is essentially in the same relationship to the VMS as AI large language models (LLMs) are to the entire textual output of humans on the Internet. The entire Internet is the LLM's Voynich Manuscript. This might help give people some intuition as to what exactly LLMs are doing.
The LLM starts off with no clue about human concepts or what our words mean. All it can observe is statistical relationships. It builds models of that text that allow it to predict/generate plausible continuations of starting text prompts. In theory, with sufficient statistical mastery of the text in the VMS, humans should be able to simulate a process for generating increasingly plausible-sounding continuations of "Voynichese" in the same way that AI LLMs generate plausible-sounding continuations of English or Japanese, even if humans never "understand" a single "vord" of Voynichese. As our process becomes increasingly good at generating continuations of Voynichese that obey all of the statistical properties of the original distribution, we might say that humans would be asymptotically approaching a high-fidelity simulation of the process (whatever that was) that originally created the Voynichese.
So, in this article, I take my first stab at getting Bing AI to generate a continuation of Voynichese text. I also (futilely) try to get Bing AI to explain its method. Unfortunately, I don't think this will work as a backdoor method to find the way Voynichese was created because no current LLM has that much insight into how it decides to do the things it does. It would require a later, more powerful LLM to go back and analyze what Bing AI was doing here.
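To make the "statistical mastery" idea concrete: the crudest possible version of such a simulator is a bigram Markov chain over vords. This is nothing like what an LLM does internally, but it shows what "generate plausible continuations from statistics alone" means. The sample vords below are made-up stand-ins for a real transcription, not actual VMS text.

```python
import random
from collections import defaultdict

# Made-up stand-in for a stretch of EVA-transcribed vords.
corpus = ("daiin shedy qokeedy daiin chedy qokain shedy daiin "
          "qokeedy chedy daiin shedy qokain chedy").split()

# Learn which vords follow which -- this table is the entire "model".
follows = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    follows[a].append(b)

def continue_vords(seed, n, rng=random):
    """Extend `seed` by n vords, each sampled from observed successors."""
    out = [seed]
    for _ in range(n):
        # Fall back to the whole corpus if a vord was never seen mid-line.
        out.append(rng.choice(follows.get(out[-1]) or corpus))
    return out

print(" ".join(continue_vords("daiin", 8, random.Random(0))))
```

Because successors are sampled in proportion to how often they were observed, the output automatically inherits the bigram statistics of the input; an LLM is, very loosely, doing a vastly more sophisticated version of this.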
But before we get there: in the comments under the article, I wrote some suggestions for how I would proceed if someone wanted to continue this project and rigorously determine how well Bing AI can generate Voynichese:
1. Either use an existing VMS transcription or prepare a slightly modified one that ignores all standalone label vords and inserts a single token such as a comma [,] to denote line breaks and a [>] to denote section breaks. There are pros and cons either way. The latter option has the disadvantage of being slightly less familiar to Bing AI than what is in its training data, but the advantage of representing line and section breaks, which may matter if you want to investigate whether Bing AI can reproduce statistical phenomena like the "Line as a Functional Unit" or gallows characters appearing more frequently at the starts of sections.
2. Feed existing strings of Voynich text into Bing AI (or some other LLM) systematically, from the beginning of the VMS to the end, in chunks as big as the context window allows. Record what Bing AI puts out.
3. Compile Bing AI's outputs into a second master transcription. Analyze Bing AI's compendium for things like: Zipf's Law, 1st-order entropy, 2nd-order entropy, curve/line "vowel" juxtaposition frequencies (a la Brian Cham), "Grove Word" frequencies, probabilities of finding certain bigrams at the beginnings or endings of words, ditto with lines, etc. (The more statistical attacks, the better.)
4. See how well these analyses match when applied to the original VMS.
5. Compile a second Bing AI-generated Voynich compendium, and a third, and a fourth, and a fifth, and see if the statistical attacks come up the same way again.
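A minimal sketch of the kind of statistical comparison steps 3-5 call for, in Python. The sample vords are made-up stand-ins for the real and AI-generated transcriptions, and "2nd-order entropy" is computed here as conditional character entropy (bigram entropy minus unigram entropy), one common convention; the published VMS entropy literature uses several variants.

```python
import math
from collections import Counter

def ngram_entropy(text, order):
    """Shannon entropy (bits) of the distribution of character n-grams."""
    grams = [text[i:i + order] for i in range(len(text) - order + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def stats(vords):
    """Zipf-style top frequencies plus 1st- and 2nd-order character entropy."""
    text = "".join(vords)
    h1 = ngram_entropy(text, 1)
    h2 = ngram_entropy(text, 2) - h1   # conditional entropy of the next char
    return Counter(vords).most_common(3), round(h1, 3), round(h2, 3)

# Hypothetical samples standing in for the two transcriptions being compared.
original  = "daiin shedy qokeedy daiin chedy qokain daiin shedy".split()
generated = "daiin chedy shedy daiin qokeedy daiin shedy qokain".split()
print("original: ", stats(original))
print("generated:", stats(generated))
```

Running the same `stats` function over each compendium and eyeballing how far the numbers drift apart is exactly the step 4 comparison, just in miniature.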
There are probably ways to automate this that people smarter than me could figure out.
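On automating it: the feed-and-record loop of step 2 is only a few lines once you can call a model programmatically. Everything below is a placeholder sketch under that assumption; `query_llm` is a hypothetical stand-in, not a real API, and the chunk size would need tuning to the actual context window.

```python
def chunk_vords(vords, max_tokens=1500):
    """Split the transcription into context-window-sized chunks, in order."""
    for i in range(0, len(vords), max_tokens):
        yield vords[i:i + max_tokens]

def query_llm(prompt):
    # Hypothetical stand-in: wire this to whatever chat model you are using.
    raise NotImplementedError("connect to your LLM of choice here")

def build_compendium(vords, ask=query_llm):
    """Feed each chunk to the model and record what it puts out (step 2)."""
    outputs = []
    for chunk in chunk_vords(vords):
        prompt = ("Continue this Voynichese transcription in the same style:\n"
                  + " ".join(chunk))
        outputs.append(ask(prompt))
    return outputs
```

Repeating `build_compendium` several times over the same input gives the second, third, fourth, and fifth compendia of step 5 with no extra work.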