David and I had a great chat with Nick Pelling about a range of Voynich related topics. These include how he got into Voynich research, his views on how the study has progressed and how to advance it, and much more.
The interview can be viewed below. By now we've ironed out the technical difficulties that were plaguing our earlier interviews.
EDIT:
Transcription of the first part:
-
David: Hello and welcome to this Voynich Ninja web chat with Nick Pelling. Nick is a well-known author whose book "Curse of the Voynich" is very widely read and respected. He's also a well-known cipher enthusiast who blogs at ciphermysteries.com.
(skipping rest of intro)
-
David: Nick, I'm going to start off with an easy question: how did you first learn about the VM?
-
Nick: well, that's a computer game question. Many years ago, I was writing computer games and I got a bit bored of it and started writing a novel. The novel had the Knights Templar and Opus Dei and all kinds of things that nobody had heard of back then - it was quite fun. Then a friend of mine said: "oh, Nick, you're into this history stuff, can you check the backstory for my game?". This was Charles Cecil, who was writing the game Broken Sword. His backstory had all the usual nonsense: pyramids, infinite power, Freemasons... And one of the things he used was the Voynich manuscript. And I thought: that's odd - I thought I knew all about ciphers, since I'd always been interested, but I hadn't heard of the Voynich at that point. So I wanted to have a look at it.
Six months later I came back to it and I was hooked. Pretty much the first thing I thought was: here's something that could really benefit from clear thinking. I really want to do this. And I haven't stopped since, 'cause the Voynich is just too interesting. The novel couldn't compete. (This was about 2001.)
-
Koen: I imagine you couldn't find too much online at that point.
-
Nick: No, there wasn't much at all. I had the Copyflo from the Beinecke, a blue-tinted black-and-white thing; you could never quite read stuff on it. But we made do, and the Voynich list was actually very good back then, so that was a very good source of information - and of interesting people!
-
Koen: would you say that back in the day, there was more of a feeling like "everything is still up for grabs, for us to discover", while now the feeling is more "everything has been written already"?
-
Nick: I disagree with all of that, really. Back then, the main feeling was one of a shared community; it was quite small, compact, coherent. You had two Jims, Jim Reeds and Jim (?), and Jacques Guy, and Rene and Gabriel Landini and a whole load of people who were all into it and were all smart, and not really fighting at all. It was a different sort of vibe to what we have nowadays, and it was really great.
-
David: I think we've got to a stage now where we've discovered everything that's easy to discover in the MS, and it would take a good amount of academic research to progress from there. Would you agree with that, or do you think there are still easy wins out there?
-
Nick: I think there's tons of stuff that really needs to be looked at urgently. People keep writing papers and assuming things as if we know everything. We've made a lot of observations of scans, but we haven't really made all the scans we can. For example, just look at the last page. It's the lowest hanging fruit of the whole MS, cause it's not in code, it's not in cipher; it's just a mess. Now, someone could very easily take that to pieces in terms of what was being written there, what were the different layers, what was amended, how it was constructed. A real forensic codicology job. At that point we'd have a whole load of really basic things that we could compare to other texts, other manuscripts. We'd have a language, and we could say this was written in this dialect, in this place... and then we could go to the archives and go, OK, let's have a look, let's find all the other instances of this language in this place and broaden our minds. All of a sudden we'd have a very different kind of conversation, because of that single page. This single page seems to me, because it's got Voynichese on it, to have been written at the very time that the author wrote the manuscript, perhaps even by the author himself. That's the biggest thing you would want with any kind of mystery: you would try to get as close to the point of composition as you can.
-
Koen: Right, so you say that what we are missing to really get to the bottom of this is someone doing more studies on the physical manuscript itself?
-
Nick: exactly. That one page! That one page is a really good example of a non-cryptographic mystery. There isn't any doubt in my mind that it's not cryptographic. I think what's happened on that page, as I said in "Curse" in 2006 and before, is that this page got fainter and fainter and fainter. And then someone came along, not realizing that 400 years later we'd be able to enhance everything with computers. And they saw this faded page, and they said, "let me try and fix it", and they've written over the top of it with their best guesses of what they could see. Unfortunately it's ended up a mess. I'd be entirely unsurprised if the person who was trying to fix it up didn't actually know the language in the first place and was just making best guesses. So it's a multilayered codicological mess. And yet people continue to write *slaps table in frustration* papers on it about billy goat livers and all the rest of it. And that's great, I mean it's lovely that they've been involved and they've tried, but they're just skipping the difficult thing, which is how do you physically read it? Not how to interpret it, but how to physically read it in the first place. Just having high-res scans from the Beinecke is partially enabling, but in this case it's just not quite good enough.
-
Koen: even if you look at it on the surface and compare it to marginalia in other manuscripts, it's still really difficult.
-
Nick: Unlike the rest of the Voynich, I don't believe this is a cryptographic mystery. That one page. I believe that what we're looking at is layers of historical interpretation.
-
Koen: so you think someone messed it up?
-
Nick: Yeah, but with good spirit! Trying to do their best. But what codicology is all about is taking things that are hard to read and using physical imaging techniques to separate out the layers.
You can have what's called "glancing illumination", where you shine light really close to the surface, so you can see the indentations - you can't see the colors - where the strokes are made, on a really microscopic level. That shows you what was written. You can look at the width of the strokes and reconstruct the layers of contributions that make up the page.
And at that point, we can start talking about billy goat livers and all the wonderful things that people have suggested. But until that point, it's just premature, which is a shame. There are things that we know we should know, that we really should be looking at. But there seems to be an assumption that we have everything, and we don't.
-
David: we've been talking about marginalia, especially the last page, quite a bit on the forums. There seems to be a consensus among different researchers that it's Middle High German of some sort. You've said before that you think you can read the name "Simon" in it. Do you still think that?
-
Nick: I don't see why not. The top line has sections that seem messed around with.
-
David: on the previous page, you mention a part that you suspect contains a colophon from the author. Why do you think that?
-
Nick: if you look at all the preceding pages in quire 20, what you see is lots of short, starred paragraphs, of a certain rhythmic format. Fast forward to 116r, and we have this end paragraph. We know this is the end because the following page is blank. If you look at that final paragraph, there's a nice big gap, a nice ornate gallows at the top and a very awkward looking first word and then a whole bunch of text which doesn't look at all like the kind of text on preceding pages. And it's the very last thing that's written. That is normally where you put a colophon. If we want to understand quire 20, perhaps we should not use that last paragraph.
Even if we can't read individual words, we still should be smart enough to read the flow of the document as a whole, and just work out what the bigger picture is.
-
David: talking about that, have you made any progress on your block paradigm?
-
Nick: in code breaking generally there are two types of attack. One is where you try to attack the system itself, working backwards. You can also do a forward attack: if you find out what the plaintext is, you can work forwards from there to the ciphertext.
In WWII this happened all the time. Messages were often sent through different channels, so you have an Enigma channel, and let's say the Japanese ambassador (?) channel. The same message would go out along two paths. You might be able to break this one, but not the other one. So if you can break this one, you have a different way of attacking the other one: a forward attack from a known plaintext.
Now, in the case of the VM, it is a bunch of different things, there's no reason for thinking it's a simple shopping list or something. There's the herbal bit, baths bit, astro bit... It seems very likely they've all come from different sources. It's 200 pages of stuff, and the pattern changes.
Koen: It's what we would expect from a manuscript like this.
Nick: yeah, so we don't have a chance of finding a whole plaintext, unless we happen to find "the Voynich Manuscript" in clear somewhere - which is not likely to happen at this point. But at the same time, it may well be that we can identify particular documents that have particular structures or particular subjects.
There's a good chance that the last section is a set of recipes, short things. So if we find a document with the same structure, there's a good chance we might be able to use it as a known plaintext; we might be able to identify a block - a paragraph or a page is enough - to help us look at the Voynich text: how does this map to that? Even a single word would be enough. If you go upwards from a word, maybe to a sentence or a paragraph, then you should have everything you need to understand the VM's writing system in every single detail.
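The known-plaintext idea Nick describes can be sketched in a few lines of code. This is a toy illustration only: the words and the "cipher" alphabet below are invented, and real Voynichese would involve abbreviation, multi-glyph tokens and uncertain word breaks rather than a clean letter-for-letter substitution.

```python
# A minimal sketch of the "block paradigm": given a candidate plaintext block
# and the matching stretch of ciphertext, try to derive one consistent
# one-to-one substitution. If any letter pair contradicts an earlier pair,
# no simple substitution explains the block.

def derive_mapping(plaintext: str, ciphertext: str):
    """Return a plain->cipher letter mapping if a single consistent
    simple substitution explains the pair, else None."""
    if len(plaintext) != len(ciphertext):
        return None
    forward, backward = {}, {}
    for p, c in zip(plaintext, ciphertext):
        if forward.get(p, c) != c or backward.get(c, p) != p:
            return None  # contradiction: no simple substitution fits
        forward[p] = c
        backward[c] = p
    return forward

# Toy example: the word "recipe" against an invented cipher block.
print(derive_mapping("recipe", "qotxlo"))  # a consistent mapping exists
print(derive_mapping("recipe", "qotxla"))  # 'e' maps two ways: None
```

Even this crude check shows why a single correctly matched word or paragraph is so valuable: every contradiction rules a hypothesis out, and every consistent block pins part of the system down.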
-
Koen: have you ever found something where you thought: this is my block, this is it?
-
Nick: yes I have! For a while I was convinced that I was on the trail of a 15th-century recipe book that was going to be the same as quire 20. I pursued it, but in the end I don't believe that there is a good version of it; there are bits of it that were copied into other recipe collections. That was a strong attempt on my part. But the point of the block paradigm is mounting an attack on just a small block. The important thing isn't that we haven't found it yet; the important thing is that it gives us a mechanism we can all agree on, I think, that will break the VM. Whatever its language, if we can identify the before and the after, we can map it.
It's a way I can suggest to people: this is how we can collectively break the VM. That for me is a revolution, because in a way I can stop worrying about the cipher or the language of the text, and start worrying about how I can work with other people to find the source of these blocks.
-
Koen: I agree because even if the craziest, most unimaginable things have been done to the text, we still have a chance of deciphering it.
-
Nick: exactly.
-
David: so you still think it's definitely a ciphertext, you don't think it's some sort of exotic natural language?
-
Nick: I think we can rule out every single natural language. Every single one. There's a whole bunch of reasons for it: the language is inconsistent from section to section, and it exhibits properties that no single language, however exotic, would have. There is evidence of abbreviation that differs between sections.
Now, if instead you start by saying "it's natural language, but it's abbreviated in some way" - ah, now we're starting to have an interesting conversation! But people kind of shut off, because they tend to see natural language as an either/or thing: either it's a pure natural language, or it's some other ???? I'm not interested in. That kind of either/or is such weak thinking, such pathetically weak thinking, that we have to do better than that.
It's not a simple cipher, it's not a ???? cipher, I don't believe, and it's not an Albertian polyalphabetic cipher. We can get rid of all three of those, and at the same time we can also get rid of exotic languages. And we can get rid of random anagrams. We can say exactly what it's not. But until we start to figure out what's on the other side of the wall, we struggle. It's a combination of things, and they contribute in some kind of tangled and clever way. It's probably much simpler than we think.
-
David: could it be gibberish?
-
Nick: No, no, it's too structured. Someone could not sustain that. For example, quire 13 and quire 20, they use language in different ways. If you look at the starts of words, they're just different. I think that's direct evidence that the system used for Voynichese evolved over time. That's what Currier called Currier A and Currier B.
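The word-start difference Nick mentions is easy to check on any transcription. The two word lists below are invented stand-ins for an A-page sample and a B-page sample in EVA, not real Voynich text; with an actual transcription file you would read the words from there instead.

```python
# Counting word-initial glyphs is one of the simplest ways to see the
# Currier A / Currier B split in a transcription.
from collections import Counter

def initial_counts(words):
    """Count the first character of each (non-empty) word."""
    return Counter(w[0] for w in words if w)

a_sample = "daiin chol chor qokaiin daiin cthy".split()   # invented "A" words
b_sample = "qokedy qokeedy shedy qokain chedy lchedy".split()  # invented "B" words

print(initial_counts(a_sample))
print(initial_counts(b_sample))
# A systematic divergence between distributions like these, measured over
# whole quires, is the kind of signal Currier's A/B distinction rests on.
```

On real transcriptions the same three-line count, run per quire, already separates quire 13 from quire 20 quite visibly.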
-
Koen: But Currier also linked his A and B to different hands, so if it's for example a transcription effort, it could be that they just transcribed in a different way, which does not imply an evolution over time.
-
Nick: I think there's an interesting question to be had there. But this is a real... we're coming into the zone of questions that people haven't asked. Since Currier wrote his papers in the 1970s, too many people have just sat back and said "okay, that's that", but actually that's just the starting point. What annoys me is that even though we've got much better transcriptions than we had 40 years ago, nobody seems to be trying to build on Currier. In fact, the person who has cited Currier the most is Stephen Bax, and he cited him in order to diss him, to put him down. This is madness! The one good piece of work we've got, and Bax starts putting it down. Well no, no no! We have to be better than that. We need to build on this stuff.
-
David: What sort of statistical studies do you think can be done on the text? And do you think the electronic transcriptions have any value in that, or should we be working off the original text?
-
Nick: I think the transcriptions are hugely useful, and EVA is smarter than anybody seems to understand. People seem to have forgotten the point of EVA. Before EVA, everybody had their own transcription, they would fight over just the transcription for weeks. EVA stopped all that... However, people still haven't figured out what the shapes are.
-
David: when people use these transcriptions, there's still an assumption that these are the glyphs in the book, and that that's written in stone. There's no attempt to think "well actually, this EVA character may be composed of two glyphs in the book".
-
Koen: yeah there could still be parsing errors in EVA which kind of influence the way we think about the characters.
-
Nick: there are plenty of those. For example, in "Curse" I talked about the flourish at the end of the [n] character. I pointed out one page where the final strokes were written in a different ink. How's that? Why would the final strokes be written in a different ink? Maybe they were written in a separate pass? Maybe what was originally inscribed on the page was a dot above the [i]'s of [daiin]. There's a medieval cipher where you'd put dots in different positions to identify different characters. And maybe the end stroke was brought down from that dot to the bottom. Specifically in Currier A; Currier B is different. It was like they said "the dots are a giveaway, let's hide them more", and they patched it up, so it's a two-stage thing. So it would be really good to examine that page and other pages - there's one in the pharma section - to try and really grasp what's going on there. Cause if that's the case, there's something really funny about these end characters that we're not capturing in our transcriptions. If that is the case, then we're all over the place, then we're looking at the wrong thing.
Another thing about EVA that's problematic is the issue of spaces and half-spaces. This is not going away. I know that Glen Claston used to complain about the different flourishes on the [sh]; that's definitely a thing that should be captured. He captured it in his v101 transcription.
But even with those things, EVA is still very worthwhile. But the problem is that people just assume that we've nailed it, and that's the last word. It's not the last word, it's an early word. It hasn't been backed up by the low-level, detailed codicological studies to try and work out if it actually matches what we see.
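The parsing worry discussed above can be made concrete with a toy tokenizer: the same run of EVA letters splits into different glyph sequences depending on which letter combinations you treat as single glyphs. Both glyph inventories below are illustrative choices for the sketch, not a claimed "correct" reading of EVA.

```python
# Greedy longest-match splitting of an EVA string under two different
# hypotheses about what counts as one glyph.

def tokenize(text, glyphs):
    """Split an EVA string into glyph tokens, preferring longer matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in (3, 2, 1):
            chunk = text[i:i + size]
            if chunk in glyphs:
                tokens.append(chunk)
                i += size
                break
        else:
            tokens.append(text[i])  # unknown letter: fall back to one char
            i += 1
    return tokens

single = {"c", "h", "e", "d", "y", "s", "q", "o", "k"}
ligatured = single | {"ch", "sh", "edy"}  # treat some groups as one glyph

print(tokenize("chedy", single))     # ['c', 'h', 'e', 'd', 'y']
print(tokenize("chedy", ligatured))  # ['ch', 'edy']
```

Every statistic downstream (word lengths, character frequencies, entropy) changes depending on which of these parses you commit to, which is exactly why the "two glyphs or one?" question matters before the statistics do.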
-
Koen: so, as David implied, it might be worthwhile for some people to return to the scans and actually read the text, to get a feeling for it again, instead of just relying on a text file with the EVA transcription in it.
-
Nick: it's a really good exercise that anyone can do, and it costs nothing. Pick a page, any page, and transcribe it yourself. Don't take anyone else's word for it, just try it. Even if you try one paragraph from an A page and one paragraph from a B page, you'll learn more about EVA and how it varies than most of the people talking about EVA. Anyone can go yeah yeah qokedy blablabla, but by actually doing the transcription you'll kind of grasp what a task of epic proportions it was for Takahashi and Glen and others to put their transcriptions together.
-
David: probably a lot of work (?) yeah
-
Koen: I don't even wanna think about it!
-
Nick: exactly! And people won't even try a single page. Just do a paragraph, and then start talking about EVA. You'll get a hugely different outlook on it.