[Transcription] Stephen Bax on his research
Koen G > 21-09-2017, 09:27 PM
These parts of our interview with Stephen Bax are specifically about his own research:
[David] My understanding of your paper is that you're not translating anything, but instead you have identified ten likely nouns in both paragraphs and labels, which you assume identify the accompanying images. You have then assigned sounds to these fourteen syllables contained within the words and discovered that the sounds correspond within these words, indicating that the words probably can be pronounced, so that they are real words. This makes it likely that these words are nouns, which can then be transliterated into their English equivalents, which you are doing by identifying the accompanying images next to the text. But you are not proposing either a language or a translation. Is that a fair summary of your work?
[Stephen] To some extent. It's a little bit - you've implied that basically we identify words and then we break them down into sounds and symbols. It's actually a bit back and forward. If you identify sounds and symbols, then you build up the words. A good example would be the word which might be the word "Taurus" to identify the seven sisters stars [Pleiades] on the page with the smiley moon. If you look at that, you say: this looks likely to be the Pleiades in the constellation of Taurus, so therefore the label alongside most likely represents something to do with that. If you then say: this is probably the word Taurus in one form or another - it *could* be the word Taurus, you have to be very tentative at this stage - then the first letter may be a /t/ sound of some sort. But you can only corroborate that if you find other similar matching words through the MS where that /t/ sound is also most likely a /t/. You build it up with a process, but each one you do must be speculative and provisional, until you get an overarching scheme in which several of them coincide and you seem to see some light. So that was in a sense the process that I followed and built up the scheme which is in my 2014 paper, and you can quickly find the proposed identifications in its appendix. So yes, in a sense, your summary is a fair way of putting that, but you're right also that the end result is not an overall translation of the MS, of the language or the script. It's first of all a methodological procedure and secondly it does seem to bear some fruit when you start to match it with other possible words in the MS. The point is then that you are coming to a decoding of each sign which should in the future allow you to generate other words as you go through the MS. But it's a very long, slow and torturous process. But you're right in a sense that it's not promising "we have now deciphered the MS, we have found out what the language is".
It's more of a methodological approach to go step by step to identify certain patterns in it. It was very specifically identified as a very provisional and partial attempt to make some headway.
What I think is most valuable is the methodology behind it. Some of the identifications I would still stand by, although others are less solid. But it's the methodology which I would stand by as the most interesting part of it.
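The back-and-forth corroboration described above can be illustrated with a minimal sketch. It is not the procedure from the 2014 paper: the function name `merge_readings`, the placeholder glyphs A to E and all the readings are invented for illustration. The idea is only that each tentative word reading proposes sound values for its glyphs, and a value gains credibility when several independent readings agree on it, while contradictory proposals are flagged.

```python
from collections import defaultdict


def merge_readings(readings):
    """Collect the sound proposed for each glyph across several word readings.

    `readings` is a list of (glyphs, sounds) pairs of equal length, where
    sounds[i] is the value tentatively proposed for glyphs[i].
    Returns (consistent, conflicts):
      consistent: glyph -> (sound, number of occurrences supporting it)
      conflicts:  glyph -> {sound: count} for glyphs with clashing proposals
    """
    proposals = defaultdict(lambda: defaultdict(int))
    for glyphs, sounds in readings:
        if len(glyphs) != len(sounds):
            raise ValueError(f"length mismatch: {glyphs!r} vs {sounds!r}")
        for glyph, sound in zip(glyphs, sounds):
            proposals[glyph][sound] += 1

    consistent, conflicts = {}, {}
    for glyph, counts in proposals.items():
        if len(counts) == 1:
            sound, support = next(iter(counts.items()))
            consistent[glyph] = (sound, support)
        else:
            conflicts[glyph] = dict(counts)
    return consistent, conflicts


# Invented toy data: placeholder glyphs A-E, NOT real Voynich readings.
toy_readings = [
    ("ABC", ["t", "a", "r"]),
    ("CBD", ["r", "a", "n"]),
    ("ABE", ["t", "o", "s"]),  # B is read as "o" here, clashing with "a" above
]
consistent, conflicts = merge_readings(toy_readings)
print(consistent)
print(conflicts)
```

In this toy run, glyphs supported by only one occurrence remain purely speculative, and the clash on `B` shows the kind of case where at least one of the word identifications would have to be revisited.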
[Koen] You've said that the underlying language is likely Near Eastern, Caucasian or Asian in origin. Is this still something you think is the most likely?
[Stephen] I wouldn't say that the underlying language is Near Eastern, we just don't know at this stage. But what I did say in my paper is that there seem to be elements which derive from Near Eastern, Caucasian or Asian aspects. For example, some elements seem to come from a Persian origin. But that doesn't mean that the language itself is from those families; it could simply be borrowings. It could be that all of the words I identified are borrowings into the language. That's common with nouns, when you think of almost all the names of exotic fruits and vegetables, they derive from other languages, like "artichoke", "avocado"....
[Koen] Especially in these scientific contexts you get a lot of jargon.
[Stephen] Absolutely, a huge amount of borrowing. So I think we're not very far into suggesting what language families we're dealing with. I do think that some of the work by Marco Ponzi and Rene, Darren Worley and others on the Zodiac figures, suggesting an East European or Central European origin for some of those illustrations is quite telling and convincing. I'd not be surprised if the MS was a European production, but that doesn't necessarily mean that the language within it is a European language. If you take the example of Romani, sometimes known as the Gypsy language; if the VM is written in Romani language in a script specifically devised for the VM, it could well be written in Europe, but the origin of that particular language is far from Europe. So we could be dealing with a mixed and unexpected language, but yet produced in Europe.
[Koen] I'd like to pick up on the Romani suggestion - that was not your own idea, I think, it was developed by others on your blog? What is your view on this possibility?
[Stephen] Derek Voght has produced some nice videos on YouTube about it, and others have suggested it in the past as well. I haven't ruled that out as a possibility because basically, the most likely language that underlies the VM script is a language that has not been written down before. It's quite likely that a small group of scholars has developed a script to encode a language that they communicated in normally and to transmit the ideas of the time in terms of herbals etc into that particular group / linguistic culture. If you'd say "could it be Italian?" - well no, because the Italians in that period already had a script. So you then start to look for languages which were prominent or strong in one way or another, or had a significant community of speakers, but did not yet have their own script, and therefore needed one to be developed. Then you come to things such as Romani or others - something like Hungarian, say, which had a prominent group of speakers but did not have a strongly used script at the time. So Romani is not impossible - there would have been a need for some group of scholars to create a script for themselves. We still have too much work to do to establish what it is, but I think the arguments are very interesting. And I really appreciate the attempts by Derek and others to investigate and see whether this particular hypothesis holds water. I think it's great when people do that kind of work.
[David] You've concentrated on the first word of each herbal page, in the reasonable assumption that this is likely to be the name of the plant. Would this be common in herbals in the languages outside of Europe?
[Stephen] First, let me clarify about the first word of each page. It's perfectly possible that the first word of each page is not the name of the plant. It could be within the first or second lines. One assumption is that we will start by looking at the first word, and actually in the VM the first word on many of the herbal pages does seem to be a unique word - an infrequent word in the VM - so this could be the name of a plant. But that's definitely not an assumption you must stand by. You just explore whether it's possible. For example, the "Centaurea" page, for me that's a watertight identification, virtually nobody has suggested another possibility. And the fact that the first word of the page, and also the word of the second paragraph are virtually identical, does make it plausible that this might be the name of the plant in the picture. This is speculation, since you've just got one plant, one word, maybe a second word, but when it becomes more interesting is when you combine that with other identifications, and build up a system like the one I was trying to build up in my paper. But any single identification must be speculative! You can't insist on it. Building it up as a pattern is what's starting to give it a bit more credibility.
But to come back to your actual question: often in Western herbals, the first word on the page with the representation of the plant is the name. And that is actually common in other traditions, for example in Arabic herbal MSS, very often it's the first word, but not always! Sometimes it's a word later on in the first line, which is then underlined or highlighted in some way. But again: you are looking for likely words that might be the name of a plant, but we can't insist that it must be the first word, that's just the first place you look, but you look at it very cautiously and skeptically and you use your judgement.
[Koen] In another MS I've been studying, you always had a fixed phrase at the beginning, for example "nomen herba" and then the name of the herb. We're unfortunate that such phrases do not appear to exist at all within the VM.
[Stephen] Yeah, and particularly we're unfortunate that they didn't somehow highlight the name in a different ink, or underline, which they do in some herbal MSS. That leaves us fighting around to identify which word it might be.
[Koen] Now, we've been talking about the difficulty of identifying the plants, but that's not really your area of expertise so we'll leave that for a discussion with a medieval herbalist. But your transliterations are heavily based on the imagery. When you started writing your paper, how did you select your sources? For example you relied heavily on Sherwood's plant identifications.
[Stephen] I relied on any identifications of people who seemed to have authoritative arguments for why this plant was the plant that they identified. But also I did look back at other historical herbals to look at the way in which the plants were represented there, looking also at herbals from other traditions.
There's a Finnish biologist who put some very nice discussions on my blog about how difficult it is to identify some of these plants and some of the factors behind why these plants in the VM might look so peculiar and strange. Accepting that many of the plants may not have been drawn from life, or from dead samples, you have to use a lot of interpretation to say what this plant might be. But there are some which I think are pretty convincing and on which nobody seems to differ in terms of that being the plant. An interesting example is the juniper which I read as oror or arar, but many people have identified it as the cannabis plant, so it's difficult to say what exactly this plant is.
There is an element of uncertainty, of judgement, of accepting the inevitable variation to do with the drawings, the script, and it is a slow, tentative process, which needs to be built up over a long time. It's not something like "I looked at this plant and immediately I knew what it was and that the language is Hungarian". That kind of too speedy attempt to work through a MS like this is just foolish and doomed to failure. And actually that reminds me of the TLS article which came out recently, which... can you remember the author of that article? He basically said that the script was entirely a set of Latin abbreviations.
[Koen] Nicholas Gibbs was his name.
[Stephen] That, for me, is an example of what I call pick and mix. Where you simply take things where you think this looks like this and this looks like that, therefore... it's taken me 25 minutes and I've identified the MS as this. And that, for me, is just foolish. That's not scholarship, it's not research at all. But the problem is, it is very attractive to online newspapers to publish that kind of thing because they get a huge number of clicks. It's essentially clickbait. Your advertising revenue goes up. But if you spend ten minutes reading an article like that and you know the VM even to an intermediate level, you realize that it's not worth bothering with.
[Koen] We were surprised that he got published at all because we see these kinds of theories every day from all kinds of people, and one gets published and it's suddenly all over the internet.
[Stephen] I think the reason is clickbait, because they know that as soon as you put "Voynich" into something, you get a huge number of people clicking on it, and then you can say to your advertisers: "look, we've had a million clicks today!" But I agree with you, it's amazing that the TLS would publish something like that with so little research! I think it had almost no references to other work or scholarship in the field.
[Koen] Even though some of his ideas had already been suggested on the mailing list in the 1990s, and he doesn't refer to anything.
[Stephen] These ideas of Latin abbreviations have been around for decades. Some of the oldest accounts of the VM have been identifying possible Roman abbreviations, but they were not cited in the work! It's very disappointing when this kind of work is given credence, because then anyone who is trying more seriously to investigate the MS is also seen as some kind of hoaxer or waster. That's an aspect of the VM community which I try to avoid because it doesn't help in the ultimate endeavor of decoding and understanding this fascinating MS.
[Koen] Now that we're talking about the VM and the media: you, yourself have become almost synonymous with Voynich research. Many people started researching the MS because of your paper. How did you experience this, did you expect this at all that it would be such a hit?
[Stephen] I think you're very generous to say so. To some extent you could say that what I did three years ago has been forgotten by lots of people as well, because it is such an ephemeral world. You get a new fantastic theory which everyone says is the latest one and they forget what happened even three years ago. I hope that the approach that I adopted to the MS is one that is of value. I still would stand by it, and that is the very best way that we're ever going to make a dent into the script and the language.
And also, I have to insist again and again, the difference between the script and the language is a really important one, because it could be that the underlying language is one that is known to us, but what stops us from getting there is just the script that's lying above it. And this is an element of the research which many people just don't seem to get. Even in recent discussions on the internet you get people confusing the issues of the script and language.
If we look at the work that's been produced more recently - I think Marco Ponzi is someone whose work I really respect, and he's done some interesting stuff trying to identify symbolism using art-historical comparisons. For example one of the pages seems to represent spring, summer, autumn and winter, by looking at the clothing of the different characters depicted on the page. And that kind of thing is useful and well beyond my capacities. But it can lead us to look closely at the words around the figures and say, is there one word here which might be the word "spring"? etc. And then try to work out what those words might be, how the letters link to other words etc. and do it in a very systematic way, and gradually - over years - move towards identifying a possible underlying language.
But it is a long, slow endeavor, and the problem with the media is that they want quick, immediate solutions. If you look at the decipherment of Linear B in the 1950s, that is an endeavor that started seriously in the 1920s! With a very systematic approach, following more or less the same approach that I've been using. It took 30 years to identify the fact that the script was encoding Greek! A language everybody knew. It was almost staring us in the face, but because the script was so opaque, it took 30 years of systematic and careful work with a number of different people, and many scholars in Greece actually insisting it was not Greek, and then it was shown that it was actually Greek. And that took 30 years. So for us to expect an overnight discovery is really misunderstanding how translation and decipherment works in this kind of case.
[David] To pick up on your example about script and language: over here in Spain there's a late medieval tradition of ???? where the Arabian speakers of Al-Andalus weren't allowed to write Arabic, so they kept writing Arabic, but using the Latin script, which was legal. These manuscripts are fascinating because they appear to be written in no language we understand, but you have to read them aloud, and that language turns out to be Arabic.
[Stephen] That also illustrates the fact that in medieval Europe there was a huge mixture of languages and scripts. With standardization, we, particularly in monolingual England, think: "oh, there's only one language and script, why would anybody mix them?" But of course, people have been mixing them in the way that you describe for centuries! So I think you have to see the VM in that kind of context.
[.............................]
[Koen] As we've said, there has been some criticism on your work, notably by Nick Pelling. One of his points of criticism is that you map three glyphs to the sound /r/. Is that a correct summary?
[Stephen] When I read the critique about the /r/, I almost fell off my chair laughing because many different languages have two sounds in the region of /r/. A very obvious one is Spanish. Pero means "but" and perro means "dog". Those two r-sounds are in the same area phonetically. In fact, in Spanish, they are meaningful, significant differences, a /r/ and a rrrr-sound which are quite different. English has this too, in Scottish, but there they don't happen to have a meaning difference.
Many languages have this. In Spanish they are encoded into the script by a single "r" versus a double "r", but in other languages they are encoded into the script in different letters completely. (Arabic example where three sounds in the region of /d/ get a different letter shape.) That's rudimentary linguistic knowledge. So if the VM happens to have two letters for an /r/, that would be nothing unusual at all! It just depends on the language that it's trying to encode.
Now, the third of them, EVA [m] is in my view a terminal form, a form that is used at the end of words, lines or paragraphs, which is for me a terminal form of EVA [r]. Somebody who doesn't know anything about linguistics will say: "what do you mean, a terminal form? The same letter with a different shape?" Well yes! Arabic is a perfect example of this. You have a different shape of the same letter depending on whether it's in the beginning, middle or end of the word. It's the same letter but with a different shape. So for me as an Arab linguist, it's not at all unexpected to find a script where you have - for example Voynich [r] is the most common form of /r/ through the MS. But you have EVA [m] as the terminal form which is used as a decorative form with a tail. And this character hugely comes at the end of words, lines, paragraphs...
Again, that, for me, is year one linguistics. I wouldn't stake my house on it, because all of this needs to be researched and developed. But the fact that it's a possibility should be indisputable to anybody with some basic linguistic knowledge.
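The claim that a glyph behaves like a terminal form is something that can be checked by counting positions. What follows is only a sketch under stated assumptions: the function name `final_position_share` and the toy words are invented, and each character is treated as one glyph, which is a simplification for a real transliteration. It measures what share of a glyph's occurrences fall at the end of a word.

```python
from collections import Counter


def final_position_share(words):
    """For each glyph, return the fraction of its occurrences that are word-final."""
    total = Counter()  # all occurrences of each glyph
    final = Counter()  # occurrences in the last position of a word
    for word in words:
        if not word:
            continue
        total.update(word)
        final[word[-1]] += 1
    return {glyph: final[glyph] / total[glyph] for glyph in total}


# Invented toy words, with Latin letters standing in for glyphs.
# This is NOT a sample of the actual manuscript text.
toy_words = ["karm", "rokam", "arol", "darm", "ror"]
shares = final_position_share(toy_words)
for glyph, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(glyph, round(share, 2))
```

A glyph with a share close to 1.0 appears almost only at word ends, which is compatible with a terminal form but does not by itself show which other glyph it would be a variant of. The same count could be repeated for line-final and paragraph-final positions.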
[Koen] The problem, probably, that Pelling sees is that if you have to reserve one glyph already for a form of a sound that comes at the end of the word, your phoneme inventory becomes really small. That's something you hear a lot.
[Stephen] Really small? How really small? If we've got 24 main symbols, there's still perfectly enough - especially if some of the vowel sounds are not written down. I don't see that as a significant objection. But until we have the full set of sound symbol correspondences, we can't say. But to say that it's a possibility seems to me to be just plain linguistic logic. If that objection is supposed to say "oh, the whole scheme is not worth thinking about", then I think that's just very poor criticism.
[David] I think that indicates that whoever devised the script gave special prominence to that sound. As you say in Arabic...
[Stephen] I think - sorry to interrupt you David - I think that there are quite a few other symbols that could be terminal forms. If you look at a page of the VM script, very often at the end of the word you see a flourishy one that goes down a little bit. That for me as an Arabic linguist, is a very normal feature. The way you write Arabic, very often at the end of it, there's a character which kind of flourishes.
[Koen] The one that looks like a /9/ is like an /a/ with a flourish basically.
[Stephen] Exactly. EVA [o] could have EVA [y] as a flourished version of it, and I think that's an interesting argument.
[.............................]
[Koen] You have focused on what you basically think are labels, single words, or the name of the plant. But if you go from there and you take the sound values that you have proposed and apply them to the whole paragraph, you get a questionable result. Why do you think that is the case, that it's so difficult to translate a paragraph, even if you have proposed a value for about half of the glyphs?
[Stephen] The proposed sound values that I put in my paper are not even half, so we've still got a lot more to do. But if you then did put it into a paragraph - you say that what you'd have would be gobbledygook, well no, what you'd have would be a very partial representation of a language that we still can't identify. So of course it will look like gobbledygook. The test will be of course, at the end of the day, if we can identify more and more sound-symbol correspondences and build up to 85% of the whole of the MS..... I mean, the key thing of course will be when you can use the correspondences that we've got and apply them to, say, one of the star labels, and say, "aha, this really does look like the name of a star!" Then it starts becoming productive and valuable. But we're still a long way away from that. People like Derek, for example, have tried to show that it does actually work productively for some of the star labels. But I think you're right, we've still got to do some research before we can move from the individual words onto sentences, paragraphs and so on.
If you transliterated an Armenian sentence into Latin script, then took out 60% of the letters, and made the assumption that it was actually a poor representation of the language, what you'd get would be gobbledygook. But actually it would be Armenian, but messed up! So I don't see any issue, with the VM, we're still at a stage of ignorance. We have to go, unfortunately, slowly, surely, methodically. And we have to throw away pick-and-mix approaches, which is what we've seen far too much of recently. Where somebody picks a few things and says aaaah it must be Icelandic.... We've just got to focus very systematically, proceeding step by step, and I think we're in for a ten, fifteen year window.
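The point about partial correspondences can be made concrete with a small sketch. It uses an English phrase as a stand-in, not Armenian or Voynich text, and the function names and the `known` mapping are invented for illustration. Only a few symbols have a proposed value; everything else is shown as a placeholder, and a coverage figure says how much of the text the mapping reaches.

```python
def partial_transliterate(text, known, placeholder="?"):
    """Replace symbols that have a proposed value; mark all others with a placeholder.

    Whitespace is kept so that word boundaries stay visible.
    """
    out = []
    for ch in text:
        if ch.isspace():
            out.append(ch)
        else:
            out.append(known.get(ch, placeholder))
    return "".join(out)


def coverage(text, known):
    """Fraction of non-whitespace symbols that have a proposed value."""
    symbols = [ch for ch in text if not ch.isspace()]
    if not symbols:
        return 0.0
    return sum(ch in known for ch in symbols) / len(symbols)


# Stand-in example: an English phrase with only four symbols "deciphered".
sample = "the quick brown fox"
known = {"t": "t", "e": "e", "o": "o", "r": "r"}  # hypothetical partial mapping

result = partial_transliterate(sample, known)
print(result)                   # t?e ????? ?ro?? ?o?
print(coverage(sample, known))  # 0.3125
```

The output is unreadable even though the underlying text is ordinary English, which is the situation described above: a low-coverage mapping applied to running text looks like gobbledygook whether or not the individual values are right.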