Back in 2014 the comparison of the central part of the VMs cosmos (f68v) with the 'Oresme' image, BNF Fr. 565 fol. 23, by E. Velinska was a major discovery for VMs investigation. The significant discrepancy between the two representations was the outer wheel and curved spokes of the VMs illustration.
Earlier this year, Linda (I believe) was the first to post to this forum an image of a diagram relating to the Eight Phases of the Moon, based on the work of Anania of Shirak / Anania Shirakatsi, that has a very similar structure of a wheel with eight curved spokes.
It seems reasonable to suggest that a combination of these two illustrations will produce a cosmic representation that has all the structural parts found in the VMs cosmos.
It's now becoming much better known that certain glyph sequences are far more common in either Currier A or Currier B, which is good. But I was wondering today: if certain glyph sequences are so much rarer in A or B than in the other, why do they occur in the other at all? Surely if they were (let's say) 'prohibited' by some implicit rule, we shouldn't see any instances of them at all?
And so I started looking at occurrences of popular glyph patterns in the Currier language where they were less popular. My first target was EVA chd (pink) and EVA shd (cyan), not in B (where they are most popular) but in A (where they are far rarer). Here's a voynichese.com query for this:
[link]
What was immediately apparent to me from this was that in Herbal A pages, EVA chd/shd occur most often either in the last word of a line, or in the last word immediately before the text is interrupted by a drawing, e.g.:
[link] f9r [link] f15r (twice) [link] f24r [link] (four times) [link] f30r [link] f47r (twice) [link] f56r [link] (twice) f93v
I believe that this sits far beyond the realms of simple probability, and I suspect that what we are seeing here is visual evidence for some kind of systematic contraction in A pages in places on the page where the desired text is larger than the space available for writing. I suspect there may well be many more (and not just EVA -m words).
Has anyone done any tests on letter sequence properties specifically on words that are either line-final or are just before an interrupting drawing?
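A test of this kind could start from something as simple as the sketch below. It only separates line-final words from all other positions; words before an interrupting drawing would need extra markup from the transliteration file, which is not modelled here. The function name and the toy lines are made up for illustration and are not real transcription data.

```python
from collections import Counter

def position_counts(lines, patterns):
    """Count pattern-bearing words in line-final vs. other positions.

    `lines` is a list of strings, one per manuscript line, with words
    separated by spaces. Drawing interruptions are not modelled.
    """
    counts = Counter()
    for line in lines:
        words = line.split()
        for i, word in enumerate(words):
            if any(p in word for p in patterns):
                key = "line_final" if i == len(words) - 1 else "elsewhere"
                counts[key] += 1
    return counts

# Toy lines, invented for illustration only (not real transcription data).
toy_lines = [
    "daiin chol shol chdy",
    "chor shdy daiin",
    "qokol chol shdam",
]
print(dict(position_counts(toy_lines, ["chd", "shd"])))
```

To judge whether a line-final excess is meaningful, the counts would still have to be compared with the share of all words that are line-final on the same pages.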
In this video I show a hidden cartoon that appears when you overlay two pictures. This is a series of 10 videos.
I was recently thinking about palindromes (words that read the same backwards and forwards).
All European languages (that I know of) have single-word palindromes, but this effect seems to be almost absent in the VM.
[link] are simple three-letter words.
The only longer palindromes seem to be unique in all cases with the exception of oeeo, which appears three times in the manuscript.
Here are the ones I've spotted (quite possibly I've missed some, this was only a quick count):
dydyd (f1)
seees (f3)
oeeo (f6v, f72v2, f101v2)
ykaky (f55v)
yekey (f69v)
ylaly (f73v)
lolol (f72v)
shchs (f113r)
The low number of palindromes is of course to be expected, due to the position awareness of glyphs.
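A quick count like this is easy to automate. The sketch below is only an illustration: it compares transliteration characters one by one, so the result depends on the alphabet used (EVA ch counts as two characters here), and the minimum length of 4 is my own assumption to skip the three-letter words.

```python
def palindromic_words(words, min_length=4):
    """Return the distinct words that read the same in both directions.

    Comparison is per transliteration character, so the result depends
    on the alphabet used (EVA 'ch' counts as two characters here).
    """
    return sorted({w for w in words if len(w) >= min_length and w == w[::-1]})

# Words quoted in the post, plus two non-palindromes for contrast.
sample = ["dydyd", "seees", "oeeo", "ykaky", "lolol", "shchs", "daiin", "chol"]
print(palindromic_words(sample))
```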
It's possible that such palindromes are actually the result of misspellings, and this could give us some concrete examples of such scribal errors within the corpus, allowing us to correct scribal errors and reduce erroneous words from the transcription.
For example, taken at random because it made me laugh out loud, lolol: lo appears 15 times as a word by itself and 182 times word-initially, while ol appears 3052 times word-finally and 538 times as a word by itself. But lo*ol only gives two results, lolol and lolkeol. The second is more likely to be two words run together, as both lol and keol are common words. This suggests to me that lo and ol have well-defined functions but shouldn't be used together: the scribe made a mistake with lolol, and missed out a space in lolkeol. What was the mistake in lolol? Well, the prior word is checkho, which is unique. If we move the first l over, we get checkhol, which appears twice in the corpus. We now have checkhol appearing three times, followed by olol, which appears 18 times in the corpus.
So we have now removed three unique words from the corpus in a logical manner!
No idea if we can do this with the rest of them; it's getting late and I'm tired now. Has anyone done any research into this angle, or into reducing the number of unique words by seeing if they can be exploded and reassembled with adjoining words?
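The "explode and reassemble" idea can be sketched as a search over shifted word boundaries. This is a minimal illustration, not a tested method: the function name and the `max_shift` parameter are hypothetical, and the counts in the example are simply the two figures quoted above.

```python
def boundary_shifts(prev_word, word, counts, max_shift=2):
    """Suggest re-divisions of two adjacent words into attested words.

    Moves up to `max_shift` characters across the word boundary in either
    direction and keeps the splits where both results occur in `counts`.
    """
    joined = prev_word + word
    cut = len(prev_word)
    suggestions = []
    for shift in range(-max_shift, max_shift + 1):
        new_cut = cut + shift
        if shift == 0 or not 0 < new_cut < len(joined):
            continue
        left, right = joined[:new_cut], joined[new_cut:]
        if counts.get(left, 0) > 0 and counts.get(right, 0) > 0:
            suggestions.append((left, right))
    return suggestions

# Counts quoted in the post; the two unique words are left out on purpose.
counts = {"checkhol": 2, "olol": 18}
print(boundary_shifts("checkho", "lolol", counts))  # [('checkhol', 'olol')]
```

On a full word-count table this would flag many candidate splits, so some threshold on the counts would probably be needed before treating any of them as scribal errors.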
This is a subject that has been put forward by Nick Pelling, for instance [link].
NickPelling Wrote:Koen: I’ve been saying for some time that I think the next big “step up” in Voynichese study will come when some clever person finds a way to map between A patterns and B patterns, i.e. to normalize the two (errrm… actually several) parts into a single thing.
But to do this properly, you need to parse A and B, build letter contact tables for them, and then build state machine ‘grammars’ that capture how the two behave – the stuff that’s the same is probably the same, but the stuff that’s different probably involves something that was written as XXX in A being written as YYY in B. Normalizing A/B would involve being able to say “XXX == YYY”. However, this rests on the back of parsing, letter contact tables, and state machines, which (I think) steganographic tricks are disrupting. So I’m still not at all sure how we get over all the technical hurdles to get to a state where we can approach this in a rigorous enough way.
But perhaps some of these XXX == YYY equivalences can be worked out even without all that machinery. For example, I have long strongly wondered whether daiin daiin patterns in A reappear (in some way) as qotedy qokedy patterns in B. Clearly, both involve repetitive “bla-bla-bla” word sequences that are hard to reconcile with either linguistic readings or crypto theories. And given that I’ve previously speculated whether daiin daiin might be enciphering Arab numerals, it would be logical for me to speculate whether qotedy qokedy might be doing the same (but in a different way). Just a thought.
I understand that the subject is extremely complex and I doubt I can contribute much. But I think that Nick has described a promising area for further research and it could be interesting to discuss ideas and possible approaches, even if there is not much hope that we can make serious progress.
My admittedly superficial take on the problem would be to see it as some kind of optimization: find the set of N rewrite rules converting A into B (or vice versa) so that some measure of the difference between A and B is minimized.
Even this simplistic approach poses a few questions e.g.:
* how to represent Voynichese? (as a first step, I would just experiment with a few different transliteration systems, e.g. EVA, Cuva, Currier)
* how many rewrite rules should be defined? (this is another area where one can experiment with different values for N)
* should one map A into B or vice-versa?
* is it better to compare the whole of A vs the whole of B, or to just consider the more "extreme" sections, e.g. mapping HerbalA into Bio? what to do with the intermediate Astro / Cosmo / Zodiac sections?
* how to measure the difference to be minimized? bigram histograms? word histograms? frequency of repeating word-combinations (which could address the daiin/qokedy issue mentioned by Nick)?
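To make the optimization idea concrete, here is a minimal sketch of one possible objective: the L1 distance between within-word bigram histograms after applying a candidate rule set to A. Everything in it is an assumption for illustration (the function names, plain substring rules, and the choice of bigrams as the measure), and the toy strings are invented rather than taken from any transcription.

```python
from collections import Counter

def bigram_distribution(text):
    """Relative frequencies of within-word character bigrams."""
    counts = Counter()
    for word in text.split():
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    total = sum(counts.values())
    return {bg: n / total for bg, n in counts.items()} if total else {}

def l1_distance(p, q):
    """Sum of absolute frequency differences over all bigrams."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

def apply_rules(text, rules):
    """Apply (old, new) substring rewrite rules in order."""
    for old, new in rules:
        text = text.replace(old, new)
    return text

def score(text_a, text_b, rules):
    """Distance between rewritten A and B; lower means a better rule set."""
    return l1_distance(bigram_distribution(apply_rules(text_a, rules)),
                       bigram_distribution(text_b))

# Toy strings, invented for illustration only.
toy_a = "chol chol daiin"
toy_b = "chedy chedy daiin"
print(score(toy_a, toy_b, []), score(toy_a, toy_b, [("chol", "chedy")]))
```

A search over rule sets would then try to minimize `score` for a given N; swapping the bigram histogram for word histograms or repeated word combinations only changes `bigram_distribution`.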
Torsten recently posted [link] a table of words that seems to me a way to get some "feel" for what is going on. His table "lists the four most frequent 'ch/sh'-words for different sections". He describes the phenomenon as "the shift from 'chol/chor' via 'cheol/cheor', 'cheo/sheo', 'chey/shey' to 'chedy/shedy'".
I expanded on the idea, focussing on ch-words only and extracting the 30 most frequent word types in each section. I used the Zandbergen-Landini transcription, ignoring uncertain spaces and text-only pages; I joined Astro / Cosmo / Zodiac pages into a single section. Sections are sorted from "strongly-A" to "strongly-B", as discussed by Rene at the end of [link]. For each word, I include the % of occurrences in each section.
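For anyone wanting to repeat this kind of extraction, a rough sketch follows. It is not the script used for the table: the function name is hypothetical, the toy token lists are invented, and it assumes the percentage is taken relative to all word tokens in the section.

```python
from collections import Counter

def top_words_by_section(sections, prefix="ch", top_n=30):
    """Return {section: [(word, percent_of_section_tokens), ...]}.

    `sections` maps a section name to its list of word tokens. The
    percentage is relative to all tokens in that section (an assumption).
    """
    table = {}
    for name, words in sections.items():
        if not words:
            table[name] = []
            continue
        counts = Counter(w for w in words if w.startswith(prefix))
        table[name] = [(w, 100.0 * n / len(words))
                       for w, n in counts.most_common(top_n)]
    return table

# Toy token lists, invented for illustration only.
toy = {
    "HerbalA": ["chol", "chor", "chol", "daiin"],
    "Bio": ["chedy", "shedy", "chedy", "chedy", "qokedy"],
}
for section, rows in top_words_by_section(toy, top_n=2).items():
    print(section, rows)
```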
Assuming I have not made major errors, one can see at least four different patterns:
the two ch-words that are most frequent in HerbalA (chol, chor) have smaller and smaller frequencies as you move towards B;
symmetrically, there are words that are rare in A and progressively more frequent in B (cheey, chckhy);
there are words that do not appear in A and are frequent in B (chedy, chdy); this asymmetry could be useful in choosing the direction of the mapping A->B or B->A;
There's quite a bit of confusion going round.
Nick used the word 'parsing' correctly in [link], although there is a point to be made about transcription vs. transliteration. However, that's not my purpose here.
One can transliterate the Voynich MS text using any system, be it Eva, Currier, v101 etc. The result is a text file.
It is this text file that we want to parse as part of statistical analysis.
So:
- step 1 is to create a text file using some definition;
- step 2 is parsing this text file with the aim to figure out which are the real 'units' of the Voynich MS text.
For this second step I used to rely on BITRANS, a tool made in the 90's by Jacques Guy, but it was only available as a DOS command line tool, and still worked in early versions of Windows. It no longer seems to work, due to changes in Windows, and I never saw a Unix / Linux version.
[link] is Dennis Stallings' page pointing to a download.
This tool was perfectly fitted for the parsing task. It was used extensively in the definition of the Eva alphabet, and the creation of the interlinear file by Gabriel Landini and Jorge Stolfi.
Just to illustrate, the example given by Nick in the above-mentioned post:
Quote:Task #2: Parsing the raw transcription to determine the fundamental units (its tokens) e.g. [qo][k][ee][dy]
is easily done by defining substitution rules. BITRANS then allows these rules to be applied in both directions.
It also allows context-dependent rules to be defined, for example at the start or end of words.
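As an illustration of what parsing into units amounts to, here is a small greedy longest-match tokenizer. It is not BITRANS and does not use its rule syntax; the function name is hypothetical and the unit inventory is chosen only to reproduce the quoted [qo][k][ee][dy] example.

```python
def tokenize(word, units):
    """Split `word` into units by greedy longest match, left to right.

    Characters that start no listed unit become single-character tokens.
    `units` is assumed to contain no empty strings.
    """
    ordered = sorted(units, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(word):
        match = next((u for u in ordered if word.startswith(u, i)), word[i])
        tokens.append(match)
        i += len(match)
    return tokens

# Hypothetical unit inventory, chosen only to reproduce the quoted example.
units = ["qo", "k", "ee", "dy", "ch", "sh", "ol", "ai", "in"]
print(tokenize("qokeedy", units))  # ['qo', 'k', 'ee', 'dy']
```

Which units are the "real" ones is exactly the open question, so in practice one would compare the statistics produced by several competing inventories.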
Now this tool seems lost, but I have been making good progress with a revival implementation. It does not allow multi-pass parsing, but it does support most other features that I used to find important.
The context-dependent rules are not needed when converting between different transliteration alphabets, but they are likely to play a role when parsing the results for interpretation of the text.
Just to give a simple example, the following definition file changes:
- 'con' at word starts into '9'
- 'us' at word ends into '9'.

#con #9
us# 9#

It changes this text:
consensus contract proconsul tempus couscous
into this:
9sens9 9tract proconsul temp9 cousco9
By specifying the 'backwards' option, the same command with the same definition file changes the second back into the first.
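For readers without access to the tool, the behaviour of this example can be imitated in a few lines of Python. This is only a sketch of the same idea, not the BITRANS or revival implementation: the function name and the `backwards` parameter are hypothetical, and '#' is treated as a word-boundary marker as in the definition file above.

```python
def apply_definitions(text, rules, backwards=False):
    """Apply (source, target) rules to each word of `text`.

    '#' in a rule marks a word boundary. With `backwards=True` every rule
    is applied in the opposite direction.
    """
    if backwards:
        rules = [(target, source) for source, target in rules]
    out = []
    for word in text.split():
        marked = "#" + word + "#"  # make word boundaries explicit
        for source, target in rules:
            marked = marked.replace(source, target)
        out.append(marked.strip("#"))
    return " ".join(out)

rules = [("#con", "#9"), ("us#", "9#")]
plain = "consensus contract proconsul tempus couscous"
coded = apply_definitions(plain, rules)
print(coded)  # 9sens9 9tract proconsul temp9 cousco9
print(apply_definitions(coded, rules, backwards=True) == plain)  # True
```

The round trip works here because the input contains no '9' of its own; in general a backwards pass is only safe when the target symbols are unambiguous.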
The diagram showing the eight phases of the moon, apparently originating with Anania Shirakatsi, is so far the only source showing a 'wheel with eight curved spokes' like the VMs cosmos.
David of Trebizond, the last ruler of the country (conquered in 1461), was apparently in contact with Philip the Good, Duke of Burgundy, through his (David's) ambassador, Michael Alighieri.
The red spiral on the right, down a bit, reminds me of the VMs cosmic illustrations.
Some of the imagery in the Voynich, in particular the [link] image in the stars, is a possible reference to Halley's Comet, which appeared in 1456.
I was looking at something else and noticed that on the page [link] there is a clear 2 (two).
Undoubtedly it has been noticed before by others, but I cannot remember any discussion of it.
This 2 does not seem to be a coincidence: it appears not only in the margin but also clearly in the text, on the fourth line from the bottom.
I am very interested in your expert opinions (those who already have a solution on the text), explanations (those who already know that the text is a hoax or not), suggestions, other occurrences, etc.