In this video I will show you a hidden cartoon that appears when you overlay two pictures. There are 10 videos.
[link]
I was recently thinking about Palindromes (words that read the same backwards and forwards).
All European languages (that I know of) have single word palindromes, but this effect seems to be almost absent in the VM.
Most of them ([link]) are simple three-letter words.
The only longer palindromes seem to be unique in all cases with the exception of occo, which appears three times in the manuscript.
Here are the ones I've spotted (quite possibly I've missed some; this was only a quick count):
dydyd (f1)
seees (f3)
oeeo (f6v, f72v2, f101v2)
ykaky (f55v)
yekey (f69v)
ylaly (f73v)
lolol (f72v)
shchs (f113r)
The low number of palindromes is of course to be expected, due to the position awareness of glyphs.
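A count like this is easy to automate. Below is a minimal sketch, assuming a plain string of space-separated EVA words (the sample words, the function name and the minimum length of 4 are my own choices, not anyone's actual tooling):

```python
import re
from collections import Counter

def find_palindromes(text, min_len=4):
    """Return palindromic word types of at least min_len characters,
    together with their occurrence counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return {w: n for w, n in counts.items()
            if len(w) >= min_len and w == w[::-1]}

# Toy example; a real run would read a full transliteration file instead.
sample = "daiin lolol checkhol oeeo chedy oeeo lolol okaly"
print(find_palindromes(sample))
```

Running this over a full transliteration, with different minimum lengths, would make it easy to verify whether any palindromes were missed in a manual pass.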
It's possible that such palindromes are actually the result of misspellings. If so, they could give us concrete examples of scribal errors within the corpus, allowing us to correct them and reduce the number of erroneous words in the transcription.
For example, taken at random because it made me laugh out loud, lolol: lo appears 15 times by itself and 182 times as a word-initial, while ol appears 3052 times as a word-final and 538 times as a word by itself. But lo*ol gives only two results, lolol and lolkeol. The second word is more likely to be two words run together, as both lol and keol are common words. This suggests to me that lo and ol have well-defined functions but shouldn't be used together; the scribe made a mistake with lolol, and missed out a space in lolkeol. What mistake in lolol? Well, the prior word is checkho, which is unique. If we move the first l over, we get checkhol, which appears twice in the corpus. We now have checkhol appearing three times, followed by olol, which appears 18 times in the corpus.
So we have now removed three unique words from the corpus in a logical manner!
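The kind of pattern query used above (lo*ol) is easy to script. A minimal sketch over a word list, where '*' stands for any sequence of characters (the word list here is illustrative, not the real corpus):

```python
import re

def glob_matches(words, pattern):
    """Match words against a simple glob pattern in which '*'
    stands for any (possibly empty) sequence of characters."""
    rx = re.compile(pattern.replace("*", ".*") + r"\Z")
    return [w for w in words if rx.match(w)]

# Illustrative word list; a real run would use the full transliteration.
tokens = "lol keol lolol lolkeol checkhol olol".split()
print(glob_matches(tokens, "lo*ol"))
```

Combined with token counts, queries like this would let the same "explode and reassemble" check be run systematically over all unique words.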
No idea if we can do this with the rest of them, it's getting late and I'm tired now. Has anyone any research into this angle, or into reducing the number of unique words by seeing if they can be exploded and reassembled with adjoining words?
This is a subject that has been put forward by Nick Pelling, for instance here: [link].
NickPelling Wrote:Koen: I’ve been saying for some time that I think the next big “step up” in Voynichese study will come when some clever person finds a way to map between A patterns and B patterns, i.e. to normalize the two (errrm… actually several) parts into a single thing.
But to do this properly, you need to parse A and B, build letter contact tables for them, and then build state machine ‘grammars’ that capture how the two behave – the stuff that’s the same is probably the same, but the stuff that’s different probably involves something that was written as XXX in A being written as YYY in B. Normalizing A/B would involve being able to say “XXX == YYY”. However, this rests on the back of parsing, letter contact tables, and state machines, which (I think) steganographica tricks are disrupting. So I’m still not at all sure how we get over all the technical hurdles to get to a state where we can approach this in a rigorous enough way.
But perhaps some of these XXX == YYY equivalences can be worked out even without all that machinery. For example, I have long strongly wondered whether daiin daiin patterns in A reappear (in some way) as qotedy qokedy patterns in B. Clearly, both involve repetitive “bla-bla-bla” word sequences that are hard to reconcile with either linguistic readings or crypto theories. And given that I’ve previously speculated whether daiin daiin might be enciphering Arab numerals, it would be logical for me to speculate whether qotedy qokedy might be doing the same (but in a different way). Just a thought.
I understand that the subject is extremely complex and I doubt I can contribute much. But I think that Nick has described a promising area for further research and it could be interesting to discuss ideas and possible approaches, even if there is not much hope that we can make serious progress.
My admittedly superficial take to the problem would be to see it as some kind of optimization: find the set of N rewrite rules converting A into B (or vice-versa) so that some measure of the difference between A and B is minimized.
Even this simplistic approach poses a few questions e.g.:
* how to represent Voynichese? (as a first step, I would just experiment with a few different transliteration systems, e.g. EVA, Cuva, Currier)
* how many rewrite rules should be defined? (this is another area where one can experiment with different values for N)
* should one map A into B or vice-versa?
* is it better to compare the whole of A vs the whole of B, or to just consider the more "extreme" sections, e.g. mapping HerbalA into Bio? what to do with the intermediate Astro / Cosmo / Zodiac sections?
* how to measure the difference to be minimized? bigram histograms? word histograms? frequency of repeating word-combinations (which could address the daiin/qokedy issue mentioned by Nick)?
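As a toy illustration of this optimization idea, a candidate rule set can be scored by how much it reduces some distance between the two corpora. The sketch below uses bigram histograms and L1 distance; the mini-corpora and the rules are made up for the example and are not real results:

```python
from collections import Counter

def bigrams(text):
    """Character-bigram histogram over the words of a text."""
    return Counter(b for w in text.split() for b in zip(w, w[1:]))

def l1_distance(c1, c2):
    """L1 distance between two normalized bigram histograms."""
    n1, n2 = sum(c1.values()), sum(c2.values())
    return sum(abs(c1[k] / n1 - c2[k] / n2) for k in set(c1) | set(c2))

def apply_rewrites(text, rules):
    """Apply simple string-rewrite rules (old -> new) to a text."""
    for old, new in rules:
        text = text.replace(old, new)
    return text

# Made-up mini-corpora standing in for A and B, and one candidate rule set.
a_text = "chol chor chol daiin chor"
b_text = "chedy shedy chedy daiin shedy"
rules = [("ol", "edy"), ("or", "edy")]

before = l1_distance(bigrams(a_text), bigrams(b_text))
after = l1_distance(bigrams(apply_rewrites(a_text, rules)), bigrams(b_text))
print(before, after)  # a good rule set should reduce the distance
```

An optimizer would then search over rule sets (for a given N) to minimize this distance; word histograms or repeated-word-combination frequencies could be substituted for the bigram measure.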
Torsten recently posted ([link]) a table of words that seems to me a way to get some "feel" for what is going on. His table "lists the four most frequent 'ch/sh'-words for different sections". He describes the phenomenon as "the shift from 'chol/chor' via 'cheol/cheor', 'cheo/sheo', 'chey/shey' to 'chedy/shedy'".
I expanded on the idea, focussing on ch-words only and extracting the 30 most frequent word types in each section. I used the Zandbergen-Ladini transcription, ignoring uncertain spaces and text-only pages; I joined Astro / Cosmo / Zodiac pages into a single section. Sections are sorted from "strongly-A" to "strongly-B", as discussed by Rene at the end of this thread ([link]). For each word, I include the % of occurrences in each section.
Assuming I have not made major errors, one can see at least four different patterns:
- the two ch-words that are most frequent in HerbalA (chol, chor) have smaller and smaller frequencies as you move towards B;
- symmetrically, there are words that are rare in A and progressively more frequent in B (cheey, chckhy);
- there are words that do not appear in A and are frequent in B (chedy, chdy); this asymmetry could be useful in choosing the direction of the mapping A->B or B->A;
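The per-section percentages behind such a table are straightforward to compute. A sketch, assuming each section is available as a plain string of space-separated words (the section names and texts below are invented stand-ins, not the real data):

```python
def section_profile(sections, word):
    """For one word type, give the percentage of its occurrences
    that falls in each section."""
    counts = {name: text.split().count(word) for name, text in sections.items()}
    total = sum(counts.values())
    if total == 0:
        return counts
    return {name: 100 * n / total for name, n in counts.items()}

# Invented mini-sections standing in for HerbalA / Bio text.
sections = {
    "HerbalA": "chol chor chol chedy",
    "Bio": "chedy shedy chedy chol",
}
print(section_profile(sections, "chol"))
```

Sorting words by how their profile shifts from the A-end to the B-end of the section ordering would surface the four patterns above automatically.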
There's quite a bit of confusion going round.
Nick used the word 'parsing' correctly in his post ([link]), although there is a point to be made about transcription vs. transliteration. However, that's not my purpose here.
One can transliterate the Voynich MS text using any system, be it Eva, Currier, v101 etc. The result is a text file.
It is this text file that we want to parse as part of statistical analysis.
So:
- step 1 is to create a text file using some definition;
- step 2 is parsing this text file with the aim to figure out which are the real 'units' of the Voynich MS text.
For this second step I used to rely on BITRANS, a tool made in the 1990s by Jacques Guy. It was only available as a DOS command-line tool, though it still worked in early versions of Windows. It seems to be dysfunctional now, due to the evolution of Windows, and I have never seen a Unix/Linux version.
Dennis Stallings' page pointing to a download is here: [link].
This tool was perfectly suited for the parsing task. It was used extensively in the definition of the Eva alphabet, and in the creation of the interlinear file by Gabriel Landini and Jorge Stolfi.
Just to illustrate, the example given by Nick in the above-mentioned post:
Quote:Task #2: Parsing the raw transcription to determine the fundamental units (its tokens) e.g. [qo][k][ee][dy]
is easily done by defining substitution rules. BITRANS then allows these rules to be applied back and forth.
It also allows one to define context-dependent rules, for example at the start or end of words.
Now this tool seems lost, but I have been making good progress with a revival implementation. It does not allow multi-pass parsing, but it does support most other features that I used to find important.
The context-dependent rules are not needed when converting between different transliteration alphabets, but they are likely to play a role when parsing the results for interpretation of the text.
Just to give a simple example, the following definition file changes:
- 'con' at word starts into '9'
- 'us' at word ends into '9'.

#con #9
us# 9#
It changes this text: consensus contract proconsul tempus couscous
into this: 9sens9 9tract proconsul temp9 cousco9
By specifying the 'backwards' option, the same command with the same definition file changes the second text back into the first.
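A minimal re-implementation of this behaviour might look as follows. This is my own sketch, not BITRANS or its actual revival; the '#' word-boundary convention in the rules follows the example above:

```python
def apply_rules(text, rules, backwards=False):
    """Apply BITRANS-style substitution rules to each word of a text.
    A leading '#' in a rule anchors it to the start of a word, a
    trailing '#' to the end. With backwards=True, each rule is applied
    in the reverse direction."""
    out = []
    for word in text.split():
        for src, dst in rules:
            if backwards:
                src, dst = dst, src
            if src.startswith("#") and word.startswith(src[1:]):
                # Word-start rule: replace the matched prefix.
                word = dst.lstrip("#") + word[len(src) - 1:]
            elif src.endswith("#") and word.endswith(src[:-1]):
                # Word-end rule: replace the matched suffix.
                word = word[:len(word) - len(src) + 1] + dst.rstrip("#")
        out.append(word)
    return " ".join(out)

rules = [("#con", "#9"), ("us#", "9#")]
text = "consensus contract proconsul tempus couscous"
encoded = apply_rules(text, rules)
decoded = apply_rules(encoded, rules, backwards=True)
print(encoded)   # 9sens9 9tract proconsul temp9 cousco9
print(decoded)   # consensus contract proconsul tempus couscous
```

With context-free rules added alongside the anchored ones, the same mechanism covers conversion between transliteration alphabets as well as the parsing task.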
The diagram showing the eight phases of the moon, apparently originating with Anania Shirakatsi, is - so far - the only source showing a 'wheel with eight curved spokes' like the VMs cosmos.
[link]
David of Trebizond, the last ruler of that country (conquered in 1461), was apparently in contact with Philip the Good, Duke of Burgundy, through his (David's) ambassador, Michael Alighieri.
[link]
The red spiral on the right, down a bit, reminds me of the VMs cosmic illustrations.
[link]
Some of the imagery in the Voynich, in particular one image in the stars ([link]), is a possible reference to Halley's Comet, which appeared in 1456.
I was looking at something else and noticed that on this page ([link]) there is a clear 2 (two).
Undoubtedly it has been noticed before by others, but I cannot remember any discussion of it.
This 2 seems not to be a coincidence: it is not only in the margin, it also appears clearly in the text, on the 4th line from the bottom.
I am very interested in your expert opinions (those who already have a solution to the text), explanations (those who already know whether the text is a hoax or not), suggestions, other occurrences, etc.
[link]
[link]
It is known that Eva-f and Eva-p appear most commonly on top lines of paragraphs, but how many exceptions are there?
Using the capabilities mentioned in an earlier post ([link]), it was easy to check.
I used my own transcription (version 1c), and removed uncertain spaces. For alternative readings I selected the first.
Any rare characters that look (a bit) like f and p were ignored, but cPh and cFh were included.
Beside normal paragraph text, these characters also appear in labels, and to some extent in text in circles. The whole text was therefore grouped into three categories:
- First lines of paragraphs
- All other lines of paragraphs
- All other text
The entire text is represented by 5389 text items or 'loci'. Of these, there are:
723 first lines of paragraphs
3407 other paragraph lines
1259 other types
These loci have different lengths, so the statistics are based on word tokens. I attach a screenshot of an Excel file:
The column 'all' gives the count of all word tokens, and the next two the count of tokens including at least one f or one p.
The row 'P-1' gives the first lines of paragraphs and 'P-n' all other paragraph lines. The row 'Other' gives all other loci.
Overall, there are about 3 times as many p-words as f-words. Their distribution in the non-paragraph text is rather similar to that in the overall text of the MS, but in the paragraph text the known behaviour is quite pronounced: on first lines the percentage is more than a factor of 10 higher.
However, the exceptions are not rare: there are almost 400 occurrences on later lines of paragraphs, which again requires an explanation. These are almost certainly not just mistakes; it is not as if their appearance there were 'forbidden'.
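The tallying step behind these numbers is simple once each locus is tagged with its category. A rough sketch, assuming loci are available as (category, text) pairs; the category labels follow the table above, but the sample lines are invented for illustration:

```python
from collections import defaultdict

def fp_stats(loci):
    """Per category: total word tokens, and tokens containing
    at least one f or at least one p."""
    stats = defaultdict(lambda: {"all": 0, "f": 0, "p": 0})
    for category, text in loci:
        for token in text.split():
            stats[category]["all"] += 1
            if "f" in token:
                stats[category]["f"] += 1
            if "p" in token:
                stats[category]["p"] += 1
    return dict(stats)

# Invented sample loci; a real run would iterate over the transcription.
loci = [
    ("P-1", "pchedy fchor daiin"),
    ("P-n", "chedy daiin okaly"),
    ("Other", "otar daiin"),
]
print(fp_stats(loci))
```

Dividing the f and p counts by the 'all' column per category gives the percentages discussed above; splitting the loci by Currier language would answer the A-vs-B question in the same pass.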
How this compares between A and B languages is a next step.
Some time ago, I defined a format similar to the interlinear file format, but which can also host the GC transliteration.
I represented most common historical transliterations in this format. I also updated my own tool to process such files.
However, there were some things that were still not easy to do.
Historically, people recorded the ends of paragraphs in these files, but it is of interest to be able to do separate statistics for the first lines of paragraphs. I decided to introduce a new dedicated comment for this, and added it to my own transcription file.
All links can be found on these two pages:
- [link]
- [link]
Here are the most relevant ones:
- [link]
- [link]
- [link]
With this, I could finally check the real preference of p and f on top lines of paragraphs quite easily. I will post about that next. There is another quite interesting area of statistics that is now possible, which will take a bit more time.
I recently read an obituary for James Robert Child ([link]). It said that he had been working on the VMS for decades (as a linguist).
Quote:He spent decades, in particular, analyzing and translating a 15th century handwritten and illustrated codex known as the Voynich manuscript, working with a colleague into his 90s to create and update a website.
I didn't know about this work. Now I see his website for the first time. The address is:
[link]