I'd like some help from other members to clarify a point for the introduction to a book of essays.
I was unaware until fairly recently that the Friedman group had formed a general opinion that plants in the manuscript's botanical section were formed as composites.
Having discovered the fact independently, presented an explanation of the way in which the pictures are structured and of the system informing the 'pictorial annotations' at the roots, and identified about forty folios' worth, I was later informed that the general idea had been stated in d'Imperio's book.
So my question is this.
Between 1912 and the publication of Mary's book in 1978
and
(separately)
between 1978 and when I began publishing my own work (from 2010)
had anyone ever looked into that issue, or defined any of the plants as composites, or explained any folio in that way?
Otherwise it looks as if d'Imperio's general statement was ignored for the first thirty years, and now that an independent investigation has provided explanation and demonstration, that too is being ignored, while the general statement is increasingly repeated...
Is that the case, or is there a precedent body of research that I've overlooked? Please rack your memories.
The predictability of glyph placement within label vords is in concordance with that of vords in the main corpus
Explanation
Some vords appear as "labels", single or double vords apparently identifying images within the manuscript. These labels have the same grammar as those vords in the main body of the corpus.
The text of the manuscript is divided up into clearly defined word-like glyph groups (dubbed vords on this forum). These glyph groups have a non-trivial internal structure which is manifest in the severe restrictions imposed upon the positioning of glyphs within the word groups.
In other words, Voynichese has a very strict phonotactic structure – morphemes appear in predefined places within vords, and only there.
A morpheme is the smallest grammatical unit in a language.
Morphemes in the corpus are easily identifiable. Voynichese glyph combinations are very positionally aware within vords – glyph groups are non-trivial in their internal positioning. We can identify, and have identified, a long list of suffixes and prefixes within Voynichese. We know that certain glyphs only appear as suffixes; we know that certain glyphs only appear as prefixes; and we know that other glyphs are free-form. We have also identified (via the CLS theorem) that glyphs appear in a certain pattern.
We assume these are bound morphemes because they obey certain rules of positioning. (We can make no assumptions about vords that do not include such bound morphemes, as we are unable to identify a meaning for such unbound morphemes, but such vords are relatively few in number.)
And analysis of the labels (see links below) shows that the corpus of labels has a notable level of concordance with the morpheme placement of vords in the main corpus.
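For anyone who wants to check this kind of concordance themselves, here is a minimal sketch of one way to compare glyph placement in labels against the main text. It is not the method used in any of the studies linked below. It assumes each EVA character is one glyph (which is debatable), and the two short word lists are made-up samples, not real label or corpus data.

```python
from collections import Counter, defaultdict

def positional_profile(vords):
    """For each glyph, count how often it is vord-initial, medial or final."""
    profile = defaultdict(Counter)
    for vord in vords:
        last = len(vord) - 1
        for i, glyph in enumerate(vord):
            if i == 0:
                slot = "initial"
            elif i == last:
                slot = "final"
            else:
                slot = "medial"
            profile[glyph][slot] += 1
    return profile

def slot_share(profile, glyph, slot):
    """Fraction of a glyph's occurrences that fall in the given slot."""
    total = sum(profile[glyph].values())
    return profile[glyph][slot] / total if total else 0.0

# Hypothetical samples, for illustration only.
main_text = ["daiin", "chol", "qokol", "chor", "dam"]
labels = ["otal", "oky", "chom"]

main_profile = positional_profile(main_text)
label_profile = positional_profile(labels)

# Compare, glyph by glyph, how often each one sits in initial position.
for glyph in sorted(set(main_profile) & set(label_profile)):
    print(glyph,
          round(slot_share(main_profile, glyph, "initial"), 2),
          round(slot_share(label_profile, glyph, "initial"), 2))
```

With a full transcription, similar shares in both columns would be consistent with the statement above; large differences would count against it.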
Further reading
[link] MarcoP on Voynich.Ninja.
Quote:Summary: Marco found that almost 70% of all labels matched words in the main corpus. The rest were unique.
VMS language DNA variations. Davidsch
Quote:My research shows visually that the labels, as defined,
follow the same rules for the letters in the remainder of the text that are not labels, with some exceptions:
'a' occurs proportionally more in the "label text"
the 'q' (only posA) occurs much lesser in the "label text"
the 'h' occurs much lesser in the "label text"
the 't' on posB is higher in the "label text"
You can check by [link] "CAB NST" & "CAB labels only".
[link]. Prof. Stolfi
Stolfi notes [link], when attempting to create a "grammar" for Voynichese, that (italics mine):
Quote:It should be noted that normal words [in his attempt to create a grammar] account for over 88% of all label tokens, and over 96.5% of all the tokens (word instances) in the text. The exceptions (less than 4 every 100 text words) can be ascribed to several causes, including physical "noise" and transcription errors. (Different people transcribing the same page often disagree on their reading, with roughly that same frequency.) Indeed, most "abnormal" words are still quite similar to normal words, as discussed in a [link].
[..]
The words that do not fit into our paradigm [..] These words comprise 1295 tokens (3.7%) in the main text, and 127 tokens (12.4%) in the labels. The vast majority are rare words that occur only once in the whole manuscript.
The [link] by Brian Cham and David Jackson describes how Voynich glyphs can be divided into three categories that interact with one another in a pre-defined manner.
Notes
Statement changed from "The morpheme construction of labels is in concordance with that of the main corpus"
Since the Poll thread has closed, I've started this thread to reply to the last comment made by Anton:
Quote:Anton:
After ...seven?.... years, nobody has yet been able to give me a concise professional definition of vellum
I thought I might pass on the British Library's concise definitions - accepting that customs differ in different countries and in different languages. In fact, our habit is to refer more generally to 'membrane' unless specifically comparing that in one manuscript to another to aid provenancing - e.g. the French pocket bibles' vellum with that in the Vms.
Brit. Lib:
Quote:PARCHMENT:
A writing support material that derives its name from Pergamon (Bergama in modern Turkey), an early production centre. The term is often used generically to denote animal skin prepared to receive writing, although it is more correctly applied only to sheep and goat skin.
Quote:VELLUM: the term vellum reserved for calfskin.
and UTERINE VELLUM:
Uterine vellum, the skin of stillborn or very young calves, is characterized by its small size and particularly fine, white appearance; however, it was rarely used.
Here again, customs differ. Our practice is to never write "calfskin" in that way; it is reserved - in our practice - for describing leather - e.g. a bag, a pair of shoes or book-binding is "calfskin" but membrane used for a manuscript's bifolia is "vellum" or "calf-skin".
I notice that Helmut suggests that in his practice the term for 'vellum' has a much more restricted application than is found elsewhere - the British Library's descriptions being sufficient example.
I've recently repurposed my genetic algorithm code to use EVA rather than Voyn_101. The GA seems to do better with EVA, and I'd like to report an interesting result using Latin as a base language for [link] (a folio I picked at random).
The way this works is that the GA reads in the EVA transcription for the given folio(s), line by line and word by word, and as it does so it creates frequency tables of all the ngrams it finds. Right now it uses ngrams up to 3 glyphs long.
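To make the n-gram step concrete, here is a minimal sketch of how such frequency tables could be built. It is an illustration, not the actual GA code. It assumes words are separated by whitespace (real EVA files often use other separators) and that n-grams never cross word boundaries; the function name and the two sample lines are only for the example.

```python
from collections import Counter

def ngram_counts(lines, max_n=3):
    """Count every n-gram of length 1..max_n inside each word of each line."""
    counts = Counter()
    for line in lines:
        for word in line.split():
            for n in range(1, max_n + 1):
                for i in range(len(word) - n + 1):
                    counts[word[i:i + n]] += 1
    return counts

# Two short sample lines, for illustration only.
sample = ["ycheor chor dam", "ochor qocheor chol"]
table = ngram_counts(sample)

# The most frequent n-grams are the natural first candidates for pairing.
print(table.most_common(5))
```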
It then reads in a very large Latin word list, to use as a validation dictionary.
It then prepares a set of Latin letters, nulls and scribal abbreviations, currently numbering around 60 items in total.
Then it randomly pairs each EVA ngram with one of the Latin letters, nulls or abbreviations, and using that pairing (called a chromosome in the jargon), applies it to all lines and words in the EVA, so as to produce new words in plaintext. Each plaintext word is checked for validity in the Latin dictionary, and scored. If the word is valid, it gets a high score. If the word is long, it gets a higher score. All the word scores are summed. If a consecutive sequence of valid Latin words appears, that causes the overall score of the chromosome to increase according to the length of the sequence. The idea here is to reward chromosomes that produce sequences of valid, long Latin words.
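The scoring idea can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the real implementation: words are segmented by greedy longest match against the chromosome, a valid word scores its length, and each word in a run of valid words earns a bonus that grows with the run. The actual weights are not given above, and the toy chromosome and dictionary are invented for the example.

```python
def decode_word(word, chromosome, max_n=3):
    """Greedy longest-match substitution; unmapped glyphs are kept as-is."""
    out, i = [], 0
    while i < len(word):
        for n in range(min(max_n, len(word) - i), 0, -1):
            chunk = word[i:i + n]
            if chunk in chromosome:
                out.append(chromosome[chunk])
                i += n
                break
        else:
            # No n-gram starting here is in the chromosome.
            out.append(word[i])
            i += 1
    return "".join(out)

def score_chromosome(lines, chromosome, dictionary):
    """Sum word scores, rewarding long valid words and runs of valid words."""
    total = 0
    for line in lines:
        run = 0
        for word in line.split():
            plain = decode_word(word, chromosome)
            if plain in dictionary:
                run += 1
                total += len(plain)  # longer valid words score higher
                total += run - 1     # bonus grows with the current run
            else:
                run = 0
    return total

# Invented toy pairing and dictionary (a null would map to "").
toy_chromosome = {"ch": "ra", "o": "t", "r": "um", "d": "c", "a": "u", "m": "m"}
toy_dictionary = {"ratum", "cum"}
toy_score = score_chromosome(["chor dam"], toy_chromosome, toy_dictionary)
```

In a GA, a function like `score_chromosome` would serve as the fitness function that selection uses to rank chromosomes between generations.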
This random process continues over many pairings/chromosomes and many generations, using selection between each generation to refine the pairings (I'll spare you the details!).
Here are details for one of the better results (with a score of over 22000):
A) The list of letters, nulls and abbreviations used is as follows:
B) The best chromosome, pairing VM glyphs with the Latin items in A), includes the following:
a = r
8 = t
c = re
h = ur
o = er
y = tum
s = u
k = est
9 = um
8a = c
co = m
ii = <null>
4o = in
(The remaining pairs are omitted for brevity.)
I found the 9 = um equivalence that the GA discovered to be striking (Brumbaugh claimed this equivalence in his solution), but I suppose it's sort of obvious.
C) The best pairing produces the following plaintext words on f3r:
(EVA words -> GA plaintext)
ycheor chor dam qotcham cham -> umterim ratum cum inque cercis
ochor qocheor chol daiin cthy -> erratum interim ratis da carum
schey chor chal cham cham cho -> uterum ratum certis cercis cercis ra
qokol chololy s cham cthol -> ius ratisusum u cercis carus
ychtaiin chor cthom otal dam -> umturestcarum ratum caro prtis cum
otchol qodaiin chom shom damo -> pratis inda racis iscis cumer
ysheor chor chol oky damo -> umsim ratum ratis coum cumer
I expect the Latin above makes no sense at all, but the "look and feel" of the word lengths and the vocabulary size I find encouraging.
I'd welcome suggestions of Latin abbreviations, prefixes and suffixes that I could include in (or remove from) the list in A) above (which I gleaned mostly from d'Imperio's summary of Cappelli).
Monas Hieroglyphica [link] MS-408 Key
Well, this may come as a shocker to all of you, for I have found the key to read proper sentences in the Voynich Manuscript. Theorem I of John Dee’s book commences with the word “Per”, and I decoded 11 word tokens of the Voynich in a row for folio 1r using my cipher from his book. Take a look! The word “At” breaks down to the next theorem in the following paragraph at the next Voynich sentence.
With further research into [link] I found a 15 digit sequence which runs counterclockwise. This is not an anomaly and it was purposefully done by the author. The numbers are fifteen digits of Pi (i.e. 3.14159265358979). The inner circle contains the first six digits of Pi while the outer contains the next nine digits of Pi which follow in sequence.
And, with the corrections that JKP noticed, this was the outcome for the first six digits of Pi
I know that we have a [link] on gallows characters but I have a theory about benched gallows (cKh, cTh, cPh, cFh) that has statistical support.
In my [link] I showed that <qo> is followed by <k> or <t> 80% of the time in the VMS corpus (4 out of 5 times). It is followed by <l> only 5% of the time, <e> 2% of the time, and <ch> less than 1% of the time. So, there is a clear pattern: <qo> comes in front of a gallows character.
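For anyone who wants to reproduce counts like these on their own transcription, here is a rough sketch. It is not the script behind the figures above. It assumes <qo> is only counted at the start of a vord, and that the EVA benches and benched gallows (ch, sh, ckh, cth, cph, cfh) are treated as single units; the word list is a tiny made-up sample.

```python
from collections import Counter

# Multi-character EVA units, checked before falling back to a single glyph.
UNITS = ("ckh", "cth", "cph", "cfh", "ch", "sh")

def successors_of(prefix, vords):
    """Count the unit that immediately follows a vord-initial prefix."""
    counts = Counter()
    for vord in vords:
        if vord.startswith(prefix) and len(vord) > len(prefix):
            rest = vord[len(prefix):]
            nxt = next((u for u in UNITS if rest.startswith(u)), rest[0])
            counts[nxt] += 1
    return counts

def shares(counts):
    """Convert raw counts into fractions of the total."""
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}

# Made-up sample, for illustration only.
sample = ["qokol", "qotcham", "qodaiin", "qocheor", "qockhy", "chor"]
qo_counts = successors_of("qo", sample)
print(shares(qo_counts))
```

Treating the benched gallows as units keeps <qockh> from being miscounted as <qo> followed by <c>, which matters for the options discussed here.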
There are a very small number of times where <qo> comes before a benched gallows like [cKh] or [cTh]. So there are three options for interpretation:
1) The benched gallows can be deconstructed as <qokch> or <qotch>, which makes statistical sense (qo before gallows is normal).
2) The benched gallows can be deconstructed as <qochk, qocht> or <qoeke, qoete>, but all of these combinations together only appear 9 times in the VMS.
3) The benched gallows are independent sounds, but they rarely follow <qo> (about 1.7% of the time, or less than 1-in-50 times, that <qo> is used).
EDIT: Anton proposed a fourth system [link] (which may or may not work in conjunction with #1 or #2)
The first option is the most statistically likely, and in addition, there are plenty of times where a benched gallows and its [gallows+ch] equivalent are interchangeable, which also lends support to option 1:
So, I would conclude that the benched gallows are really [gallows followed by ch], but I could be wrong and I'm open to counter-arguments.