All this linguistic talk has made me think of a thought experiment.
For this thread, I am going to postulate the following:
- Voynichese can be pronounced
- What is more, Voynichese is a created script that was first spoken then written
- This is an Indo-European daughter language (an artificial constraint I include simply to reduce the possibilities. If we don't get anywhere, we change this point and start again)
If this is true, then can we take Bax's method and apply it from a different angle - instead of trying to identify nouns, we try to identify function words?
Quote: Function words (also called functors) are words that have little lexical meaning or have ambiguous meaning, and they express grammatical relationships with other words within a sentence, or specify the attitude or mood of the speaker. They signal the structural relationships that words have to one another and are the glue that holds sentences together. Thus, they serve as important elements to the structures of sentences. //Wikipedia
Now, function words have an almost universal tendency to be short. English has the "three letter rule", in which function words generally have fewer than three letters (I, am, is, of) and content words have three or more. The Cervantes Institute [url=https://cvc.cervantes.es/ensenanza/biblioteca_ele/aepe/pdf/boletin_34-35_18_86/boletin_34-35_18_86_24.pdf]notes[/url] that something similar happens in Spanish (a, con, para, de, por, although exceptions such as parecer and luego exist). The same effect exists in many different languages.
Why? Without delving into the theory, language erosion. Function words are very common and people have the tendency to shorten them over time, to expend less effort.
In "The naïve language expert", Drs Claudia Männel & Jutta L Mueller note that many European languages are functor initial, i.e. the function word comes first and the sentence extends from it (I am going to Rome; die Amaryllis blüht auf [the amaryllis bursts into bloom]), etc.
To cut a long conversation short: can we identify functors, short words that appear to be giving function to the following words?
Let us take a lexical page, i.e. 104r, from the Voynich extractor, transcription H (T.T.).
I identify all words of about three glyphs or fewer, counting ch and sh as single glyphs. I discard any words with minims in, so aiin is arbitrarily discarded.
In the list below, // indicates that longer words appear in between, and - indicates a line with no potential functors.
Char chey
oar // chey
oky
ol chry
ol.chl.ol // al.lod // qod
-
-
-
dam
tol // os.l.air. shdy
lo.sar.al.
Chdy
cheo.lor.saiin.// lo
sor
-
-
-
or
shd
chol.rar.///ar.ai!n.ar.
-
rl.shed // dam
ol.sheo // chol.// chol.// aiir.chol.kar.
Char
sar.// l. // ar
shar // kar
.dl.ral.// ar.
or.// or.char.// sor.or.aiin.
or.air
-
-
-
char.// chey.
-
chey.chol.cheol.
-
-
-
ar.// chol.
al.ly
dar
y.
dal
ar.
l.s. // ar.
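For anyone who wants to repeat the filtering on another page, here is a minimal sketch of how it could be automated. It is not what I ran by hand: the cutoff of three glyphs, the choice to count only ch and sh as single glyphs, and the rule that "ii" marks a minim run are all assumptions, and the function names are made up for illustration. It expects a raw transcription line with words separated by dots or spaces.

```python
import re

# Assumption: only "ch" and "sh" are treated as single glyphs; every other
# EVA character counts as one glyph. Other conventions are possible.
DIGRAPHS = ("ch", "sh")


def glyph_count(word):
    """Count glyphs in an EVA word, treating ch/sh as one glyph each."""
    w = word.lower()
    for digraph in DIGRAPHS:
        w = w.replace(digraph, "#")
    return len(w)


def has_minims(word):
    """Assumption: a run of two or more 'i' (as in aiin) is a minim run."""
    return "ii" in word.lower()


def candidate_functors(line, max_glyphs=3):
    """Return the short, minim-free words of one transcription line, in order."""
    words = [w for w in re.split(r"[.\s]+", line) if w]
    return [w for w in words
            if glyph_count(w) <= max_glyphs and not has_minims(w)]


line = "olcheear chedar or aror!sheey olkeechy or char cheeol sor or aiin ot!am"
print(candidate_functors(line))  # ['or', 'or', 'char', 'sor', 'or']
```

Changing max_glyphs or the digraph list is an easy way to see how sensitive the candidate list is to these arbitrary choices.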
And here are all the extracted functors, ranked by frequency of appearance:
chol chol chol chol chol cheol
chey chey chey chey
Char char Char char
ar ar ar ar
ol ol ol ol
al al al
or or or
os or or
l l l
ar ar ar
sar rar sar
dam dam
air air
lo lo
cheo
Chdy
chol
shdy
shd
shar
sheo
sor
oar
oky
chry
chl
lod
qod
dal
dl
dar
y
tol
kar
kar
ly
sor
ral
rl
shed
lor
saiin
s
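The ranking itself is just a tally. As a rough sketch (the function name is hypothetical, and lower-casing everything so that Char and char count together is my assumption), the per-line lists can be counted like this, ignoring the // and - markers:

```python
import re
from collections import Counter


def rank_functors(lines):
    """Tally candidate functors across lines, most frequent first."""
    counts = Counter()
    for line in lines:
        # Split on dots, whitespace and the '/' of the '//' gap marker.
        for token in re.split(r"[.\s/]+", line):
            token = token.strip("-").lower()  # drop the 'no functor' marker
            if token:
                counts[token] += 1
    return counts.most_common()


# First few lines of the extraction, copied from the list above.
sample = ["Char chey", "oar // chey", "oky", "ol chry",
          "ol.chl.ol // al.lod // qod", "-"]
print(rank_functors(sample)[:2])  # [('ol', 3), ('chey', 2)]
```

Words with equal counts come out in the order they were first seen, so the order within a tie carries no meaning.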
The most popular are all variants of one another:
chol / chey / char
ar / ol / al / or / os / sar
So what are these words? Now we move into the world of fantasy.
The most common words in European languages tend to be short indication words. Here are the figures for English:
6.18% the
4.23% is, was, be, are, ’s (= is), were, been, being, ‘re, ‘m, am
2.94% of
2.68% and
2.46% a, an
1.80% in, inside (preposition)
1.62% to (infinitive verb marker)
1.37% have, has, have, ‘ve, ’s (= has), had, having, ‘d (= had)
1.27% he, him, his
1.25% it, its
1.17% I, me, my
0.91% to (preposition)
0.86% they, them, their
And other European languages that I've quickly looked up are basically the same (although Romance languages have pronouns and prepositions up at the top as well).
Now, 104r contains 438 words. The most common word, chol, appears 6 times (1.37%), which is way below the 6.18% that English "the" reaches. But we don't know what this page is about. A dry medical text will not contain many such indication words, and it seems obvious that the text does not run in a "we take it and we dry it and we pound it and we stick it" format.
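To make the comparison concrete, here is the arithmetic as a small sketch. The only inputs are the counts given above (6 occurrences in 438 words) and the 6.18% figure for English "the" from the list; the function name is just for illustration.

```python
def relative_frequency(count, total):
    """Share of the text taken up by one word, as a percentage."""
    return 100.0 * count / total


chol_pct = relative_frequency(6, 438)
english_the_pct = 6.18  # from the English frequency list above

print(f"chol: {chol_pct:.2f}%")                                   # chol: 1.37%
print(f"'the' is {english_the_pct / chol_pct:.1f}x more common")  # 'the' is 4.5x more common
```

So chol sits closer to the English personal pronouns in that list (1.17% to 1.27%) than to the article, for whatever a single page is worth.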
The words tend to be clustered in groups upon the page. If I were to guess, it is almost as if a word crops up and is repeated several times within the same topic. Or it appears three times in the same line and is also a suffix of other words in that line:
olcheear chedar or aror!sheey olkeechy or char cheeol sor or aiin ot!am
Let's think of a different angle of attack. Can we fit these proposed functors as suffixes? In other words, are they functor final, acting upon a stem? Could the stem be a verb with the functor acting upon it?
Well:
chol, which appears six times by itself, merges with 8 other words:
cholfor okechol chol!cham pchol cholxy qokchol pcholor cholkar
chey appears four times by itself and a further five times as part of a larger word.
The same counts for the others (by itself / inside a larger word): dam 2 / 5; char 4 / 7; sar 2 / 2; air 2 / 11.
The two glyph words are really common as part of larger vords, but this can partially be discarded simply because we know they are common.
ar appears 57 times; or 65; etc.
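The standalone versus embedded counts can be sketched as below. This is only an illustration with a made-up function name: it counts words that contain the candidate anywhere (not only as a suffix), and it counts each containing word once even if the candidate occurs in it twice.

```python
def standalone_and_embedded(words, candidate):
    """Return (times the candidate is a whole word, words that merely contain it)."""
    candidate = candidate.lower()
    lowered = [w.lower() for w in words]
    standalone = sum(1 for w in lowered if w == candidate)
    embedded = sum(1 for w in lowered if w != candidate and candidate in w)
    return standalone, embedded


# The eight merged forms quoted above, plus the six standalone occurrences.
words = ("cholfor okechol chol!cham pchol cholxy qokchol pcholor cholkar".split()
         + ["chol"] * 6)
print(standalone_and_embedded(words, "chol"))  # (6, 8)
```

A plain substring test like this is exactly why the two-glyph counts are inflated: "ar" also matches inside char, sar, kar and so on, so a fairer version would have to compare against how often the string would be expected to turn up by chance.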
Sadly I have to cut this experiment short here for time reasons - I'll post it here to see if anyone has any feedback. Always a dangerous thing to do!