(16-02-2021, 05:48 PM)Koen G Wrote: I am not sure to what extent this impacts our supposed ability to detect function words. Maybe the consistency of function words in a high-TTR text should make them easier to spot, if anything?
Hi Koen,
I think the only contribution to detection, if any, is setting a quantitative goal for token counts. It would be more helpful if there were a correlation with the number of function words among high-frequency word types, e.g. the counts at the bottom of the "top 20" table in the previous post. But this is not the case: the correlation is not significant (.21), with a flat regression line.
[attachment=5299]
The [link] puts forward some interesting ideas. High-frequency types are more likely to be function words, but of course a few content words will also be frequent in most texts. One can compare different texts in the same language and only select as function words those types that are very frequent in most texts: only the frequent content words depend on the subject of the text.
For instance, compare the function words that are common to the English Genesis and the Grete Herball: six function words appear in the top 10 types in both texts (the, and, of, it, in, that). On the other hand, the two sets of frequent content words are completely disjoint.
This sounds good for the VMS, since the illustrations suggest that, say, the subject of Quire 13 is not identical to that of the Herbal. But of course the very basic problem is that, because of the differences in "dialects", there is very little overlap between the lexicon in different sections.
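The cross-text comparison described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each text is already split into a list of word tokens, and the function names (`top_types`, `shared_frequent_types`) and the toy sentences are made up for the example, not taken from the linked source.

```python
from collections import Counter

def top_types(tokens, n=10):
    """Return the n most frequent word types in a list of tokens."""
    return [word for word, _ in Counter(tokens).most_common(n)]

def shared_frequent_types(texts, n=10, min_texts=None):
    """Return the types found in the top-n list of at least min_texts texts.

    By default a type must be frequent in every text (hypothetical criterion).
    """
    if min_texts is None:
        min_texts = len(texts)
    seen = Counter()
    for tokens in texts:
        # Count each type once per text, however frequent it is there.
        seen.update(set(top_types(tokens, n)))
    return {word for word, count in seen.items() if count >= min_texts}

# Toy stand-ins for two texts on different subjects.
text_a = "the cat and the dog and the bird".split()
text_b = "the rose and the lily and the oak".split()

candidates = shared_frequent_types([text_a, text_b], n=2)
print(sorted(candidates))
```

Types that are frequent in only one text (here the animal or plant names, if `n` is raised) would be treated as subject-dependent content words. With Voynichese sections the shared set may turn out very small, for the "dialect" reason mentioned above.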
Greetings from Singapore.
I've known of the VMS's existence for over a decade, but it was only early last month that I decided to crack my head over these codes in my spare time. The first thing I did was to spend time studying the pages and the characteristics of its words and 'alphabet' (instead of reading forums and other websites) so as not to get influenced. Only after a week did I read some of JKP's webpages and register myself in this forum. (Ya...you guessed it - it has since sucked me in so much that I'm even creating my own transliteration and Voynich fonts!)
One thing I noticed was that numerals are missing from the text, and side-by-side repeated words happen now and then, more so if we count in near-similar words. This reminds me of the Malay language, with its complicated system of prefixes, suffixes, infixes and circumfixes, and its use of repeated words for plural nouns. For example:
Malay: Saya telah berjalan dan menjalankan jalan-jalan ini sejak saya belajar berjalan.
English: I've been walking and running roads these since I learned to walk.
You can use Google Translate to get this same result. "Jalan" could mean 'walk' (verb) or 'road' or 'street' (noun). It gets repeated for plural nouns. And 'these roads' becomes "jalan-jalan ini" (where the modifier 'these' gets placed behind the noun).
Thus, it is possible repeated words are plural forms. Just sharing ....
Cheers!
MS Cheo.
(That's my name...any resemblances to Voynichese is pure coincidence) 
Thank you, mscheo!
What you write seems interesting to me, but I am not sure it is much related to "function words" in particular. I suggest moving your post to another thread, e.g. [link].
Exact reduplication like 'jalan-jalan' also happens in the VMS (e.g. 'daiin daiin'). It involves almost 1% of consecutive word pairs.
But partial reduplication (what I call quasi-reduplication) is about twice as frequent. This can be defined as consecutive words that only differ by one glyph. This is an EVA example with two occurrences:
<f103r.50,+P0> ssheey.l,shey.qol.cheey.chey.qokeey.okeey.qokain.cheey.qotain
Do you think that anything similar also occurs in Malay? I.e. that consecutive words can systematically be very similar, differing by a single character?
Also, do you know of any old Malay text that features frequent reduplication and is available as a downloadable file somewhere?
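For anyone who wants to count these patterns, here is a minimal Python sketch of exact and quasi-reduplication. It makes two assumptions that go beyond the definition given above: each EVA character is treated as one glyph (a simplification, since some glyphs are written with several EVA characters), and "differ by one glyph" is taken to mean one substitution, insertion or deletion. The function names are made up for the example.

```python
def differs_by_one(a, b):
    """True if a and b differ by exactly one substitution, insertion or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter word (or equal length)
    # Skip the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # One substitution: the rest after the mismatch must be identical.
        return a[i + 1:] == b[i + 1:]
    # One insertion in b: skipping b[i] must make the tails identical.
    return a[i:] == b[i + 1:]

def count_reduplication(words):
    """Return (exact, quasi, total) counts over consecutive word pairs."""
    pairs = list(zip(words, words[1:]))
    exact = sum(a == b for a, b in pairs)
    quasi = sum(differs_by_one(a, b) for a, b in pairs)
    return exact, quasi, len(pairs)

# Part of the f103r line quoted above, split on the EVA word separator ".".
line = "qol.cheey.chey.qokeey.okeey.qokain.cheey.qotain"
print(count_reduplication(line.split(".")))  # (0, 2, 7)
```

The two quasi-reduplicated pairs found are cheey/chey and qokeey/okeey. The same function could be run on a Malay text split into words, with letters in place of glyphs, to compare the rates.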
Hi Marco,
The Malay language "switched" over from Jawi to the Latin alphabet in the last two or three centuries, during the "spice trade era", and literacy was extremely low then (in Southeast Asia). But I guess that, phonetically, its structures and sounds should have remained intact.
I studied Malay as my second language for 2 years in my early elementary school years decades ago before switching over to Chinese.
For a start, you can get a little understanding of its grammar structure here: [link]
And I am sure googling around would give you some ancient Malay texts (probably in Jawi though), like here: [link]
And oh, yeah...please move my post to whichever thread is more appropriate... I am still learning and exploring this forum.
Cheers!
MS Cheo.