Hi Crwiin,
this idea is discussed extensively in the recent paper by Bowern and Lindemann,
"The Linguistics of the Voynich Manuscript". If you are seriously interested in understanding more of Voynich research, I strongly suggest you read that paper. The drawback for a newcomer may be that it is long and densely packed with information.
Here is my personal point of view about what the paper says. If you take the time to read it with care and form your own opinion, you will be much better off.
Bowern and Lindemann conclude that
"the character level metrics show Voynich to be unusual, while the word and line level metrics show it to be regular natural language and within the range of a number of plausible languages."
As JKP wrote, the unusual, very rigid structure of Voynichese words has been described by Jorge Stolfi [link unavailable]; another researcher (the linked name is unavailable) has proposed an alternative model that, in my opinion, is more compact and easier to understand. Until now, nobody has been able to find a natural language with a similar structure. Bowern and Lindemann suggest that character patterns are not language-like.
Given these conclusions, one could focus on higher-level patterns that span more than a single word (e.g. recurring pairs of words like the English "in the"). Here we face the problem that the lexicon in the manuscript is not uniform. This is sometimes described as two different languages (known as Currier A and B), but it actually appears to be even worse: a continuous drift where words change from section to section. This drift makes it impossible to find candidate function words for the whole manuscript: very few words are reasonably frequent everywhere (see for instance the table at the end of [link unavailable]). It is as if in English you had "in the" in one section and "on they" in the next: no section contains both "in" and "on", and no section contains both "the" and "they". This example is somewhat exaggerated, but it should give you an idea of the problem. Working with word patterns is only possible with long texts, and if one has to work with a single section at a time, there may not be enough consistent text to find anything.
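To make the "frequent everywhere" criterion concrete, here is a minimal Python sketch. It is my own illustration and not a method from the paper: the function name, the threshold min_rel_freq and the toy English sections are all assumptions, chosen only to mirror the "in the" / "on they" example.

```python
from collections import Counter

def words_common_to_all_sections(sections, min_rel_freq=0.01):
    """Return the words whose relative frequency reaches min_rel_freq
    in every section.

    sections: dict mapping a section name to a list of word tokens.
    min_rel_freq: hypothetical threshold, not a value from the literature.
    """
    common = None
    for tokens in sections.values():
        if not tokens:
            # An empty section cannot share any frequent word.
            return set()
        counts = Counter(tokens)
        total = len(tokens)
        frequent = {w for w, c in counts.items() if c / total >= min_rel_freq}
        common = frequent if common is None else common & frequent
    return common or set()

# Toy English data (assumption), mirroring the "in the" / "on they" example.
sections = {
    "section_1": "in the garden in the house in the field".split(),
    "section_2": "on they garden on they house on they field".split(),
}

print(words_common_to_all_sections(sections, min_rel_freq=0.2))
```

With this toy data the most frequent words of each section never overlap, so no candidate function word survives the intersection, which is the situation described above.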
The most prominent word patterns that we do find in the manuscript are just as puzzling: they are "reduplication", i.e. the exact repetition of the same word (e.g. "[example unavailable]"), and consecutive words that are almost identical (something I call "quasi-reduplication"). The two forms of reduplication often appear together:
E.g.: <f75r.38,+P0> [transcription unavailable]
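For anyone who wants to look for these patterns in a transcription, the two cases can be sketched in Python as follows. This is only an illustration under my own assumptions: "almost identical" is taken to mean a Levenshtein distance of at most max_distance (default 1), which is one possible working definition and not an established one, and the tokens in the example are made up rather than taken from the manuscript.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings
    (single-character insertion, deletion, substitution)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

def classify_pairs(tokens, max_distance=1):
    """Label each pair of consecutive words as exact reduplication or
    quasi-reduplication; pairs that are further apart are skipped."""
    found = []
    for first, second in zip(tokens, tokens[1:]):
        d = edit_distance(first, second)
        if d == 0:
            found.append((first, second, "reduplication"))
        elif d <= max_distance:
            found.append((first, second, "quasi-reduplication"))
    return found

# Made-up tokens (assumption), not a transcription of any manuscript line.
line = "abc abc abd xyz xyz".split()
for pair in classify_pairs(line):
    print(pair)
```

A threshold of 1 is deliberately strict; raising max_distance catches looser variants, at the price of more accidental matches between short words.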