The Voynich Ninja

Full Version: It must be a language - because how much about linguistics did they know so long ago?
One of the most interesting things about this manuscript is that it follows certain hard rules about texts and corpora, specifically with regard to things like hapax legomena (words that occur only once in a corpus), which I find fascinating in and of themselves (I used hapax legomena for a small section of my master's thesis).

It has always made me wonder: how many of the rules of natural language were known in the 14th-century HRE/Central Europe?

Like, suppose it is all a big hoax: would the hoaxer have been aware that every corpus has a set amount of hapax legomena? Would he have planned for it when commissioning the book? Would he have expected his buyer to look for hapax legomena, and that the potential buyer might have looked at the text and gone: "Hmrph, not enough hapax legomena, clearly not a language"?

If that wasn't known, then there's no way that it would happen accidentally, right? 

How much do we know about what they knew about conlangs in the 14th century?
I'm no expert on this (I'm sure others will weigh in), but there are certain statistical properties that are likely to emerge no matter what the cause is. For example, Zipf's law applies to many things, like city sizes, and might emerge "by coincidence".

I agree though that adherence to largely invisible (and as you say, at the time unknown) statistical features cannot have been on purpose.
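
For anyone who wants to check the Zipf point on a word list, here is a minimal sketch in Python. It assumes the simplest form of Zipf's law (frequency roughly proportional to 1/rank), under which rank times frequency stays roughly constant. The function name `rank_frequency` and the toy tokens are made up for illustration and are not taken from any transliteration.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, frequency, rank * frequency) for each distinct word.

    Words are ranked from most to least frequent. Under the simplest
    form of Zipf's law, the third column stays roughly constant.
    """
    counts = Counter(tokens)
    ranked = sorted(counts.values(), reverse=True)
    return [(rank, freq, rank * freq) for rank, freq in enumerate(ranked, start=1)]


# Toy data built to follow the 1/rank pattern exactly (hypothetical words).
toy_tokens = ["a"] * 6 + ["b"] * 3 + ["c"] * 2
for row in rank_frequency(toy_tokens):
    print(row)
```

A roughly constant product says nothing about where the data came from, which is the point above: city sizes and word lists can both pass this check.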
Some word statistics, like the hapax % or the local type-token ratio (MATTR), depend very much on an accurate transliteration (which we don't have); word spaces in particular are unreliable.
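
To make the dependence on spaces concrete, here is a rough sketch in Python of both statistics. Assumptions on my part: the hapax % is taken as the share of distinct word types that occur exactly once (some people divide by the token count instead), MATTR is the plain average of the type-token ratio over every sliding window of a fixed size, and the two sample strings are invented Voynich-style words, not a real transliteration. The function names are hypothetical too.

```python
from collections import Counter


def hapax_percentage(tokens):
    """Percentage of distinct word types that occur exactly once."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return 100.0 * hapaxes / len(counts)


def mattr(tokens, window=500):
    """Moving-average type-token ratio over all windows of `window` tokens.

    If the text is shorter than the window, the whole text is one window.
    """
    if not tokens:
        return 0.0
    window = min(window, len(tokens))
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Same invented line, read with and without one uncertain space.
with_space = "daiin ol daiin chedy ol shedy".split()
without_space = "daiin ol daiin chedy olshedy".split()

print(hapax_percentage(with_space))     # 50.0
print(hapax_percentage(without_space))  # 75.0
```

One doubtful space moves the hapax % from 50 to 75 on this toy line. Real texts are much longer, so the swing per space is smaller, but many uncertain spaces add up.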