Further adventures in templates for Voynichese.
We can create a 12 x 12 matrix from only two words, set in cycles, with a few simple rules, that will accommodate any Voynich word whatsoever. In fact, it necessarily accommodates any portion of Voynichese.
It occurred to me that single 'e' cannot be generated by this method, but from an example it appears that you also have a rule that 'ee' can be replaced by 'e' (and probably 'eee').
I do see a bit of a problem, stemming from an observation by Massimiliano Zattera.
Your proposed method generates far more invalid words than valid words, and the ratio between the two is one of the metrics of the quality of such grammars.
I very much like these types of experiments though.
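The ratio mentioned above can be computed mechanically once a word list is fixed. A minimal sketch, assuming "invalid" simply means "not in the attested word list"; the function name and the tiny word lists are hypothetical stand-ins, not a real transliteration or the output of the proposed template:

```python
def generation_quality(generated, attested):
    """Compare the words a grammar generates against an attested word list.

    Assumption: a generated word counts as 'valid' only if it is attested.
    """
    generated = set(generated)
    attested = set(attested)
    valid = generated & attested      # generated and attested
    invalid = generated - attested    # generated but not attested
    return {
        "valid": len(valid),
        "invalid": len(invalid),
        # share of generated words that are attested
        "precision": len(valid) / len(generated) if generated else 0.0,
        # share of attested words that the grammar can produce
        "coverage": len(valid) / len(attested) if attested else 0.0,
    }


# Toy illustration only; these lists are placeholders.
toy_generated = ["daiin", "chedy", "iinqeeadk", "qokeeokeedy"]
toy_attested = ["daiin", "chedy", "qokeokedy"]
result = generation_quality(toy_generated, toy_attested)
```

A template that accommodates every word scores full coverage by construction, so the precision figure is where the over-generation would show up.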
There is probably a thread about this, but how do we define an 'invalid' Voynichese word?
(05-05-2024, 12:57 PM) RobGea wrote: There is probably a thread about this, but how do we define an 'invalid' Voynichese word?
In the present context (and how I meant it) it is just a word that does not occur in the Voynich MS as we have it.
(06-05-2024, 05:58 AM) ReneZ wrote: In the present context (and how I meant it) it is just a word that does not occur in the Voynich MS as we have it.
A corollary is that any hapax legomenon would be an invalid word if the page it appears on happened to be missing. It might make more sense to refer to "unattested" words, but the term "invalid word" (in the sense Rene meant) seems to be pretty well entrenched by now.
I suppose it could be useful to consider a further category of words that don't occur in the VM as we have it, but that at least conform to the same patterns as other words that do occur in it. For example, I'd suggest that [qokeeokeedy] follows reasonably well-attested patterns (in spite of the double gallows), whereas [iinqeeadk] doesn't. But unless I'm missing something, it looks like [qokeeokeedy] and [iinqeeadk] would both fit the proposed Universal Template equally well.
On the other hand, I could imagine the Universal Template -- with its twelve columns -- forming a nice basis for mapping Voynichese text to musical notes in a chromatic scale.
(07-05-2024, 03:42 AM) pfeaster wrote: A corollary is that any hapax legomenon would be an invalid word if the page it appears on happened to be missing. It might make more sense to refer to "unattested" words
I fully agree with both. I wrote "the Voynich MS as we have it" for that reason, and the number of unattested, 'valid' words is likely to be quite large. Just the two missing folios f109 and f110 would provide many already.
One way to create likely valid, yet unattested, words is to remove all spaces and then take chunks from the resulting stream and see if these are attested words.
I am not sure if this would be a useful exercise...
I think something similar has been done to check whether label words are mostly words in the plain text, i.e. surrounded by spaces or not.
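The chunking idea (remove the spaces, slide over the stream, and look each chunk up in the attested vocabulary) could be sketched roughly as follows. The function name, the length limits and the two-word sample are assumptions for illustration; a real run would use a full transliteration and would have to decide how to treat line and paragraph breaks:

```python
def chunk_stream(words, min_len=2, max_len=6):
    """Split all substrings of the space-free stream into attested/unattested.

    `words` is the text as a list of words in reading order.
    `min_len` and `max_len` are hypothetical limits on chunk length.
    """
    attested = set(words)
    stream = "".join(words)  # the text with all spaces removed
    attested_chunks, unattested_chunks = set(), set()
    for start in range(len(stream)):
        for length in range(min_len, max_len + 1):
            chunk = stream[start:start + length]
            if len(chunk) < length:
                break  # ran off the end of the stream
            if chunk in attested:
                attested_chunks.add(chunk)
            else:
                unattested_chunks.add(chunk)
    return attested_chunks, unattested_chunks


# Toy illustration only: two words standing in for a whole text.
found, not_found = chunk_stream(["ol", "or"], min_len=2, max_len=3)
```

The unattested set is only a pool of candidates; whether any of them counts as "likely valid" still needs a separate criterion.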
A further installment, attempting to impose some order on the template.
This keeps on raising interesting questions. (Well, I find them interesting).
While the word "qokeeokeedy" is not attested, the similar word "qokeokedy" is.
That raises the question: can we come up with some non-subjective way to define valid and invalid words?
One way to define invalid is if it includes unattested bigrams or trigrams, or more than one extremely rare bigram or trigram. This may not lead to a definition whereby all words that are 'not invalid' would be 'valid'.
And:
what is the most likely valid word that is not attested?
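The bigram/trigram definition of "invalid" suggested above could be tried out along these lines. This is a sketch under assumptions: the cut-off for "extremely rare" (`rare_threshold`) is not defined in the thread and is a made-up parameter, rare n-grams are counted per occurrence, and the four-word corpus is a toy stand-in for a real transliteration:

```python
from collections import Counter


def ngrams(word, n):
    """All contiguous character n-grams of a word."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_counts(words, n):
    """Count n-grams over a list of words (word boundaries not crossed)."""
    counts = Counter()
    for word in words:
        counts.update(ngrams(word, n))
    return counts


def classify(word, counts, n=2, rare_threshold=1):
    """Return 'invalid' if the word has any unattested n-gram, or more than
    one n-gram seen at most `rare_threshold` times; otherwise 'not invalid'.
    """
    grams = ngrams(word, n)
    unattested = [g for g in grams if counts[g] == 0]
    rare = [g for g in grams if 0 < counts[g] <= rare_threshold]
    if unattested or len(rare) > 1:
        return "invalid"
    return "not invalid"


# Toy corpus, for illustration only.
toy_corpus = ["qokeedy", "qokeokedy", "daiin", "chedy"]
bigram_counts = ngram_counts(toy_corpus, 2)
label_a = classify("qokeeokeedy", bigram_counts, rare_threshold=0)
label_b = classify("iinqeeadk", bigram_counts, rare_threshold=0)
```

With the rarity rule switched off (`rare_threshold=0`), the toy corpus already separates the two examples discussed earlier: "iinqeeadk" contains bigrams such as "nq" that never occur, while every bigram of "qokeeokeedy" does. The label is deliberately "not invalid" rather than "valid", since passing this test does not make a word valid.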
This is a tricky question though. Imagine we'd define valid words in English in the same way. What you'd basically be saying is "as long as the rules of phonotactics are followed, it's fine." Basically fill out the vocabulary with all possible combinations of letters the language allows.
Now Voynichese isn't a language like English, so the term phonotactics may not even be relevant. But still, if you were to count everything that could potentially be formed according to existing bi/trigrams, you're also assuming that there was no good reason why those words were not included in the first place. You'd be saying "they might as well have been there".
The comparison with English doesn't really hold. The only reason we are talking about word generation methods at all (slot method, square method, etc) is because there is this word paradigm with very restrictive rules.
The whole question is more complex of course. One could argue that the Voynich MS vocabulary actually includes invalid words. Stolfi more or less does that.
This means that all four combinations of attested or not and valid or not could exist.
My interest in these questions is just to see how we can find out the actual rules that were used to generate the text.