(11-12-2022, 03:53 PM)Ruby Novacna Wrote: Although I find many of the words readable in Greek, the combinations of the glyphs pch - ph=f and kch- nk are written in the Latin way.
You asked whether there are any "clues that could exclude ancient Greek," and I take it you're looking mainly for any evidence against a solution in which Voynichese words are equivalent to Greek words and Voynichese glyphs correspond at least loosely to Greek plaintext characters.
The points others have raised about entropy and so forth are relevant, but let's try a different approach.
According to [link], the twelve most common words in a large sample corpus of Greek texts -- together with their token counts in it -- are:
καὶ ["and"] 4129066
δὲ ["but"] 1501550
τὸ ["the"] 1414996
τοῦ ["the"] 1140938
τῶν ["the"] 1051317
τὴν ["the"] 993011
τῆς ["the"] 849596
ὁ ["the"] 831492
ἐν ["in"] 795289
γὰρ ["because"] 687117
τὸν ["the"] 679309
τὰ ["the"] 627063
If we take any text of significant length in grammatically and stylistically "normal" post-Homeric Ancient Greek and work out what its most frequent words are, we should expect the results to resemble these, at least approximately: the single most frequent word should be somewhere around 2.75 times as frequent as the next-most-frequent word, and five of the seven (or so) most frequent words should all begin with the same glyph (τ), which should also be different from the beginning glyph of the most common word of all (κ).
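For anyone who wants to check those two figures, here is a minimal Python sketch that recomputes them from the counts listed above. The function names (`first_letter`, `profile`) and the choice to look at the top seven words are my own illustrative assumptions, not part of the source list.

```python
import unicodedata
from collections import Counter

# Token counts copied from the Greek frequency list above.
greek_top12 = [
    ("καὶ", 4129066), ("δὲ", 1501550), ("τὸ", 1414996), ("τοῦ", 1140938),
    ("τῶν", 1051317), ("τὴν", 993011), ("τῆς", 849596), ("ὁ", 831492),
    ("ἐν", 795289), ("γὰρ", 687117), ("τὸν", 679309), ("τὰ", 627063),
]

def first_letter(word):
    # Decompose so that accents and breathings become separate combining
    # marks, then return the first base letter (e.g. ἐ counts as ε).
    decomposed = unicodedata.normalize("NFD", word)
    return next(ch for ch in decomposed if not unicodedata.combining(ch))

def profile(ranked, top_n=7):
    # Ratio of the most frequent word to the runner-up, plus a tally of
    # initial letters among the top_n words.
    ratio = ranked[0][1] / ranked[1][1]
    initials = Counter(first_letter(w) for w, _ in ranked[:top_n])
    return ratio, initials

ratio, initials = profile(greek_top12)
print(round(ratio, 2))          # 2.75
print(initials.most_common(1))  # [('τ', 5)]
```

The same `profile` function can be pointed at any other ranked word list, which makes the comparison with Voynichese mechanical rather than impressionistic.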
So let's consider the running text in Currier B. The top twelve words are:
chedy 429
Shedy 361
daiin 316
qokeedy 301
ol 289
qokedy 269
qokain 261
qokeey 252
qokaiin 241
aiin 232
chey 208
ar 197
I don't find tentative Greek readings for the most common Currier B words on your blog, but those words sometimes appear as parts of longer words for which you have proposed readings. Thus, if [link], then [chedy] should be something like γειται; if [link], then [shedy] should be something like σκεθην; and if [link], then [daiin] should be something like των. Of these, των is the only match for a word in the top twelve, but [daiin] is the only top-twelve word that begins with [d], whereas quite a few of the top twelve words in Greek begin with [τ]. It would be a very strange form of Greek indeed where the two most common words begin something like "get-" and "sket-"; this would be a bit like finding a 30,000-word text in English in which the two most frequently occurring words are not "the" and "of," but "blip" and "bloop."
On the other hand, if we wanted to force the Voynichese forms onto Greek forms, we could hypothesize that [chedy] = [Shedy] and that both represent the word καὶ. The token ratio of those two Voynichese words to the next most common word, [daiin], would then be nearly right (2.5 to 1 as compared to 2.75 to 1). Then [daiin] could represent δὲ, and all those common words starting [qok-] ([qokeedy], [qokedy], [qokain], [qokeey], [qokaiin]) could represent similarly common words starting in Greek with [τ] (τὸ, τοῦ, τῶν, τὴν, τῆς). Then perhaps [ol] = ὁ and [aiin] = ἐν.
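The arithmetic behind that forced fit can be checked with a short Python sketch using the Currier B counts listed earlier. Merging [chedy] and [Shedy] into one word is purely the hypothesis under discussion, and the variable names are my own.

```python
# Token counts copied from the Currier B list above.
currier_b_top12 = [
    ("chedy", 429), ("Shedy", 361), ("daiin", 316), ("qokeedy", 301),
    ("ol", 289), ("qokedy", 269), ("qokain", 261), ("qokeey", 252),
    ("qokaiin", 241), ("aiin", 232), ("chey", 208), ("ar", 197),
]
counts = dict(currier_b_top12)

# As listed, the top word is only modestly ahead of the runner-up.
raw_ratio = counts["chedy"] / counts["Shedy"]

# Hypothetical merge: treat [chedy] and [Shedy] as spellings of one word
# and compare the combined count with the next word, [daiin].
merged_ratio = (counts["chedy"] + counts["Shedy"]) / counts["daiin"]

# Top-twelve words that share the [qok-] beginning.
qok_words = [w for w, _ in currier_b_top12 if w.startswith("qok")]

print(round(raw_ratio, 2))     # 1.19
print(round(merged_ratio, 2))  # 2.5
print(len(qok_words))          # 5
```

Without the merge the ratio is about 1.19, nowhere near 2.75; only after merging does it reach 2.5, and the five [qok-] words then line up numerically with the five Greek τ-words.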
That's starting to look pretty convincing, eh?
Except that, for consistency, [qokedy] really ought to end the same way as [chedy]/[Shedy] -- hence, ταὶ. And [qokaiin] should end the same way as [daiin] -- hence, τὲ -- while [aiin] should be just ὲ. But for the sake of argument, let's hypothesize that the glyph sequences [edy] and [aiin] can each represent more than one plaintext value, and that the words [qokedy] and [qokaiin] in fact represent τὸ and τοῦ. Alas, then we'd be faced with another problem: passages in which the Voynichese words appear side by side wouldn't seem to make much sense. For example, [qokeedy qokeedy qokedy qokedy qokeedy] would translate to "the the the the the."
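To see how often that kind of repetition would turn up, one could gloss a transcribed line word by word and measure the longest run of identical glosses. The sketch below does this in Python; the `glosses` table is only the forced mapping entertained above (every common [qok-] word read as a form of the article), not a proposed reading, and the helper names are hypothetical.

```python
# Hypothetical glosses under the forced mapping discussed above.
glosses = {
    "chedy": "and", "Shedy": "and", "daiin": "but",
    "qokeedy": "the", "qokedy": "the", "qokain": "the",
    "qokeey": "the", "qokaiin": "the", "ol": "the", "aiin": "in",
}

def gloss(line):
    # Replace each Voynichese word with its gloss; unknown words become "?".
    return " ".join(glosses.get(word, "?") for word in line.split())

def longest_run(words):
    # Length of the longest run of identical consecutive items.
    best = run = 0
    prev = None
    for w in words:
        run = run + 1 if w == prev else 1
        best = max(best, run)
        prev = w
    return best

line = "qokeedy qokeedy qokedy qokedy qokeedy"
print(gloss(line))                       # the the the the the
print(longest_run(gloss(line).split()))  # 5
```

Running `longest_run` over every line of a transcription, and over a Greek text of similar length, would show whether runs like this are an oddity of one passage or a systematic mismatch.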
Should we place any weight on these "clues"?
Meanwhile, we could also compare the results we get by applying particular forms of analysis to known negative cases -- that is, cases in which we know that a script does not match a given language. So, for instance, we might test the hypothesis that modern Slovenian is a system for writing modern Turkish. What happens if we look at the vocabulary of Slovenian and see if we can find matching Turkish words?
Slovenian [biti] = Turkish [biti], "his louse"
Slovenian [in] = Turkish [in], "den, cave"
Slovenian [do] = Turkish [doğa], "nature"
Slovenian [od] = Turkish [od], "fire, poison"
Slovenian [jaz] = Turkish [yaz], "summer"
Slovenian [v] = Turkish [ve], "and"
Slovenian [imeti] = Turkish [imdi], "now"
Slovenian [to] = Turkish [tuğ], "horse-tail crest"
Slovenian [on] = Turkish [on], "ten"
Slovenian [ne] = Turkish [ne], "what"
... and so forth
This seems to be working pretty well. So should we conclude that we're probably on the right track? If not, why not?