The Voynich Ninja - Can VM be written in vowelless Latin?

(24-07-2018, 08:58 AM)MarcoP Wrote: I am grateful for this conversation. I now have a simple tool based on Rene's idea that I can run to evaluate any solution claiming to be based on Latin. I guess it could come in handy in the future.

This list of 10000 Latin words contains some accented versions of common words and some Greek words (in UTF-8). No big deal, it can be fixed quickly.

Instead of averaging the positions in the word list, you might want to compute an upper bound on the probability of the words occurring together in the text being tested, simply by multiplying the probabilities of the individual words (or adding their logarithms). This assumes the corpus is big and diverse enough to give a good estimate of word probabilities in a generic Latin text.
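A minimal sketch of the log-probability idea, to make it concrete. The function name `log_probability`, the add-one smoothing for words missing from the list, and the toy counts are all assumptions made for illustration; this is not MarcoP's tool and the numbers are not real corpus data.

```python
import math
from collections import Counter

def log_probability(words, corpus_counts):
    """Sum of log-probabilities of `words` under a unigram word model.

    Words missing from the corpus get add-one (Laplace) smoothing so that
    the logarithm is defined; that choice is an assumption of this sketch.
    """
    total = sum(corpus_counts.values())
    vocab = len(corpus_counts) + 1  # one extra slot for unseen words
    score = 0.0
    for w in words:
        p = (corpus_counts.get(w, 0) + 1) / (total + vocab)
        score += math.log(p)
    return score

# Toy counts, purely illustrative (hypothetical, not from a real Latin corpus).
corpus_counts = Counter({"et": 50, "in": 30, "est": 20, "non": 15, "ad": 10})

latin_like = ["et", "in", "est"]
bogus = ["cartellus", "partellus", "eare"]

# Divide by the number of words so texts of different lengths are comparable.
score_latin = log_probability(latin_like, corpus_counts) / len(latin_like)
score_bogus = log_probability(bogus, corpus_counts) / len(bogus)
print(score_latin, score_bogus)
```

Adding logarithms instead of multiplying raw probabilities avoids numeric underflow on long texts, and the per-word average makes scores of different candidate texts comparable.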

Then, finally, a measure of how well the histogram (of letters and bi- or trigrams) compares to the corpus could be useful, to take into account a possibly abnormal bias, like missing or over-represented letters/bigrams/trigrams. For example, a text containing only the word "et" should not get a high probability.
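One possible way to sketch such a histogram comparison is cosine similarity between character n-gram frequency vectors. The metric, the helper names `ngram_freqs` and `cosine_similarity`, and the short sample strings are my assumptions for illustration only; a real check would use the full corpus as the reference.

```python
import math
from collections import Counter

def ngram_freqs(text, n):
    """Relative frequencies of character n-grams taken inside each word."""
    counts = Counter()
    for word in text.lower().split():
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}

def cosine_similarity(p, q):
    """Cosine of the angle between two sparse frequency vectors (0.0 to 1.0)."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

# Tiny stand-in for a reference corpus (illustrative only).
reference = "in principio erat verbum et verbum erat apud deum et deus erat verbum"
candidate_ok = "et verbum erat in principio apud deum"
candidate_et = "et et et et et et"

ref = ngram_freqs(reference, 2)
sim_ok = cosine_similarity(ngram_freqs(candidate_ok, 2), ref)
sim_et = cosine_similarity(ngram_freqs(candidate_et, 2), ref)
print(sim_ok, sim_et)
```

The "et"-only text consists of very probable words, yet its bigram histogram is concentrated on a single bigram, so its similarity to the reference comes out much lower than that of the varied candidate.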

(24-07-2018, 08:58 AM)MarcoP Wrote:
(23-07-2018, 07:06 PM)farmerjohn Wrote: That's clearly not Classical Latin and not Medieval Latin. So to compare them we need to transform the lists.
For example, if we compare the most frequent words of English and Russian (my native language), we immediately remove articles like "the" and "a" and auxiliary verbs from one list and merge the different forms of a word in the other list.
For Classical vs "fake" we need a similar amount of work.

But ok, imagine we have done this. Top 10 Classic is among top 30 Voynichese and vice versa.
Does that prove something? No
Does that disprove something? No
Does that help to correct errors? No

Hi Farmerjohn,
from an analytical point of view, Rene's test counts as evidence of how a "solution" compares with the target language. I don't know if it is a "proof", but it certainly is a relevant quality index. From this evidence-based point of view, your current solution is clearly dismissed. A formalized transformation resulting in matching the top 10 Latin words with the top 30 fake words and vice-versa would be a huge step forward: your fake could look much less fake, at least from the lexical point of view. Obviously, this would imply the total replacement of bogus words like "cartellus", "partellus", "eare" etc with words from the actual top 30 Latin words. Doing this would be "correcting errors".
Honestly, I don't think there is any possibility of matching Voynichese and Latin (different word-length histograms, different entropy values, reduplication frequent in the VMS and absent in Latin, preponderance of variable prefixes in the VMS vs variable suffixes in Latin etc). If you have a concern with economizing resources, Latin is a very poor choice.

I still think that a thread about "a list of requirements" could be a good idea: if you start it, I will contribute my opinion in due time.

I am grateful for this conversation. I now have a simple tool based on Rene's idea that I can run to evaluate any solution claiming to be based on Latin. I guess it could come in handy in the future.

MarcoP,
Obviously, matching 10-to-30 will mean a huge increase in the size of this thread, with a lot of maybes, look-alikes, possibles...
No. Either it is a solution or it is not; working towards a look-alike is not a good aim.

One more thing. When 10-to-30 is done, you have to match symbols. What would you do if they don't match? (By the way, I think matching symbols is much more important than matching words.)

You will probably laugh, but all these "cartellus", "partellus" will remain forever. The diminutive suffix is the core of my idea. From that it follows immediately that the ain-family represents vowels.

If I recall correctly, ad+noun may replace the accusative. So ad may change its position in the list significantly. How would your tool deal with these situations?

(24-07-2018, 09:46 AM)farmerjohn Wrote: If I recall correctly ad+noun may replace Accusative. So ad may change its position in list significantly. How would your tool deal with these situations?

Maybe you don't remember correctly, or you just expressed yourself poorly.

Anyway, if you really care about how Rene's test performs, you have all the information needed to check yourself. Use any true Latin text that exhibits the phenomenon you are referring to.

If you really don't care, why do you ask?

(24-07-2018, 10:20 AM)MarcoP Wrote:
(24-07-2018, 09:46 AM)farmerjohn Wrote: If I recall correctly ad+noun may replace Accusative. So ad may change its position in list significantly. How would your tool deal with these situations?

Maybe you don't remember correctly, or you just expressed yourself poorly.

Both

That's from Vulgar Latin by Herman:

The most important of these alternative expressive mechanisms involved the use of prepositions. Even in Classical times, some inflections and prepositional phrases were either entirely or partly equivalent; for "to send a letter to someone" they could either use the dative, as in mittere litteras alicui, or the preposition ad and the accusative, as in mittere litteras ad aliquem.

MarcoP, please don't start that again... That's your test and I'm just asking how it deals with marginal cases. I don't find it useful to me at the moment, but it may be in the future. Besides that, the methodology and algorithms can be interesting for all forumians.

Small update, mostly because of an added option for the EVA-sh glyph, which opens up plenty of interesting opportunities.
Also, I have finally learned how to add pictures in TeX. Thanks to that, the work becomes a bit more readable and much more plausible.

The year is steadily coming to an end, so here is my small review/roadmap/list of fantasies, especially because a year ago I gave myself two more years to work on the VMS, so I'm halfway through. And it will be funny to reread it later.

1. The current version of the work is 2.7. 2.* is about lines, 3.* will be about paragraphs, 4.* about pages. The problem is that the road from 3.* to 4.* is of course much shorter than from 2.* to 3.0.

2. Dividing the text into sentences is the hot question at the moment. The current hypothesis is that any oddity at the beginning or end of a word marks a sentence boundary. For example: words ending with gallows or with EVA-s, d, m, g; words beginning with gallows; words beginning with a long EVA-q or EVA-oq; a long space; and probably words beginning with letter+small space and ending with small space+letter. All of these are generally markers of a new sentence.

3. It seems that one of the core questions about the underlying language (Latin) is how you interpret the letter V.

4. Every detail is important. For example there are two EVA-d letters with different meanings, EVA-ch is different from its angled counterpart and so on. By default one should assume that there are no scribal errors.

5. The overall impression is that the manuscript is of a satirical nature. The author is playing with language, using different styles (of different ages and areas?) and inventing new word forms, making extensive use of suffixation. There is plenty of room for that: Whitaker's Words gives about 130 suffixes. Even if we remove similar ones (like itat - etat), there are still too many. And what about prefixation?

6. It's too early to reason about content; I was wrong so many times (i.e. always) that it's senseless. But still, it seems that, for example, one page is about a man writing a comoedia and rewriting the end of it, and another is about writing while the sun is rising.

7. And the pages with nymphs... satirical content, language tricks, but nymphs?? The question of the images is of no importance to me, but anyway, why nymphs??? Here is a crazy recent hypothesis to be tested. An expression like "join endings" can mean both attaching an ending to the stem of a word and connecting two ends with some tube. The words acuminis (trick) and aquamanus (basin) both match EVA-qoty (in my interpretation). So in general, if one Voynich word matches several words or notions of the underlying language, it can be misinterpreted. And in this part the author of the VMS consciously builds the text so that the misinterpretation leads us to nymphs, aqueducts and so on...

8. Although it's not critical, it seems that the whole VMS was written by the same person; just the style changed with time.

9. Another core question is pronunciation. It's known that o was sometimes pronounced as u and u as o, and so on. Knowing the precise rules of pronunciation would help dramatically, but they may be a bit complicated. If we consider all options and make the search broader, there are too many candidates. If we make assumptions and narrow the search, there is a good chance of missing something important. So somehow both strategies should be used.

The next bulletin; the code has become a bit simpler and stricter.

Latest version of the work. Some fresh details in the code, slightly better translations, but as always nothing really new :D

New update to the theory. A bit more a bit later.

This is a "high-level" report, as opposed to the "low-level" description in the previous post.

1. The current hypothesis is that the VMS is a satirical book, with several literary devices used.

2. The Latin language is quite friendly to abstract suffixes (for example aegrum > aegritūdō, aegrōtās, pigrum > pigritia, pigritās), and a neat idea is to apply them freely to all words. Adding diminutives as well, one can obtain a lot of funny neologisms with different beginnings but very typical endings.

3. Another feature is the use of figurative meanings of words (as in sale nigro). This obviously doesn't help the decoding process, since it adds new possibilities and produces some weird word combinations, but when the precise word meaning is known there is a little extra ambiguity.

4. The change in the language of the VMS can be partly explained by restrictions put on the suffixes used. Also, the suffixes themselves are similar (for example -iti- and -itāt-), and there are some interesting common points with other theories.

5. It's worth noting the number of rare elements on one page: an-endings, o and aiin with "acute", sh with o, cto, and especially the ainy-ending of the first word. It seems that at the beginning of the work the author had concerns about the ambiguity of his writing method and added "hints" for the reader. Later pages are steadier, and by the bathing section they become almost monotonous.

6. It is also interesting to note that there are a number of images where dark leaves alternate with light ones. This may symbolize alternation in style (dark for biting, light for favorable).
