The Voynich Ninja - Need advice for testing of hypotheses related to the self-citation method


(02-07-2025, 08:49 AM)nablator Wrote: An evolution rule like "ey." → "eo." is unlikely, it would probably have produced "cheeo" from "cheey" earlier.

Hi there. If you know Serbian, you would know that -eo or -uo is a suffix for masculine singular verbs in the past or future participle. And if you look closely, you will see that EVA ee is written like Latin u, with a fully rounded connecting line at the bottom. This explains eeee as a double uu, and eee as an eu or ue diphthong. Changing ee to u turns the word cheey into chuy (čuj - hear!) and cheeo into chuo (čuo). ČUO SAM is Serbian for "I heard". The evolution of this word is conjugational (čuj, čuo) but also dialectal. In Slovenian, čuj is the 2nd person singular imperative, but for the 3rd person singular the word ČUL is used, where Serbian has čuo. This conjugational form was also often used in the Slovenian dialect that retained most Old Church Slavonic words.
CHUO also frequently appears as part of a word, as in cheeody (čuodi - miracles) and cheeos (čuos - time). The word chey (čej), which differs by only one letter, has a different meaning.
Many Voynich words that differ by only one letter belong to the same word family. The number of word families in the VM is quite low, but the number of derivatives and inflectional forms accounts for the extremely high number of words that occur only once.

A bit more on the initialisation problem as I see it.
Here, I am just talking about the very start of the text.

We must keep in mind that the text uses a new alphabet, and its words exhibit a relatively strict pattern.

Just starting with two arbitrary words (strings using some of the symbols in this alphabet) won't do it.
These would use only a subset of the character set, so the additional (new) characters would be introduced in an initial procedure. This should still be seen as part of the initialisation. This process is complete once all of the 'normal' characters are represented in the text.
Here I don't even want to make an issue of the more or less rare characters like x v b .
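The completeness criterion above (initialisation ends once every 'normal' character has appeared) can be sketched as a small check. This is only an illustration: the function name wordsUntilCovered, the toy alphabet and the word list (taken from words mentioned earlier in this thread) are assumptions, not the real EVA inventory or any proposed seed set.

```javascript
// Returns how many words (in order) are needed before every character
// of `alphabet` has appeared at least once, or -1 if that never happens.
function wordsUntilCovered(words, alphabet) {
  const missing = new Set(alphabet);
  if (missing.size === 0) return 0; // nothing to cover
  for (let i = 0; i < words.length; i++) {
    for (const ch of words[i]) missing.delete(ch);
    if (missing.size === 0) return i + 1;
  }
  return -1;
}

// Toy data only: a 7-character stand-in alphabet, not the full EVA set.
const toyAlphabet = "cheyods";
const toyWords = ["cheey", "cheeo", "chey", "cheeody", "cheeos"];

console.log(wordsUntilCovered(toyWords, toyAlphabet)); // -> 5
console.log(wordsUntilCovered(toyWords, "cheyq"));     // -> -1 (q never appears)
```

Run against a real transliteration, the same check would show how far into a page one has to read before the character set is complete.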

How large would this initial group have to be? Would this group already exhibit the word pattern?
This means that the word pattern would be included both in the initial words and in the rules for creating new ones. This would make it something deliberate.

Would it already include the bigram ed ?

If the MS really starts with f1r, then it would not.
What would the initialisation look like if [link unavailable] is indeed the start?

Does any page in the MS exhibit a text that more strongly suggests such a start?

(03-07-2025, 01:26 AM)ReneZ Wrote: 1. That does not work for the very beginning, which is my real interest

You have to start from something, so an initial set of "seed" words that come from nowhere is unavoidable.

Quote:1a. It is still open whether there should be only a single initialisation or one per page or one per paragraph

Since glaringly obvious statistical gaps exist at the paragraph, page and section levels, they are a feature of Voynichese that a good pseudo-Voynichese generator should replicate. This could be achieved by creating evolutionary bottlenecks through initialization and then applying evolution rules that cannot turn any word into any other word, no matter how many iterations are run.
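A minimal sketch of such a bottleneck, assuming two toy rewrite rules that are NOT proposed as actual Voynichese rules: computing everything reachable from a seed shows that the word family stays closed, so words outside it can only enter through a new seed or a new rule. The names RULES and reachable are made up for this illustration.

```javascript
// Toy evolution rules (assumptions for illustration only).
const RULES = [
  { from: /y$/, to: "o" }, // word-final y -> o, e.g. cheey -> cheeo
  { from: /ee/, to: "e" }, // shorten the first ee, e.g. cheey -> chey
];

// Breadth-first closure of the seed words under the rules.
// `limit` caps the search in case a rule set can grow without bound.
function reachable(seeds, rules, limit = 1000) {
  const seen = new Set(seeds);
  const queue = [...seeds];
  while (queue.length > 0 && seen.size < limit) {
    const word = queue.shift();
    for (const { from, to } of rules) {
      const next = word.replace(from, to);
      if (next !== word && !seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return seen;
}

const family = reachable(["cheey"], RULES);
console.log([...family].sort()); // -> [ 'cheeo', 'cheey', 'cheo', 'chey' ]
console.log(family.has("qokeedy")); // -> false: unreachable from this seed
```

With rules like these, two paragraphs initialised from different seeds would never converge on the same vocabulary, which is one way to get statistical gaps between them.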

There is also the possibility that new evolution rules or seed words were added late in the process, which would explain the appearance of new patterns such as "ed", "lk-", "ll-".

I don't know or care what Torsten Timm's app does.

(02-07-2025, 09:35 AM)Aga Tentakulus Wrote: But that's exactly what it does.
It follows the system. ey, eo, ty, etc.

(03-07-2025, 03:55 AM)cvetkakocj@rogers.com Wrote: Hi, there, if you know Serbian, you would know that -eo or -uo is a suffix for the masculine singular verbs, past or future participle.

Hi Aga, Cvetka,

This thread isn't about your theory, or any natural language theory. Who would have guessed? :D

(03-07-2025, 08:59 AM)ReneZ Wrote: A bit more on the initialisation problem as I see it.
Here, I am just talking about the very start of the text.

We must keep in mind that the text uses a new alphabet, and its words exhibit a relatively strict pattern.

Just starting with two arbitrary words (strings using some of the symbols in this alphabet) won't do it.
These would use only a subset of the character set, so the additional (new) characters would be introduced in an initial procedure. This should still be seen as part of the initialisation. This process is complete once all of the 'normal' characters are represented in the text.
Here I don't even want to make an issue of the more or less rare characters like x v b .

How large would this initial group have to be? Would this group already exhibit the word pattern?
This means that the word pattern would be included both in the initial words and in the rules for creating new ones. This would make it something deliberate.

A page of the VM doesn't have to be the first page of Voynichese ever produced. Several pages may have been produced in parallel from sources on pages that were not necessarily kept in the final product.

The first page ever produced probably didn't look good and was discarded: not enough diversity in first generation words, not enough rules to evolve enough different words. In the beginning they must have tested the method on paper and not directly on vellum, until it stabilized enough to be confident that they could use it for the generation of a long manuscript. Maybe they added q after [link unavailable] and made a few adjustments to the rules. It's impossible to guess the exact process anyway.

(03-07-2025, 08:59 AM)ReneZ Wrote: A bit more on the initialisation problem as I see it.
Here, I am just talking about the very start of the text.

We must keep in mind that the text uses a new alphabet, and its words exhibit a relatively strict pattern.

Just starting with two arbitrary words (strings using some of the symbols in this alphabet) won't do it.
These would use only a subset of the character set, so the additional (new) characters would be introduced in an initial procedure. This should still be seen as part of the initialisation. This process is complete once all of the 'normal' characters are represented in the text.
Here I don't even want to make an issue of the more or less rare characters like x v b .

How large would this initial group have to be? Would this group already exhibit the word pattern?
This means that the word pattern would be included both in the initial words and in the rules for creating new ones. This would make it something deliberate.

Would it already include the bigram ed ?

If the MS really starts with f1r, then it would not.
What would the initialisation look like if [link unavailable] is indeed the start?

Does any page in the MS exhibit a text that more strongly suggests such a start?

I don't think there is an 'initialisation problem' at all. What ultimately determines the statistics of the copied-and-modified text are the modification rules, not the seed string: its effect peters out soon. I.e., if you have a 'modification rule' which adds the prefix 'ok' 50% of the times a prefix is added, 'qok' 25% of the times and 'cho' for the remaining 25%, you'll end up with qok/ok/cho prefixes in those proportions whatever initialization string you started from (even starting from a null string).
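The claim can be tried out with a small simulation. This is a rough sketch under stated assumptions, not anyone's actual generator: each new word copies a uniformly chosen earlier word, and with probability reprefixProb (0.7 here, an arbitrary choice) its prefix is replaced using the 50/25/25 weights from the example. The seeded PRNG (mulberry32) is only there to make runs repeatable.

```javascript
// Small seeded PRNG (mulberry32) so that runs are repeatable.
function mulberry32(seed) {
  let a = seed;
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Prefix weights from the example above.
const PREFIXES = [["ok", 0.5], ["qok", 0.25], ["cho", 0.25]];

function pickPrefix(rng) {
  let r = rng();
  for (const [prefix, weight] of PREFIXES) {
    if (r < weight) return prefix;
    r -= weight;
  }
  return PREFIXES[PREFIXES.length - 1][0];
}

// Returns the known prefix a word starts with, or "" if none.
function prefixOf(word) {
  const hit = PREFIXES.find(([p]) => word.startsWith(p));
  return hit ? hit[0] : "";
}

// Grows a text by copy-and-modify and reports the share of each prefix
// among the prefixed words. reprefixProb is an assumed parameter.
function prefixShares(seedWord, steps, rng, reprefixProb = 0.7) {
  const text = [seedWord];
  for (let i = 0; i < steps; i++) {
    const source = text[Math.floor(rng() * text.length)];
    let word = source;
    if (rng() < reprefixProb) {
      // Strip the old prefix (if any) and attach a freshly drawn one.
      word = pickPrefix(rng) + source.slice(prefixOf(source).length);
    }
    text.push(word);
  }
  const counts = { ok: 0, qok: 0, cho: 0 };
  for (const w of text) {
    const p = prefixOf(w);
    if (p) counts[p]++;
  }
  const total = counts.ok + counts.qok + counts.cho;
  return { ok: counts.ok / total, qok: counts.qok / total, cho: counts.cho / total };
}

const fromEmpty = prefixShares("", 20000, mulberry32(1));
const fromDaiin = prefixShares("daiin", 20000, mulberry32(2));
console.log(fromEmpty, fromDaiin);
```

Both runs should land close to 0.5 / 0.25 / 0.25, whether the seed is the empty string or a full word. Note that this only covers the prefix statistics; it says nothing about how the stems or the alphabet get there.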

Mauro: I like your idea of starting from an empty string. Neat.

As a Java/JavaScript programmer I object to calling it a null string. :-S

I suppose you mean an empty string.

seeds = [""];

I think that the modification rules should be harvested from the VM by a good program that can maximize their probability, not guessed. This should come after the study of source selection patterns, because having at least some credible source-target pairs will help a lot in harvesting the rules.
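As a very rough sketch of what "harvesting" could look like in the simplest case: if source-target pairs were known, single-character substitution rules could be tallied, and their relative frequencies are the maximum-likelihood estimate of the rule probabilities under a plain categorical model. The pairs below are hypothetical, and the function names are made up; a real program would also need insertions, deletions and position-dependent rules.

```javascript
// Returns "x>y" if source and target differ by exactly one substituted
// character, otherwise null (different lengths, identical, or 2+ changes).
function substitutionRule(source, target) {
  if (source.length !== target.length) return null;
  let rule = null;
  for (let i = 0; i < source.length; i++) {
    if (source[i] !== target[i]) {
      if (rule !== null) return null; // more than one difference
      rule = `${source[i]}>${target[i]}`;
    }
  }
  return rule;
}

// Relative frequency of each harvested rule over the usable pairs.
function estimateRuleProbabilities(pairs) {
  const counts = new Map();
  let total = 0;
  for (const [source, target] of pairs) {
    const rule = substitutionRule(source, target);
    if (rule === null) continue;
    counts.set(rule, (counts.get(rule) || 0) + 1);
    total++;
  }
  const probs = {};
  for (const [rule, n] of counts) probs[rule] = n / total;
  return probs;
}

// Hypothetical source-target pairs, for illustration only.
const toyPairs = [
  ["cheey", "cheeo"],
  ["chey", "cheo"],
  ["cheey", "sheey"],
  ["chey", "cheeody"], // skipped: not a single substitution
];
const ruleProbs = estimateRuleProbabilities(toyPairs);
console.log(ruleProbs); // "y>o" gets 2/3, "c>s" gets 1/3
```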

There is a lot to study and experiment with.

(03-07-2025, 09:17 AM)nablator Wrote: You have to start from something, so an initial set of "seed" words that come from nowhere is unavoidable.

Yes, and this is another weakness of the whole approach, which I am just trying to point out. I don't want to start a religious discussion, but it is a good parallel. Modern languages evolved from ancient ones (probably in steps and certainly with 'inter-breeding'), much like humans evolved from other species.

Is Voynichese more like the biblical approach where two persons were 'created from scratch' and the rest evolved from these two? Also here, two is not an adequate number, and I believe the exact family tree has not been described. But my knowledge here is limited, and the example does not need to be followed in more detail.

(03-07-2025, 09:17 AM)nablator Wrote: I don't know or care what Torsten Timm's app does.

I do agree with that to some extent. There is a nearly infinite number of ways in which a self-citation model could have been used. This is just one example.
However, its importance is that this particular implementation has been used as 'evidence' that this should be the way in which the MS text was created.

Anyway, I am interested to see more independent checks of the success rate of this method, so look forward to what you find.

(03-07-2025, 10:13 AM)nablator Wrote: A page of the VM doesn't have to be the first page of Voynichese ever produced. Several pages may have been produced in parallel from sources on pages that were not necessarily kept in the final product.

Again, you are adding complications which Torsten never included in his descriptions.
Does this mean that you have already seen that the basic approach does not work?

This is not meant as an ironic/sarcastic remark. It is an honest question.

(03-07-2025, 05:08 PM)Mauro Wrote: I.e., if you have a 'modification rule' which adds the prefix 'ok' 50% of the times a prefix is added, 'qok' 25% of the times and 'cho' for the remaining 25%, you'll end up with qok/ok/cho prefixes in those proportions whatever initialization string you started from (even starting from a null string).

Here, you initialise with a set of 'word chunks'. This is different, and interesting, but it remains an initialisation procedure.

We still have to end up with a closed alphabet. I think that we may safely assume that this was defined before the text writing (including any initialisation of the word modification procedure).

After having defined the alphabet, what would be the next step?

The problem is now moved to finding the correct set of word chunks that would do it.
I am not recommending really trying to do this, but thinking about it more may already point out the great difficulties this will cause.

Let's look at it in yet another way.

I assume people are familiar with the word connectivity trees that Torsten has generated.

This tree could very well be the result of a self-citation method.
Now let's look at the initialisation.

Are the 'seed' words among the words in this graph?
How many? Where are they?

One extreme would be just one or two.

Torsten's own generator, which should correspond to his best guess, used between five and ten, IIRC?

Should they be at the centres around the most frequent words?

Another extreme is that many or most of the words were set up using self-citation changes, before the writing was even started.

In my considered opinion - and I have always been outspoken about this - the self-citation method is not a reasonable 'solution' of the Voynich MS.

The argument that the initialisation is not a problem, it was just done 'somehow', for me is not satisfactory.
It may not be the biggest problem (because I think that the whole thing cannot work), but it is being consistently swept under the carpet ;-)
