The Voynich Ninja
The Textual Work of August Walla, mentally disabled artist - Printable Version




RE: The Textual Work of August Walla, mentally disabled artist - Koen G - 04-06-2020

If I read the graphs correctly, I would say they are similar in the sense that they both deviate from the standard. But they do so in different ways. Walla's amount of reduplication makes Voynichese look normal. And Voynichese's amount of quasi-reduplication makes Walla look normal. They stretch different axes.

Although if you were to add both values, they would probably both deviate in similar ways. But then we'd know that different phenomena underlie the similar values.


RE: The Textual Work of August Walla, mentally disabled artist - Ben Trovato - 07-06-2020

My thinking is that this makes the two texts quite different. For Walla's use of reduplication, we can assume three reasons:

1) In some (proven) cases, it substitutes for the comparative degree. This happens frequently in many languages, where "big big" simply means "bigger".

2) On a psychological level, repetition is a technique to ease tension. So Walla, when agitated, can use this in his art to cope with his mental disorder.

3) As Walla sees his work (at least in part) as spiritual practice, repetition here is the basis of ritualization. This is how chants, incantations, prayers etc. are made.

For (2) and (3), this is of course a brutally shortened summary that should (and could) be explained much more carefully. But my point is: all these goals are met better when using reduplication, than when using quasi-reduplication. So, unless we assume that the VMS author(s) were unable to perform perfect reduplication (and why should we assume that?), it seems that the underlying principle in text generation is really different here.

I'd love to hear your comments on this!


RE: The Textual Work of August Walla, mentally disabled artist - MarcoP - 08-06-2020

Thank you, Ben! Your observations about repetition in Walla's work are quite interesting. I think that, when the phenomenon is as frequent as in Walla, it likely has multiple causes, as you suggest. Quasi-reduplication in the VMS is fascinating. I believe that several cases are the result of a perfect reduplication that was altered by the kind of transformations that Emma has discussed, e.g. line-initial reduplication X.X can become yX.X because the first token in a line is subject to specific transformations.

After Jonas pointed out reduplication in the Finnish text Kalevala, I discussed quasi-reduplication in the poem with a Finnish native speaker. She (?) explained that consecutive sequences like "huitukoille haitukoille" or "soutelevat joutelevat" are the result of a process in which words are altered for expressive reasons.
Vilmiira Wrote: Finnish allows for making new words like this, especially for more descriptive words, like sounds (kilinä, kolina, kalina), ways of walking (lompsia, lampsia, kampsia), a kind of person (köppänä, käppänä) etc. ... new words can be created pretty freely
In the end, this Kalevala quasi-reduplication is quite similar to your point (1): it has the function of stressing the meaning of a word.

As a further exploration of points (2) and (3), I would love to understand more of how glossolalia works. I have read somewhere that words are often reduplicated in that phenomenon, but I have been unable to find a written corpus for computing actual statistics. Overall, I see point (1) as mostly semantic, while points (2) and (3) are more suggestive of meaninglessness.

In the attached diagram, I plot the % of pairs of consecutive words in which the first (X axis) or last (Y axis) characters are the same.
For instance, in this sequence the first two words contribute to the X axis measure and the last two words to the Y axis measure:
okchoy otchol chocthy oschy
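
As a rough illustration of this measure, here is a minimal Python sketch. It assumes that the percentage is taken over all pairs of adjacent words and that line breaks are ignored; neither detail is stated above, and the function name is made up for the example.

```python
def shared_edge_percentages(words):
    """Return (% of adjacent pairs sharing the first character,
               % of adjacent pairs sharing the last character)."""
    pairs = list(zip(words, words[1:]))  # adjacent word pairs
    if not pairs:
        return 0.0, 0.0
    same_first = sum(a[0] == b[0] for a, b in pairs)
    same_last = sum(a[-1] == b[-1] for a, b in pairs)
    return 100 * same_first / len(pairs), 100 * same_last / len(pairs)


# The four-word sequence above gives three adjacent pairs.
words = "okchoy otchol chocthy oschy".split()
x_pct, y_pct = shared_edge_percentages(words)
print(f"same first char: {x_pct:.1f}%  same last char: {y_pct:.1f}%")
```

For the example sequence, one pair out of three shares the initial character (okchoy otchol) and one shares the final character (chocthy oschy), so both values come out at one third.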

Red circles are texts by Walla. Blue squares are Voynichese. Purple squares are from Brian Cham's collection of texts in various languages.
It seems that Walla's texts are somehow intermediate between Voynichese and other texts. In addition to the various EVA files, I added the Voynich v101 transliteration, which uses many more symbols: as expected, this reduces the number of prefix/suffix repetitions, but not enough to make the VMS look "normal".
I think it's interesting that the Kalevala (N-FIN) is about as extreme as the VMS in this respect: in both sources, about half of the consecutive words share at least the same initial or final character. The main difference is that the Kalevala favours identical consecutive prefixes, while the VMS favours identical consecutive suffixes. A few lines from the Finnish poem:

selviä sinä ikänä
ilman luien lonsumatta,
leukojen leveämättä,
hammasten hajoamatta,
kielen keikkelehtämättä.

This kind of alliteration is likely linked to what you say in point (3) about chants and incantations: it has the function of making words easier to memorize. Also, it conforms to some form of aesthetics. I feel quite sure that some kind of arbitrary aesthetics like this is at play in the VMS (as it is in Walla). The question is whether there is also some meaningful grammar mixed with these patterns, or whether it is all just playing with sounds (or with glyphs, as Timm and Schinner believe).


RE: The Textual Work of August Walla, mentally disabled artist - Alin_J - 08-06-2020

Here's a graph of the same data as presented in previous posts, but with the addition of Walla's German text. I have only included the Voynich manuscript (VM), Walla's original text and Walla's German text (Ger). All upper-case characters were converted to lower-case for Walla's texts.
Walla's German text has 675 unique words out of 1144 words in total, making the type-token ratio 59%. The Voynich manuscript has a ratio of 24%, but it is a much longer text, and in any language words already used are repeated more often the longer a text runs (take, for instance, the ratio for just one sentence in English: it would probably be close to 100%).
Anyway, the Voynich manuscript still differs from Walla's texts in its comparatively high percentage of repeated two-word sequences, IMO (a number that is more similar to those of other 'normal' texts in other languages). But, as with the number of unique words, this could be an effect of the larger text size. Furthermore, it has a very small percentage of repeated sequences longer than that. The percentages of repeated triplets and quadruplets for Walla's German text are similar to the VM data (although the actual numbers of triplets and quadruplets are only 4 and 1, respectively, so there is not very high confidence in these data for Walla's German text, which, once again, is related to the limited size of the text).
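
The type-token ratio and its dependence on text length can be illustrated with a short Python sketch. The helper name and the toy sentence are assumptions made for the example; only the counts 675 and 1144 come from the post above.

```python
def type_token_ratio(tokens, limit=None):
    """Unique words (types) divided by total words (tokens).

    If `limit` is given, only the first `limit` tokens are used, so that
    texts of different lengths can be compared on an equal footing.
    """
    if limit is not None:
        tokens = tokens[:limit]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Figures quoted for Walla's German text: 675 types, 1144 tokens.
walla_german_ttr = 675 / 1144
print(f"Walla (German): {100 * walla_german_ttr:.0f}%")

# Toy illustration of the length effect (sentence invented for the example).
toy = "the cat sat on the mat and the dog sat on the rug".split()
print(type_token_ratio(toy, limit=4), type_token_ratio(toy))
```

The first four words of the toy sentence are all different (ratio 1.0), while the full thirteen words contain only eight types, which is the same effect that makes a ratio from a long text hard to compare with one from a short text.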
   


RE: The Textual Work of August Walla, mentally disabled artist - Ben Trovato - 08-06-2020

Jonas @Alin_J, would it be possible to compare the Walla texts to a sample of VMS text of identical size, using either the different VMS sections or just a random clip? In my naive understanding, this could show whether the assumptions of how text length affects the statistical numbers are correct, or couldn't it?


RE: The Textual Work of August Walla, mentally disabled artist - Alin_J - 08-06-2020

(08-06-2020, 06:45 PM)Ben Trovato Wrote: Jonas @Alin_J, would it be possible to compare the Walla texts to a sample of VMS text of identical size, using either the different VMS sections or just a random clip? In my naive understanding, this could show whether the assumptions of how text length affects the statistical numbers are correct, or couldn't it?


Yes, it could. The hardest part is deciding what part or parts to select. I think it would be wise to choose a couple of selections at random first.


RE: The Textual Work of August Walla, mentally disabled artist - Alin_J - 19-06-2020

I have analysed a couple of different sections of the Voynich manuscript which have approximately the same number of tokens as Walla's first (invented words) text in this thread. I have actually analysed a much larger number of Voynich sections, but they all showed statistics quite similar to these. Here are the results for the repeated n-tuple occurrences in these texts compared to Walla's first text and his German text, together with results from texts in natural languages from the Project Gutenberg website, all truncated to the first 1600 tokens. So, all of these texts are of comparable size measured in number of words (tokens). This allows a fairer comparison, both of repeated sequences and, above all, of type-token ratio (TTR). First shown is a new graph of the % of repeated n-tuples. Furthest to the left are Walla's two texts, then follow four different sections of the Voynich manuscript, and then the truncated text sections in six different languages. The TTR can be seen in the next figure (table).

   
   

Now the Voynich sections look more similar to the excerpts of the natural language texts, while Walla's invented text looks like the odd one out, judging from the much higher incidence of repeated 3- to 6-tuples compared to repeated pairs. However, all of the texts' TTRs are quite similar. The English text is quite repetitive, but I think the differences among languages here only reflect differences in writing style or type of text, as seen from earlier data. I think these results also point towards the natural-language hypothesis for the Voynich manuscript, if one allows for the possibility that different sections of the manuscript are written in very different language styles, subject matters or dialects, or by different authors or whatever, so that they share fewer repeated phrases than is common in larger coherent texts.
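
For readers who want to try a similar comparison, here is a hedged Python sketch of a repeated n-tuple count on a truncated text. The exact definition behind the graph is not given in this post, so the sketch assumes that the percentage is the share of n-tuple occurrences whose word sequence appears more than once in the truncated text; the function name is hypothetical.

```python
from collections import Counter


def repeated_ntuple_percentage(tokens, n, limit=1600):
    """Percentage of n-tuple occurrences belonging to an n-tuple that
    appears more than once within the first `limit` tokens.

    Assumption: this is one plausible reading of "% of repeated
    n-tuples"; the original analysis may have counted differently.
    """
    tokens = tokens[:limit]
    ntuples = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ntuples:
        return 0.0
    counts = Counter(ntuples)
    repeated = sum(c for c in counts.values() if c > 1)
    return 100 * repeated / len(ntuples)


# Toy token stream, invented for the example.
toy = "a b a b c a b".split()
pairs_pct = repeated_ntuple_percentage(toy, 2)
triples_pct = repeated_ntuple_percentage(toy, 3)
print(pairs_pct, triples_pct)
```

In the toy stream the pair "a b" occurs three times among six pairs, so half of the pair occurrences count as repeated, while no triple occurs twice. Truncating every text to the same `limit` before counting is what makes the percentages comparable across texts of different lengths.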


RE: The Textual Work of August Walla, mentally disabled artist - Alin_J - 14-08-2020

I also studied the intra-word relationships in Walla's invented words using Principal Component Analysis. For each word in the vocabulary, the transition from each character (from the first to the next-to-last) to the character that follows it was counted. These are the bigram frequencies of all the bigrams in the words: if a word is spelled "abc...", its first bigram is "ab", the next "bc", and so on. The bigram frequencies of all the different word types were summed and arranged in a matrix where a row stands for the first character in the bigram and a column for the second character. Next, all the matrix elements were normalized by dividing each element by the frequency sum of its row, resulting in a transition-probability matrix that expresses the probability of a transition from character i to character j, where i is the row index and j is the column index, normalized also as the sum-average of each word in the vocabulary. Only the characters with frequencies above 0.5% of the total character count were included in the matrix. For Walla's text, all upper-case characters were converted to lower-case before frequency counting and analysis. Principal component (PC) analysis of this matrix revealed the following score scatter plot:

   

The label of the first character in the bigram (observation point) is prefixed by "f-", while the labels of the second character in the bigram (original variable vectors, represented by green lines) are not prefixed at all and consist simply of the character. Each principal component is a linear combination of all the original variable vectors, and these vectors are hence also projected onto PC1 and PC2 in the plots.

The plot showed, as is usual for natural languages, a pretty clear separation of vowels and consonants, where most of the vowels show up on one side along PC1 and the consonants on the opposite side of the 0 value of PC1. Also typical for natural languages is the similar division of the vowel/consonant original variable vectors (green lines) along opposite sides of the PC1 0 value, where the vowel vectors are on the same side as the consonant observation points and vice versa. This is because natural language words often alternate between vowels and consonants. This is also present in Walla's words. There do not seem to be many additional groupings, and this also appears to be typical for natural languages. Sometimes, however, a natural language has a certain bigram that is very frequently used, for example "sh" in English, a digraph representing a single phoneme. These types of digraphs would often reveal themselves as outliers in a separate direction (along the variable line representing the second character in the bigram).

The first (most significant) two components had the following eigenvalues and % of variance explained:


PC Eigenvalue % variance
1  0.0251165  41.497
2  0.00849664 14.038


The % variance of the first PC is usually much larger than that of the second (and all the rest) in natural languages, because there PC1 is mostly governed by the vowel-consonant alternation, which results in much larger variance than any other common grammar rules, frequent syllable characteristics, etc. would produce, even taking into account a language's frequent use of certain digraphs. So, these values would also be quite typical for natural languages.
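
The construction of the transition-probability matrix described at the start of this post can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name is made up, the input is assumed to be the list of distinct word types, bigrams containing an excluded rare character are simply skipped, and the additional per-word "sum-average" normalization is not reproduced because its exact form is not spelled out. The PCA step itself is not shown.

```python
from collections import Counter


def transition_matrix(vocabulary, min_share=0.005):
    """Row-normalised character transition probabilities over word types.

    Returns (alphabet, matrix) where matrix[a][b] estimates P(next = b | current = a).
    """
    words = [w.lower() for w in vocabulary]  # fold case, as done for Walla's text
    char_counts = Counter(ch for w in words for ch in w)
    total = sum(char_counts.values())
    # Keep only characters above the frequency threshold (0.5% in the post).
    alphabet = sorted(ch for ch, c in char_counts.items() if c / total > min_share)
    kept = set(alphabet)

    bigrams = Counter()
    for w in words:
        for a, b in zip(w, w[1:]):  # "abc" -> ("a", "b"), ("b", "c")
            if a in kept and b in kept:  # assumption: drop bigrams with rare characters
                bigrams[(a, b)] += 1

    matrix = {}
    for a in alphabet:
        row_sum = sum(bigrams[(a, b)] for b in alphabet)
        matrix[a] = {b: (bigrams[(a, b)] / row_sum if row_sum else 0.0)
                     for b in alphabet}
    return alphabet, matrix


# Toy vocabulary, invented for the example.
alphabet, matrix = transition_matrix(["abab", "aba", "bac"])
print(alphabet, matrix["a"], matrix["b"])
```

In the toy vocabulary "a" is followed by "b" three times and by "c" once, so row "a" reads 0.75 for "b" and 0.25 for "c", while "b" is always followed by "a". Each row of such a matrix would be one observation point in the score plot, and each column one original variable.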

The following two plots and tables show the corresponding results for VMS languages A and B, in that order. Note that this was made using the v101 transliteration. Note also that in this case the observation point labels are prefixed with "_" (not "f-").

   

VMS Language A


PC Eigenvalue % variance
1  0.0475643  38.251
2  0.0274967  22.113


   

VMS Language B


PC Eigenvalue % variance
1  0.0477088  29.021
2  0.035866   21.817


For the VMS, PC1 does not dominate over PC2 the way it does for Walla's text. Particularly for language A, three fairly distinct directions are evident: one along the lower left quadrant, one in the upper part along PC2, and one more along PC1. The weaker dominance of PC1 is shown by the % variance values, which do not decrease as much going from PC1 to PC2.
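
To put a number on that difference, the PC1-to-PC2 ratio of explained variance can be computed directly from the three tables in this post. The snippet below only does that arithmetic; the dictionary and variable names are chosen for the example.

```python
# % variance of (PC1, PC2), copied from the tables above.
pct_variance = {
    "Walla": (41.497, 14.038),
    "VMS A": (38.251, 22.113),
    "VMS B": (29.021, 21.817),
}

# How many times more variance PC1 explains than PC2.
ratios = {name: pc1 / pc2 for name, (pc1, pc2) in pct_variance.items()}

for name, ratio in ratios.items():
    print(f"{name}: PC1/PC2 = {ratio:.2f}")
```

The ratio is close to 3 for Walla's text but only about 1.7 for language A and about 1.3 for language B.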

The possibilities for more in-depth analysis and discussion are many, but for now I don't have anything more conclusive than this. But in answer to the original question in this thread, whether the same analyses would give similar results for the VMS and for Walla's text: for this analysis they do not seem to be very similar, in that Walla's text behaves more like natural language than the VMS does.


RE: The Textual Work of August Walla, mentally disabled artist - Davidsch - 09-09-2020

You seem to forget that if the ms contains blessing formulas: one for the living and one for the deceased, in acronym form,
there is no resulting sensible text to be found and this quest is an eternal movement, to be repeated over and over...


RE: The Textual Work of August Walla, mentally disabled artist - Alin_J - 09-09-2020

(09-09-2020, 04:27 PM)Davidsch Wrote: You seem to forget that if the ms contains blessing formulas: one for the living and one for the deceased, in acronym form,
there is no resulting sensible text to be found and this quest is an eternal movement, to be repeated over and over...

Well, for that I have no argument... but for what it's worth, I sometimes find the following quote uplifting:

“Whatever you do will be insignificant, but it is very important that you do it.” - M. Gandhi