The Voynich Ninja
labels as words - Printable Version


labels as words - MarcoP - 11-09-2016

In another thread, Rene wrote:

(09-09-2016, 10:22 AM)ReneZ Wrote: Going back many years, there used to be a statement about the word spaces along the following argument:

given that the label words should be individual words, and they primarily tend to appear in the main text separated by spaces, it seems that the spaces are real.

The problem is that I have never seen anyone really demonstrating this.
This would be a bit of work, but not too difficult to do and it might really tell us something.

All labels can be matched with a version of the main text from which all spaces have been removed.

One can then see how many labels are not found at all, and for the remainder whether there is indeed a preference for them to reoccur separated by spaces.

Depending on the result, the situation could of course not be entirely clear.


I tried making this count, as suggested by Rene.
I used Takeshi Takahashi's transcription.


504 different labels consisting of a single word of length 4 or more

235 labels are perfectly matched by words

21 labels perfectly match a sequence of words (i.e. they appear in the text split into multiple consecutive words; the beginning and end of the label still correspond to spaces)
examples:
otaraldy .otar.aldy.
okeeodal .okeeo.dal.
dolol .dol.ol.

37 labels occur as part of longer words

39 labels are found removing all spaces (line and page breaks included)
examples:
yoraly sheedy.oraly
tsholdy dtshol.dytal
chokaro chokar.okcho

172 labels cannot be found
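
The five categories above could be reproduced with a short script. What follows is only a minimal sketch in Python, not the code actually used for these counts. It assumes the main text is available as a list of lines with words separated by '.', as in the examples above; the function name classify_label and the category names are made up for illustration. It also assumes that sequences of words are searched within a single line only, which may differ from how the counts above were made.

```python
def classify_label(label, lines):
    """Assign a label to one of the five categories discussed above.

    `lines` is a list of transcription lines, words separated by '.'.
    """
    words = [w for line in lines for w in line.split(".") if w]

    # 1. The label occurs as a complete word.
    if label in words:
        return "word"

    # 2. The label equals a run of consecutive whole words in one line,
    #    so its beginning and end both coincide with spaces.
    for line in lines:
        ws = [w for w in line.split(".") if w]
        for i in range(len(ws)):
            joined = ""
            for w in ws[i:]:
                joined += w
                if joined == label:
                    return "word_sequence"
                if len(joined) >= len(label):
                    break

    # 3. The label occurs inside a single longer word.
    if any(label in w for w in words):
        return "part_of_word"

    # 4. The label appears only when all spaces and line breaks
    #    are ignored.
    if label in "".join(words):
        return "spaces_removed"

    # 5. No match at all.
    return "not_found"
```

Running this over every label and tallying the returned categories would give counts comparable to the ones listed here, provided the transcription and the choice of labels are the same.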


47% of the labels simply and directly occur as words.

Labels matching a sequence of words (4%) and labels matching part of a word (7%) add up to 11% of ambiguous cases, which could be explained either linguistically or by the arbitrary manipulation of spaces between words.

8% of the labels match the text only if spaces are completely ignored.

34% of the labels cannot be found in any case.
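
As a quick check, these percentages follow directly from the raw counts. The snippet below is a small Python sketch; the dictionary keys are just shorthand names chosen here for the five categories.

```python
counts = {
    "word": 235,
    "word_sequence": 21,
    "part_of_word": 37,
    "spaces_removed": 39,
    "not_found": 172,
}
total = sum(counts.values())  # all single-word labels of length 4 or more

# Share of each category, rounded to a whole percent.
percent = {name: round(100 * n / total) for name, n in counts.items()}

for name, p in percent.items():
    print(f"{name:15s} {counts[name]:4d} {p:3d}%")
```

The five counts add up to the 504 labels, so no label is counted twice or left out.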


47% versus 8% seems to me a clear preference for labels to reoccur separated by spaces. I attach lists of the labels split into the different categories discussed above.


I hope that someone else will independently carry out a similar exercise or at least check my results, so that we can be sure I did not accidentally introduce any major error.

As always, many thanks to Job for Voynichese.com. It really makes checking and discussing these things easier!


RE: labels as words - Koen G - 11-09-2016

Thanks, Marco, it's good to get some data on this matter. Intuitively I'd say this distribution is one we could expect if there was no artificial manipulation of spaces.

The fact that 172 labels cannot be found is something I find very interesting, and in a way reassuring that the labels are actually meaningful. Add to that the 8% of the "all spaces ignored" group, and that's 42% label-only words. That's more than I thought.


RE: labels as words - Emma May Smith - 11-09-2016

I think this is good work. I'm happy that labels aren't as divorced from the text as sometimes stated.


RE: labels as words - ReneZ - 11-09-2016

Many thanks Marco,

this is indeed a very helpful result.
If 172 labels are not found in the text, then 332 are. Of these, about 71% are found as a single word separated by spaces, and this is indeed a very high percentage, suggesting that the spaces as we see them in the main text are intentional.
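
The arithmetic behind this share, as a small Python sketch using the counts from the first post (the variable names are only for illustration):

```python
total_labels = 504
not_found = 172
single_word = 235

found = total_labels - not_found          # labels that occur somewhere in the text
share_single_word = single_word / found   # fraction of those found as one whole word

print(found, f"{share_single_word:.1%}")
```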

This isn't water-tight proof of course, but very strong evidence.

One has to keep in mind that the list of labels in the MS does not follow Zipf's law at all. While the label words aren't exactly unique, they are "almost unique". Among the labels, there are only a few repetitions.

The label words *could* be repetitions of words in the main text, after the text was written with arbitrary space insertion, but then one should not expect such a flat word frequency distribution in the labels.
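
One way to put a number on how flat a frequency distribution is, sketched in Python under the assumption that the labels and the main text are each available as a plain list of word tokens. The two helper functions are hypothetical and are not part of any existing tool. A Zipf-like text has a few very frequent words followed by a long tail, while an "almost unique" list has frequencies that are nearly all 1.

```python
from collections import Counter

def rank_frequency(tokens):
    """Return the word frequencies sorted from most to least frequent."""
    return [n for _, n in Counter(tokens).most_common()]

def share_of_singletons(tokens):
    """Fraction of distinct words that occur exactly once."""
    freqs = rank_frequency(tokens)
    return sum(1 for n in freqs if n == 1) / len(freqs)
```

Comparing rank_frequency for the label list with rank_frequency for the main text would show how much flatter the label distribution is.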


RE: labels as words - Anton - 11-09-2016

What I would add to this is the quite high density of labels in the folio space. Job's tool has a special request to "plot" labels.


RE: labels as words - MarcoP - 11-09-2016

(11-09-2016, 12:58 PM)Anton Wrote: What I would add to this is the quite high density of labels in the folio space. Job's tool has a special request to "plot" labels.

Oh my god, thank you Anton!!!


RE: labels as words - Anton - 11-09-2016

This should be used with caution though, because the approach to "label" is quite formal - e.g. all [link] stuff is also considered "labels". But, nonetheless, the density observed suggests highly conspected nature of the text.


RE: labels as words - MarcoP - 11-09-2016

(11-09-2016, 01:08 PM)Anton Wrote: This should be used with caution though, because the approach to "label" is quite formal - e.g. all [link] stuff is also considered "labels". But, nonetheless, the density observed suggests highly conspected nature of the text.

I understand that the words in green are labels and those in blue are "words matching labels". Since [link] features many single-letter words which also appear as labels in [link], the page is mostly blue. This seems sensible to me.


RE: labels as words - Koen G - 11-09-2016

Marco, I wonder if there would be a difference in these statistics if the labels are considered section by section. Is that something you can check easily?


RE: labels as words - MarcoP - 11-09-2016

(11-09-2016, 01:55 PM)Koen Gh. Wrote: Marco, I wonder if there would be a difference in these statistics if the labels are considered section by section. Is that something you can check easily?

I was thinking of something simpler: collecting matches in the different sections for the 235 labels that exactly match words.
Then we could have a "scatter matrix" showing where labels re-occur in the text. Anyway, this requires some work and I am not sure when I will have time to do this.
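
A possible shape for such a matrix, as a rough Python sketch. It assumes that both the labels and the main-text words come tagged with the section they appear in; the function name section_matrix and the (section, word) input format are assumptions made here, not something the transcription file provides directly.

```python
from collections import Counter

def section_matrix(labels, text_words):
    """Count, per (label section, text section) pair, how often a label
    re-occurs as a whole word in the main text.

    labels:     iterable of (section, label) pairs
    text_words: iterable of (section, word) pairs from the main text
    """
    # A label may be used in more than one section.
    sections_by_label = {}
    for section, label in labels:
        sections_by_label.setdefault(label, set()).add(section)

    matrix = Counter()
    for text_section, word in text_words:
        for label_section in sections_by_label.get(word, ()):
            matrix[(label_section, text_section)] += 1
    return matrix
```

Each entry matrix[(a, b)] is then the number of main-text words in section b that exactly match a label from section a. Large values off the diagonal would mean that labels from one section mostly re-occur in the text of another.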