Basically the labels don't obey Zipf's law. Most of them only occur once. But that's actually normal since lists and other kinds of data are known not to obey Zipf's law.
If you think about the labels in a visual dictionary, for instance, nearly all of them are only going to occur once. I would guess that labels accompanying illustrations in scientific textbooks would have a similar distribution.
That the main text obeys Zipf's law and the labels do not is not going to be easy to explain if you think it's something other than a meaningful text. Maybe someone in the early 15th century had already discovered Zipf's law and consciously created a nonsense text to emulate these properties?
(25-02-2017, 09:21 AM)Sam G Wrote: That the main text obeys Zipf's law and the labels do not is not going to be easy to explain if you think it's something other than a meaningful text. Maybe someone in the early 15th century had already discovered Zipf's law and consciously created a nonsense text to emulate these properties?
Actually, Zipf's law does not prove meaningful text at all. Please have a look at this paper: (link not available)
(25-02-2017, 02:15 PM)Anton Wrote: (25-02-2017, 09:21 AM)Sam G Wrote: That the main text obeys Zipf's law and the labels do not is not going to be easy to explain if you think it's something other than a meaningful text. Maybe someone in the early 15th century had already discovered Zipf's law and consciously created a nonsense text to emulate these properties?
Actually, Zipf's law does not prove meaningful text at all. Please have a look at this paper: (link not available)
Please read my statement again. If a random process that generates texts obeying Zipf's law was used to generate the VMS text, then why don't the labels also obey Zipf's law?
You could say that a different process was used for the labels. But why would someone do that? Was someone in the 15th century aware of Zipf's law and where it should and should not apply in meaningful texts?
(25-02-2017, 02:15 PM)Anton Wrote: (25-02-2017, 09:21 AM)Sam G Wrote: That the main text obeys Zipf's law and the labels do not is not going to be easy to explain if you think it's something other than a meaningful text. Maybe someone in the early 15th century had already discovered Zipf's law and consciously created a nonsense text to emulate these properties?
Actually, Zipf's law does not prove meaningful text at all. Please have a look at this paper: (link not available)
Indeed, Zipf's law describes the distribution of numbers measuring a complex system. For instance, firm sizes follow Zipf's law [see (link not available)].
What Zipf's law tells us is that the text generation mechanism for the labels is in some way less complex. The observation that more unique and rare words are used as labels allows a similar conclusion [see (link not available)].
(25-02-2017, 02:50 PM)Sam G Wrote: (25-02-2017, 02:15 PM)Anton Wrote: (25-02-2017, 09:21 AM)Sam G Wrote: That the main text obeys Zipf's law and the labels do not is not going to be easy to explain if you think it's something other than a meaningful text. Maybe someone in the early 15th century had already discovered Zipf's law and consciously created a nonsense text to emulate these properties?
Actually, Zipf's law does not prove meaningful text at all. Please have a look at this paper: (link not available)
Please read my statement again. If a random process that generates texts obeying Zipf's law was used to generate the VMS text, then why don't the labels also obey Zipf's law?
I'm not saying that a random process was used to generate the VMS text. I'm saying that the fact that the text behaves according to Zipf's law says nothing about whether it is meaningful or meaningless. The fact that the labels do not obey Zipf's law adds nothing either. Why would they, to begin with?
By the way, is there any reference for the label frequency vs. rank data? And which labels were taken into account: only those that accompany images, or all standalone vords and characters?
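For anyone who wants to check the frequency vs. rank behaviour on their own word lists, here is a minimal sketch of how such a comparison could be set up. It is only an illustration: the function names `rank_frequency` and `loglog_slope` are made up for this example, the input lists are toy data rather than VMS counts, and a least-squares slope on a log-log plot is a rough diagnostic, not a rigorous test for Zipf's law.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return the word counts sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two distinct words. A slope near -1 is the classic
    Zipf pattern; a slope near 0 means all words are about equally frequent.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy data (hypothetical, NOT taken from the VMS):
zipf_like = ["a"] * 6 + ["b"] * 3 + ["c"] * 2   # counts 6, 3, 2 = 6/rank
all_unique = ["w1", "w2", "w3", "w4"]           # every word occurs once

print(rank_frequency(zipf_like))
print(loglog_slope(rank_frequency(zipf_like)))   # close to -1
print(loglog_slope(rank_frequency(all_unique)))  # flat
```

Running the same two functions separately on a main-text word list and a label word list would give the two curves being compared in this thread. Which labels go into the second list is exactly the open question above.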
The interesting part is not that the main text follows Zipf's law, but that the main text does while the labels do not.
This means that the main text and the labels were not generated or written by the same process.
Or that they are not the same kind of thing.
I split the discussion about the Zipf law into a separate thread.
(25-02-2017, 04:11 PM)ReneZ Wrote: The interesting part is not that the main text follows Zipf law, but that the main text does while the labels do not.
This means that not the same process was followed for generating or writing the main text and for the labels.
I don't think that this tells us much about the process. A simple example provides one of the possible explanations of the behaviour.
1) I depict various objects
2) I label all those objects with arbitrary meaningless words ("descriptions"). Since all objects are different, descriptions will also be different.
Note 1. At this point, all labels have the same frequency count, namely 1.
3) I mention the depicted objects in the text (providing "explanations", for example). Each object is mentioned once.
Note 2. At this point, all labels still have the same frequency count, now 2.
So there's no Zipf's curve, but there's no meaning at the same time.
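The counting in steps 1) to 3) can be sketched in a few lines. This is only an illustration of the thought experiment: the helper names `make_labels` and `label_counts`, the four-letter alphabet, and the number of objects are arbitrary choices for the example, not anything taken from the VMS.

```python
from collections import Counter
from itertools import islice, product


def make_labels(n, alphabet="okta", length=4):
    """Return n distinct arbitrary strings, standing in for meaningless labels."""
    return ["".join(p) for p in islice(product(alphabet, repeat=length), n)]


def label_counts(n_objects=20):
    """Count label occurrences after labelling and after one mention in the text."""
    labels = make_labels(n_objects)               # step 2: one unique label per object
    after_labelling = Counter(labels)
    text_mentions = list(labels)                  # step 3: each object mentioned once
    after_text = Counter(labels + text_mentions)
    return after_labelling, after_text


after_labelling, after_text = label_counts()
print(set(after_labelling.values()))  # {1}
print(set(after_text.values()))       # {2}
```

In both states every label has the same count, so the rank-frequency curve is flat even though the labels carry no meaning.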
(25-02-2017, 04:11 PM)ReneZ Wrote: The interesting part is not that the main text follows Zipf law, but that the main text does while the labels do not.
This means that not the same process was followed for generating or writing the main text and for the labels.
This is simply not correct.
What we see in the VMS is actually perfectly normal for a meaningful text. If you have a book containing a main body of text along with illustrations that contain labels, then:
1) The words in the main text will obey Zipf's law
2) The words in the labels will not obey Zipf's law
Which of course is what we see in the VMS.
So from the standpoint that the VMS is a meaningful text, there is nothing unusual to account for at all, and certainly no need to postulate a separate process for creating the labels.
Actually, proposing that the labels were created using a separate process is problematic, because it would seem to require that whoever created the text understood that the frequency distributions of words should be different in the main text and in the labels, and that the creator consciously sought to emulate these properties. Either that or it would have to be some extremely unlikely coincidence that two processes were used and that they just so happened to reproduce these features.
This is in addition to the point that the words in the labels have the same structure as words in the main text, and some words occur in both the labels and the main text, etc. All the available evidence indicates that the words in the labels and the main text are part of the same language or system.
(25-02-2017, 04:11 PM)ReneZ Wrote: The interesting part is not that the main text follows Zipf law, but that the main text does while the labels do not.
This means that not the same process was followed for generating or writing the main text and for the labels.
The words used as labels are not alien to the VMS. The majority of words used as labels also occur within the main text. See for instance the three instances of [otalar]: it is used twice in the main text and once as a label. But if the difference is in the statistics, there is also the possibility that the same generating process behaves differently for labels.
From the text we know that line-initial words and line-final words behave differently from the rest of the text. We also know that the first line in a paragraph contains more gallows glyphs. Therefore I don't find it surprising that the VMS behaves differently for words used in a different context. This is especially true on pages where the labels are arranged in circular form.
It is no problem to find an explanation for the different statistics for labels. If the text contains language, it is no surprise if names used as labels behave differently than the normal text. If the text was generated by using a cipher, a conceivable explanation is that the labels are meaningless whereas the main text encodes information.
If the text was generated by the auto-copying method a possible explanation is that the words written in circular form were harder to copy. Therefore the scribe probably used these groups less frequently as a source for the generation of new text [Timm 2014 p. 17].
It is possible to back up this argument with some observations. An interesting observation is that some identical labels are used in the astronomical and in the herbal section. These labels are [okary], [oky], [otalam], [okeoly], [otaly], [otoky], [otaldy], [otal], [otal], [ykeody], [okeody], [okeos], [otory], [okody] and [oran]. One would at least not expect several stars or star constellations to be named after plants or parts of plants. Moreover, some similarly spelled labels occur together in both sections [Timm 2014 p. 9].
It is interesting that on some pages it seems that some consecutive labels are generated by replacing glyphs with similarly shaped ones. Page <f70v2> provides an example of this:
<f70v2.S1.1> otaral
<f70v2.S1.2> otalar
<f70v2.S1.3> otalam
<f70v2.S1.4> dolaram
<f70v2.S1.5> okaram
<f70v2.S1.8> okaldal
There is another interesting observation for page <f70v2>. In line <f70v2.R3.1> a word [oteotey] and a word [oteoteotsho] occur. The repetition of [ote] within both words is probably not a coincidence since the first word occurs only four times and the second word is unique. Furthermore, on page <f71r> a similar duplication pattern occurred using [oke]. In this case, the unique words [okeoky] and [okeokeokeody] were used. These observations raise the question as to whether the generating mechanism for labels is less complex and therefore easier to determine [Timm 2014 p. 9].
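One rough way to put a number on how similar the consecutive labels on <f70v2> are is to compute the edit distance between neighbours in the list. The sketch below uses plain Levenshtein distance on the transliterated strings quoted above. Note the assumptions: it works on transliteration characters, so it says nothing about whether the swapped glyphs are similar in shape, and the function name `levenshtein` is just a label for this example.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Labels from <f70v2> as transliterated in this thread, in the order listed.
labels = ["otaral", "otalar", "otalam", "dolaram", "okaram", "okaldal"]

for first, second in zip(labels, labels[1:]):
    print(first, second, levenshtein(first, second))
```

Small distances between neighbours would be in line with the idea that each label was derived from the previous one, but a proper test would need a comparison against label sequences from other pages or against shuffled orderings.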