DONJCH > 17-11-2019, 03:36 PM
(17-11-2019, 02:21 PM)MarcoP Wrote: While I understand the usefulness of percentages in removing the spurious correlation induced by varying page-lengths, I second Torsten's question. I am also interested in understanding what can be the meaning of overall correlation being greater than that of the individual sections.
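To make the point about page lengths concrete, here is a minimal sketch. The page sizes and word counts are made up for illustration and are not taken from any transcription; `pearson` is a small helper written for this example. Both raw counts grow with page length, so they look strongly correlated, while the percentages need not be.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up pages: (total words on page, count of word A, count of word B).
pages = [(50, 2, 1), (100, 3, 5), (200, 9, 7), (400, 15, 17)]

counts_a = [a for _, a, _ in pages]
counts_b = [b for _, _, b in pages]

# Percentages divide out the page length.
pct_a = [100 * a / n for n, a, _ in pages]
pct_b = [100 * b / n for n, _, b in pages]

r_counts = pearson(counts_a, counts_b)
r_pct = pearson(pct_a, pct_b)
print(f"raw counts:  r = {r_counts:.2f}")
print(f"percentages: r = {r_pct:.2f}")
```

With these invented numbers the raw counts correlate strongly simply because longer pages hold more of everything, and the effect disappears once the counts are turned into percentages.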
MarcoP > 17-11-2019, 04:31 PM
(17-11-2019, 03:25 PM)ReneZ Wrote:
(17-11-2019, 02:21 PM)MarcoP Wrote: The plot I get for chedy / shedy seems to me reasonably close (though not identical) to the one computed by Rene on the basis of Nablator's spreadsheet. I have included {0,0} pages. The red line is y=x.
I think that nablator's spreadsheet was based on the Takeshi Takahashi transcription. If you used another one, the difference would be explained.
nablator > 17-11-2019, 07:05 PM
(17-11-2019, 04:31 PM)MarcoP Wrote: I am sorry, I forgot to mention that I used Takahashi's transcription. There must be some other reason for the differences.
Maybe if you used the regex \b to search for word boundaries, it treats '?' as a word separator.
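A small sketch of how that could inflate a count, using Python's `re` module. The sample line is invented, and the conventions it uses ('.' between words, '?' for an unreadable glyph) are assumptions for illustration only, not a statement about how any particular transcription file is laid out.

```python
import re

# Invented transcription line: '.' separates words and '?' marks an
# unreadable glyph (assumed conventions, for illustration only).
line = "chedy.?chedy.qokchedy.chedy?.shedy"

# \b matches between a word character and a non-word character, so both
# '.' and '?' act as boundaries: "?chedy" and "chedy?" are counted as hits.
regex_hits = re.findall(r"\bchedy\b", line)

# Alternative: split on the separator and compare whole tokens, so words
# containing an uncertain glyph are not counted as exact matches.
tokens = line.split(".")
exact_hits = [t for t in tokens if t == "chedy"]

print(len(regex_hits), len(exact_hits))
```

Whether words with an unreadable glyph should count is a judgment call, but two tools that decide differently will produce slightly different plots from the same transcription.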
ReneZ > 17-11-2019, 08:50 PM
(17-11-2019, 03:36 PM)DONJCH Wrote: Torsten is correct in stating that the calculation of R assumes that the data is normally distributed.
Torsten > 17-11-2019, 09:52 PM
(17-11-2019, 08:50 PM)ReneZ Wrote:
(17-11-2019, 03:36 PM)DONJCH Wrote: Torsten is correct in stating that the calculation of R assumes that the data is normally distributed.
farmerjohn > 17-11-2019, 09:58 PM
(17-11-2019, 03:36 PM)DONJCH Wrote:
(17-11-2019, 02:21 PM)MarcoP Wrote: While I understand the usefulness of percentages in removing the spurious correlation induced by varying page-lengths, I second Torsten's question. I am also interested in understanding what can be the meaning of overall correlation being greater than that of the individual sections.
The reason overall correlation is greater is that the line is defined mostly by the extreme points from the biological section.
Torsten is correct in stating that the calculation of R assumes that the data is normally distributed. Of course, that rarely happens, which is one reason why this is not a good statistic to calculate in these very non-normal circumstances.
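The effect of a few extreme points on the overall correlation can be sketched with made-up numbers. Nothing below comes from the manuscript: `other` and `bio` are placeholder groups of per-page percentages, with `bio` standing in for a section where both words are much more frequent, and `pearson` is a helper written for this example.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def r_of(points):
    """Correlation between the x and y coordinates of a list of pairs."""
    xs, ys = zip(*points)
    return pearson(xs, ys)

# Invented per-page percentages (x = one word, y = the other).
other = [(0.0, 0.2), (0.2, 0.0), (0.1, 0.3), (0.3, 0.1)]
bio = [(3.0, 2.0), (2.0, 3.0), (4.0, 2.5), (2.5, 4.0)]

r_other = r_of(other)
r_bio = r_of(bio)
r_all = r_of(other + bio)  # pooled over both groups

print(f"other: {r_other:.2f}  bio: {r_bio:.2f}  pooled: {r_all:.2f}")
```

In this toy data the correlation inside each group is negative, yet the pooled value is strongly positive, because the fitted line mostly connects the cluster near the origin to the cluster of high values. The pooled figure then says more about the difference between sections than about the relationship within any one of them.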
MarcoP > 18-11-2019, 10:32 AM
(17-11-2019, 09:58 PM)farmerjohn Wrote: The important thing, however, is that the correlation value by itself says little. Is a value of 0.5 high or low? To judge that, one must also calculate the correlation between other types of words. And anyone who concludes from the correlation values that EVA-sh and EVA-ch are the same must be very cunning in explaining the almost equally high correlation coefficient for EVA-ok and EVA-qok.
(14-11-2019, 03:39 PM)Davidsch Wrote: (In my bowl apples and pears are both fruit. chedy= shedy)
(15-11-2019, 12:48 PM)Davidsch Wrote: If we compare the level of experience and intelligence of the author of the VMS to an equally medieval text, you will find that such written texts, at least in the occult genre, are very inconsistent and many words are used inconsistently.
You will very frequently find in the same text that every possible variation is used: color, colour, kolor, collor, kolore, and such.
(16-11-2019, 06:51 AM)ReneZ Wrote: The first thing that comes to mind is that this is an indication of arbitrariness. Like pulling words arbitrarily out of a hat. Or, as if the difference between ch and Sh is meaningless. This way of thinking causes other problems, e.g. in the area of entropy, and one would be pushed into the direction that Voynich words are not complete words, but verbose renditions of something smaller.
'Verbose' of course implies that there is a meaningless component in the text, but it is not all meaningless.
The difference between ch and Sh could also be 'trivial' rather than meaningless. What do I mean by that?
If the text is an encoding of some plain text, then the plain text was of course a handwritten text. The curl on top of the Sh could be a representation of a serif. Just one more idea....
ReneZ > 18-11-2019, 03:54 PM
Torsten > 18-11-2019, 07:49 PM
(18-11-2019, 03:54 PM)ReneZ Wrote: This shows that there is a general trend, but still significant variability between the different sections.