Working my way to a semantic word analysis
The Voynich Ninja (https://www.voynich.ninja), Forum: Voynich Research, Analysis of the text, Thread: /thread-5128.html
RE: Working my way to a semantic word analysis - ReneZ - 13-12-2025

(12-12-2025, 01:51 PM)mxv456 Wrote:
> The whole analysis is based on IT2a-n.txt from [link]. Is this the correct choice? As far as I understand, it's a version of the TT transcription, but I don't know what the state of the art is.

A more recent, more accurate and more complete transliteration is this file: RF1b-er.txt

It is in the same format, so you should in principle be able to repeat the same analysis by just swapping the file. At the highest level the two are very similar, so I would not expect to see any significant difference. At the same time, comparing the results from the two files gives a good indication of the error (or uncertainty) in the input data. Some statistics are far more sensitive to these changes/errors, for example those based on word counts.

RE: Working my way to a semantic word analysis - MarcoP - 13-12-2025

(12-12-2025, 01:51 PM)mxv456 Wrote:
> What did surprise me is that the -edy ending is also among the most common endings in the Currier A script. From what I read and saw, I assumed that it is almost exclusive to Currier B. Does that mean that I (a) simply misunderstood or (b) chose the wrong page split between Currier A and B?

Hi Marvin,
if I understand correctly, you are treating the last section of the manuscript (the starred paragraphs, Quire 20, f103-116) as Currier A, while it should be Currier B. I guess this could be the reason for your high counts for -edy in Currier A?

RE: Working my way to a semantic word analysis - mxv456 - 13-12-2025

Thanks everybody! I didn't actually expect to get any response within 24h, I'm amazed how active the forum is! And thanks for all the input.

Turns out I did have the Currier separation wrong. I'm super happy I posted the base stats first before basing the rest of the analysis on this data.
So below are the recomputed plots with
- RF1b-er transcription
- Currier A/B indication taken directly from the transcription file

Thanks for all the hints! It does change the plots meaningfully and resolves some of my confusion from the initial post.

1. The word length plots largely stay the same.

[Plot: word lengths]

2. The Zipf distribution actually changes quite a bit! Currier B is now much more clearly an outlier.

[Plot: Zipf distribution]

3. The bigram heatmap also changes visibly, mainly for the pairs "ed" and "dy", and the two patterns now look less similar. They are still much more similar to each other than the reference languages are to each other.

[Plot: bigram heatmap]

4. Word end trigram: This is the big one, the -edy ending is now virtually exclusive to Currier B. So yes, I had a pretty big mismatch. Great that you helped me find it early!

[Plot: word-end trigrams]

I applied the corrections on the [link], but I'll leave the original versions in the initial post. (I added a note that they are incorrect.)

This does change the semantic analysis a bit. I'm doing my best to make the results visually intuitive to understand. I'll post them in a new thread when I'm ready.
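For anyone who wants to repeat the word-end trigram check, here is a minimal sketch of how the counts could be split by Currier language. It is not the code behind the plots above. It assumes that page header lines carry the Currier language as a `$L=A` / `$L=B` variable and that text lines start with a locus tag such as `<f1r.1,@P0>`, with `.` and `,` as word separators. The function names `words_by_language` and `ending_counts` are made up for illustration, and `SAMPLE` is a toy excerpt, not real transliteration data.

```python
import re
from collections import Counter, defaultdict

# Assumed line layouts (see the hedges above):
#   page header:  <f1r>      <! $Q=A $L=A>
#   text line:    <f1r.1,@P0>   word.word,word
PAGE_HEADER = re.compile(r"^<(f\w+)>\s*<!(.*)>")
LOCUS = re.compile(r"^<(f\w+)\.[^>]*>\s*(.*)$")
LANG = re.compile(r"\$L=([AB])")


def words_by_language(lines):
    """Group words by the Currier language declared in each page header."""
    lang_of_page = {}
    words = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        header = PAGE_HEADER.match(line)
        if header:
            found = LANG.search(header.group(2))
            if found:
                lang_of_page[header.group(1)] = found.group(1)
            continue
        locus = LOCUS.match(line)
        if not locus:
            continue
        lang = lang_of_page.get(locus.group(1))
        if lang is None:
            continue  # page without a language label
        # Turn inline <...> markers into separators so words do not merge.
        text = re.sub(r"<[^>]*>", ".", locus.group(2))
        for word in re.split(r"[.,\s]+", text):
            # Keep purely alphabetic tokens; this drops uncertain readings
            # such as tokens containing '?' or bracketed alternatives.
            if word and word.isalpha():
                words[lang].append(word)
    return words


def ending_counts(words, n=3):
    """Count the last n characters of every word that has at least n."""
    return Counter(w[-n:] for w in words if len(w) >= n)


SAMPLE = [
    "# toy excerpt, not real transliteration data",
    "<f1r>      <! $Q=A $L=A>",
    "<f1r.1,@P0>   daiin.chol.chor,daiin",
    "<f75r>     <! $Q=M $L=B>",
    "<f75r.1,@P0>  qokeedy.chedy.shedy<->qokain",
]

grouped = words_by_language(SAMPLE)
for lang in sorted(grouped):
    counts = ending_counts(grouped[lang])
    share = counts["edy"] / sum(counts.values())
    print(lang, counts.most_common(3), f"edy share = {share:.2f}")
```

To run it on a real file, pass the opened file object instead of `SAMPLE`. Reading the language label from the file itself, rather than from a hand-made page split, avoids the kind of A/B mismatch discussed above.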