Torsten > 08-07-2021, 09:52 PM
Quote:This article presents the results of investigations using topic modeling of the Voynich Manuscript (Beinecke MS408). Topic modeling is a set of computational methods which are used to identify clusters of subjects within text. We use latent Dirichlet allocation, latent semantic analysis, and nonnegative matrix factorization to cluster Voynich pages into 'topics'. We then compare the topics derived from the computational models to clusters derived from the Voynich illustrations and from paleographic analysis. We find that computationally derived clusters match closely to a conjunction of scribe and subject matter (as per the illustrations), providing further evidence that the Voynich Manuscript contains meaningful text.
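For readers who have not met these methods, here is a minimal sketch of one of the three named in the abstract, nonnegative matrix factorization, using the classic multiplicative update rule in plain Python. It is not the authors' implementation: the count matrix is an invented toy (rows are pages, columns are word types), and the helper names `nmf`, `matmul`, `transpose`, and `frobenius_error` are made up for this illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared reconstruction error between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor V (pages x words) into W (pages x topics) and H (topics x words)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    start_error = frobenius_error(v, w, h)
    for _ in range(iterations):
        # Multiplicative update for H, keeping every entry nonnegative.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # Multiplicative update for W.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h, start_error

# Invented toy counts, NOT real transcription data:
# pages 0-1 share one vocabulary, pages 2-3 another.
counts = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
]
w, h, start_error = nmf(counts, k=2)
final_error = frobenius_error(counts, w, h)
# For each page, the topic with the largest weight in W.
dominant_topic = [max(range(2), key=lambda a: row[a]) for row in w]
```

Each row of `w` gives a page's weight on each topic, and `dominant_topic` picks the largest one per page. Comparing such page clusters against the illustration-based sections and the scribal hands is the comparison the abstract describes.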
Torsten > 08-07-2021, 10:32 PM
MarcoP > 09-07-2021, 04:50 PM
Quote:On the basis of illustrations which accompany the text, it is customary to divide the manuscript into five sections. 1. botanical/herbal 2. astrological/astronomical 3. balneological 4. pharmaceutical 5. starred paragraphs/“recipes”
Quote:Although we cannot read the Voynich text, topic modeling is still applicable if we assume that Voynich words have a consistent form-meaning correspondence across the manuscript. That is, we need to assume that 8ain on [folio] is the SAME word as 8ain on f7v.
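The assumption in that last quote is what a bag-of-words representation encodes: tokens with identical spelling are counted as the same word, wherever they occur. A minimal sketch follows. The page names and token lists are invented for illustration and are not transcription data, and `bag_of_words` and `cosine` are hypothetical helper names.

```python
from collections import Counter
from math import sqrt

# Invented toy pages, NOT real transcription data.
pages = {
    "page_a": "daiin chol chor daiin shol".split(),
    "page_b": "chol daiin chor chol".split(),
    "page_c": "qokedy chedy shedy qokedy chedy".split(),
}

def bag_of_words(tokens):
    """Count tokens by exact spelling; word order and position are discarded."""
    return Counter(tokens)

def cosine(c1, c2):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c1[w] * c2[w] for w in c1.keys() & c2.keys())
    n1 = sqrt(sum(v * v for v in c1.values()))
    n2 = sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

bags = {name: bag_of_words(toks) for name, toks in pages.items()}
vocab = sorted(set().union(*bags.values()))
# Page-by-word count matrix: the kind of input that LDA, LSA, and NMF work on.
matrix = {name: [bag[w] for w in vocab] for name, bag in bags.items()}

sim_ab = cosine(bags["page_a"], bags["page_b"])  # pages sharing vocabulary
sim_ac = cosine(bags["page_a"], bags["page_c"])  # pages with no shared words
```

If the same spelling stood for different words on different pages, the columns of `matrix` would mix unrelated items and any clustering built on it would lose its meaning, which is why the paper has to state the assumption.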
byatan > 12-07-2021, 06:08 PM
(09-07-2021, 04:50 PM)MarcoP Wrote: BTW, also the fact that word structure is fairly consistent across sections while word frequencies vary so much is very puzzling.
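One way to put a number on "word frequencies vary so much" is to compare the relative-frequency distributions of two sections directly. The sketch below uses total variation distance, which is my choice of measure and not something taken from the paper. The section names and token lists are invented toy data.

```python
from collections import Counter

# Invented toy samples, NOT real transcription data.
sections = {
    "section_1": "daiin chol chor daiin chol shol".split(),
    "section_2": "chedy qokedy daiin chedy shedy qokedy".split(),
}

def relative_freq(tokens):
    """Map each word type to its share of all tokens in the sample."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def total_variation(p, q):
    """Half the summed absolute difference: 0 = identical, 1 = disjoint."""
    words = p.keys() | q.keys()
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in words)

freqs = {name: relative_freq(toks) for name, toks in sections.items()}
tv = total_variation(freqs["section_1"], freqs["section_2"])
# Word types that occur in both samples.
shared = sorted(freqs["section_1"].keys() & freqs["section_2"].keys())
```

A distance near 1 means the two sections barely share their frequent words, while a distance near 0 means they use the same words at the same rates.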
Mark Knowles > 12-07-2021, 06:18 PM
(12-07-2021, 06:08 PM)byatan Wrote: (09-07-2021, 04:50 PM)MarcoP Wrote: BTW, also the fact that word structure is fairly consistent across sections while word frequencies vary so much is very puzzling.
Assuming that the ms was created with the use of external aids like tables, wheels, lists, external text, or something else, this could be explained by some portion of these aids or their content having been changed or replaced at points during the creation for various reasons. Has there been much discussion that different sections could have been enciphered differently?
MarcoP > 12-07-2021, 06:42 PM
(12-07-2021, 06:08 PM)byatan Wrote: (09-07-2021, 04:50 PM)MarcoP Wrote: BTW, also the fact that word structure is fairly consistent across sections while word frequencies vary so much is very puzzling.
Assuming that the ms was created with the use of external aids like tables, wheels, lists, external text, or something else, this could be explained by some portion of these aids or their content having been changed or replaced at points during the creation for various reasons. Has there been much discussion that different sections could have been enciphered differently?
Rene Wrote:In the case of the original table-and-grille approach, a relatively easy explanation presents itself, namely that different tables were used for the pages in the two languages. However, this would not easily explain that almost all A-language words also tend to appear in the B language.
Torsten > 13-07-2021, 06:26 PM
Barbrey > 14-07-2021, 03:14 AM
MarcoP > 14-07-2021, 06:44 PM
(14-07-2021, 03:14 AM)Barbrey Wrote: Hi Torsten and Marco, thank you for illuminating some of the paper to people like me who are statistically challenged. And of course to the researchers: this paper took time and effort and is appreciated!
Can I ask what might seem an obviously answered question, because I frankly don't understand 90% of the study, and it's a question I had even prior to it. But I am not a linguist at all.
Is it possible that there actually are two dialects being encoded here? Latin, for instance, seems to have been modified by every language group in Europe. Doesn't the difference between A and B, for instance, seem to argue for encoding two different original works (or written by different dialect-speaking scribes) in closely related 'dialects'? And could linguists perhaps derive some clues from the very frequent 89, or EVA dy, in B as opposed to A, that seems to be a common ending in one but not the other?
As an aside but continuing this subject, I do think I remember reading as well that the 40 construction in the VMS is much more frequent in some sections than others. Some observers have thought this might translate to "qu". If so, might the difference between the two "dialects" be the difference between classical Latin and slightly more vulgar Latin that was using quod, quia, etc. for clauses very frequently? Both Latins were used at the same time.
I don't want to isolate Latin, btw, just using it as an example.
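The two observations above (an ending that is common in B but not in A, and a word-initial construction that is more frequent in some sections) are both claims about affix rates, which are easy to count once a transcription is split into groups. Here is a minimal sketch with invented toy tokens, not real transcription data. The `dy` ending is the one named in the post. Using `qo` as the word-initial construction is my assumption for illustration, and `affix_rate` is a hypothetical helper name.

```python
# Invented toy token lists standing in for A and B pages,
# NOT real transcription data.
groups = {
    "A": "daiin chol chor shol daiin".split(),
    "B": "chedy qokedy shedy daiin qokain".split(),
}

def affix_rate(tokens, prefix="", suffix=""):
    """Fraction of tokens that start with `prefix` and end with `suffix`."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens
               if t.startswith(prefix) and t.endswith(suffix))
    return hits / len(tokens)

# Share of tokens ending in "dy" per group.
dy_rate = {g: affix_rate(toks, suffix="dy") for g, toks in groups.items()}
# Share of tokens starting with "qo" per group (assumed prefix, see above).
qo_rate = {g: affix_rate(toks, prefix="qo") for g, toks in groups.items()}
```

Counts like these only show that an affix is distributed unevenly. They do not show whether the cause is a dialect shift, a different source text, or a change in how the text was produced.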
Quote:I guess I find it somewhat dismaying that a different code or cipher might have been used throughout the manuscript; I'd rather believe in a slight shift of dialect in the same language! But is what I've said here wishful thinking or a possibility?
Barbrey > 15-07-2021, 01:28 PM