quimqu > Yesterday, 11:35 PM
RadioFM > 7 hours ago
Jorge_Stolfi > 5 hours ago
(Yesterday, 11:35 PM) quimqu Wrote: I’ve been working on the Voynich Manuscript using machine learning and data science.
Quote: each word (token) becomes a node, and we draw edges between them whenever they occur near each other in the text, using a sliding window of 5 tokens. Once you have the graph, you can study things like community structure, modularity, and assortativity.
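For readers who want to try the construction described in the quote, here is a minimal sketch in plain Python. It is not quimqu's actual code. It assumes that "a sliding window of 5 tokens" means each token is linked to the 4 tokens that follow it, that edges are undirected and unweighted, and that the text is already split into words. The function name `cooccurrence_graph` and the sample words are made up for illustration.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=5):
    """Return {word: set of neighbours} for words that share a window.

    Assumption: a window of size `window` covers a token and the
    (window - 1) tokens after it; edges are undirected and unweighted.
    """
    adjacency = defaultdict(set)
    for i, word in enumerate(tokens):
        adjacency[word]  # touching the key makes every word a node
        for other in tokens[i + 1 : i + window]:
            if other != word:  # skip self-loops on repeated words
                adjacency[word].add(other)
                adjacency[other].add(word)
    return dict(adjacency)


# Toy input: placeholder words, not a real transcription.
tokens = "daiin chol chor daiin shol qokeedy chol".split()
graph = cooccurrence_graph(tokens, window=5)
for word, neighbours in sorted(graph.items()):
    print(word, sorted(neighbours))
```

Community structure, modularity and assortativity would then be computed on this graph with a graph library; that step is left out here because the thread does not say which tool was used.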
Quote: The same pattern appears when comparing character entropy (morphological freedom) [...] It might reflect a controlled or encoded version of natural language, or simply a writing system with its own conventions. It’s also interesting that the same language can produce very different results depending on the type of text
Quote: English (Culpepper)
Rafal > 3 hours ago
RadioFM > 3 hours ago
(3 hours ago) Rafal Wrote: So my question is - are these actually good measures if text is meaningful or gibberish?
Not really, because meaningful text can display a whole gamut of these metrics depending on the language, encoding, content and writing styles, it seems.
(3 hours ago) Rafal Wrote: You say (like most people) that Voynich has a non-random structure. But on both graphs it appears really close to Torsten Timm text.
I think they meant 'random' as in 'unpredictable'. TT's text is gibberish but not fully random in that sense; it's more mechanical.
quimqu > 43 minutes ago
(7 hours ago) RadioFM Wrote: It'd be interesting to plot the probability distribution of the degree of each node. I think it could paint a bigger picture than cond. entropy alone. I expect Voynichese to have a more multimodal-like distribution than the other corpora, but it's just a wild guess.
Great job by the way
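As a rough illustration of the degree distribution RadioFM is asking about, the sketch below turns a word graph into the fraction of nodes at each degree. It is only an assumption about how one might compute it, not anyone's posted method. The graph is taken to be a plain `{word: set of neighbours}` dictionary, and both `degree_distribution` and the toy graph are hypothetical.

```python
from collections import Counter


def degree_distribution(adjacency):
    """Map each degree k to the fraction of nodes with exactly k neighbours."""
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    counts = Counter(degrees)
    total = len(degrees)
    return {k: counts[k] / total for k in sorted(counts)}


# Hypothetical toy graph: "a" has two neighbours, "d" is isolated.
toy = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}
dist = degree_distribution(toy)
for k, p in dist.items():
    print(f"degree {k}: {p:.2f}")
```

Plotting `dist` for each corpus on shared axes, probably log-scaled, would show whether one of them has several peaks, which is what a multimodal distribution would look like.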
quimqu > 33 minutes ago
(5 hours ago) Jorge_Stolfi Wrote: The name is "peper" not "pepper". I know because I made the same mistake...
quimqu > 27 minutes ago
(3 hours ago) Rafal Wrote: So my question is - are these actually good measures if text is meaningful or gibberish? What do they actually measure?