This seems to make sense to me. However, I know nothing about this subject, so I offer it to those who do.
It seems to me that once you look at the VM diagram from a "what's up?" perspective, it makes sense (given the excerpts below).
That is to say, we are looking up at the sky, at what we can and cannot see. It's about as simple a reading of a VM image as I've found, so undoubtedly it is incorrect.
My understanding is that by 14XX this was still the working knowledge of most people who were not astrologers.
Source: Isidore of Seville (c. 560–636), "The cosmos and its parts" (in translation).
"Portals"
The sky has two portals: the East and the West, for the sun enters through one portal and withdraws through the other.
"Unknown paths beyond east/west portals"
lii. The path of the sun (De itinere solis) The sun, when it rises, holds a path through the south. Afterward, it goes to the west and plunges itself into the Ocean, and it travels unknown paths under the earth, and once again runs back to the east.
"Air and then Sky (above)"
Sometimes the word ‘sky’ is used for the air, where winds and clouds and storms and whirlwinds arise. Lucretius (cf. On the Nature of Things 4.133): The sky (caelum), which is called air (aer). And the Psalm (78:2; 103:12, Vulgate) refers to “fowls of the sky (caelum),” when it is clear that birds fly in the air; out of habit we also call this air, ‘sky.’ Thus when we ask whether it is fair or overcast we sometimes say, “How is the air?” and sometimes “How is the sky?”
"Sky is where the sun and moon are"
God embellished the heaven and filled it with bright light – that is, he adorned it with the sun and the gleaming orb of the moon, and the glorious constellations of glittering stars. [In a different way, it is named from engraving (caelare) the superior bodies.] 2. It is called οὐρανός in Greek, after the term ὁρᾶσθαι, that is, ‘seeing,’ because the air is transparent and clearer for seeing. In Sacred Scripture the sky is called the firmament (firmamentum), because it is secured (firmare) by the course of the stars and by fixed and immutable laws.
"the stars are in the ether"
The ether (aether) is the place where the stars are, and signifies that fire which is separated high above from the entire world. Of course, ether is itself an element, but aethra (i.e. another word for ether) is the radiance of ether; it is a Greek word. 2. The sphere (sphaera) of the sky is so named because it has a round shape in appearance. But anything of such a shape is called a sphaera by the Greeks from its roundness, such as the balls that children play with.
*It's not entirely clear to me what the firmament means in this passage, so I opted for "The ether", which seemed very clear, though I suspect the two mean much the same.
New blog post on my site: today I'm highlighting the pattern in the second circle of the wheel on 57v, a pattern of 9 symbols. At first glance it seems to repeat four times, but actually the first two patterns use the “one leg, one loop” glyph as the ninth character, and the second two patterns use the “one leg, two loops” glyph as the ninth character.
I'm sure this is not new information, but I'm having trouble finding relevant research. I've seen the pattern referenced in a chart by M.E. D'Imperio, as shown in my blog post, though her chart doesn't mention the change.
I would love any help finding threads or other work about this pattern, and about the others: the vertical ones on 49v and 66r, and the one in the star of 69r.
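The comparison described in this post (four runs of nine symbols that differ only in the ninth) can be checked mechanically once the ring is transcribed. A minimal sketch, assuming the ring is available as a flat list of symbols; the letters and digits below are hypothetical placeholders, not a real transcription of 57v:

```python
def split_blocks(ring, size):
    """Cut a ring sequence into consecutive blocks of `size` symbols."""
    if len(ring) % size != 0:
        raise ValueError("ring length is not a multiple of the block size")
    return [ring[i:i + size] for i in range(0, len(ring), size)]


def differing_positions(blocks):
    """Return the 1-based positions at which the blocks do not all agree."""
    return [
        pos + 1
        for pos, column in enumerate(zip(*blocks))
        if len(set(column)) > 1
    ]


# Hypothetical stand-in for the ring: A-H are the first eight symbols,
# "1" marks the one-loop glyph and "2" the two-loop glyph in ninth place.
ring = list("ABCDEFGH1" * 2 + "ABCDEFGH2" * 2)
blocks = split_blocks(ring, 9)

print(differing_positions(blocks))     # [9]
print([block[8] for block in blocks])  # ['1', '1', '2', '2']
```

The same two helpers would work for the vertical sequences if those are transcribed as lists in the same way.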
Previous studies have examined the topics of the Voynich manuscript by looking at the distribution of words or patterns across its pages. However, to the best of my knowledge, there has not yet been a fully automated topic modelling analysis that compares multiple algorithms.
In this work, I present an automated page-by-page topic analysis of the Voynich manuscript using three different models:
LDA (Latent Dirichlet Allocation) – which finds 5 topics
BERTopic – which finds 5 topics
NMF (Non-negative Matrix Factorization) – which finds 3 topics
The goal is to see how each model clusters the pages, whether patterns align with the manuscript’s illustrated sections (Botanical, Astronomical, Biological, Cosmological, Pharmacological, Recipes), and to observe if there are topics that dominate certain sections.
METHODOLOGY: HOW THE MODELS DETECT TOPICS
LDA (Latent Dirichlet Allocation)
LDA treats each page as a bag of words and assumes:
Each page is a mixture of topics (in different proportions)
Each topic is a distribution of words
Through repeated statistical assignments, LDA discovers which words tend to appear together, grouping them into topics. Pages are then assigned the topic (or mix of topics) that best matches their word patterns.
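The post does not say which LDA implementation or inference method was used. As an illustration of "repeated statistical assignments", here is a minimal sketch of collapsed Gibbs sampling, one common way to fit LDA (libraries often use variational inference instead). The function name, the hyperparameters `alpha` and `beta`, and the toy tokens are all assumptions for the example, not taken from the analysis:

```python
import random
from collections import Counter


def lda_gibbs(pages, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return one topic mixture per page."""
    rng = random.Random(seed)
    vocab_size = len({w for page in pages for w in page})
    # z[d][i] is the topic currently assigned to word i of page d.
    z = [[rng.randrange(n_topics) for _ in page] for page in pages]
    page_topic = [[0] * n_topics for _ in pages]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    for d, page in enumerate(pages):
        for w, k in zip(page, z[d]):
            page_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(n_iter):
        for d, page in enumerate(pages):
            for i, w in enumerate(page):
                # Remove the word's current assignment from the counts.
                k = z[d][i]
                page_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample: topics common on this page AND fond of this word win.
                weights = [
                    (page_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                page_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed proportion of each topic on each page (each row sums to 1).
    return [
        [(c + alpha) / (len(page) + n_topics * alpha) for c in counts]
        for page, counts in zip(pages, page_topic)
    ]


# Toy "pages" made of placeholder tokens, not Voynich words.
pages = [
    "aa ab aa ac ab".split(),
    "ab aa ac aa".split(),
    "xx xy xz xx xy".split(),
    "xy xx xz xz".split(),
]
mixtures = lda_gibbs(pages, n_topics=2)
for mix in mixtures:
    print([round(p, 2) for p in mix])
```

Each printed row is one page's topic mixture; the dominant topic of a page is the index of the largest value in its row.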
BERTopic
BERTopic uses transformer-based embeddings (BERT) to represent each page as a high-dimensional vector capturing semantic similarity. It then applies dimensionality reduction (UMAP) and clustering (HDBSCAN) to group similar pages. Finally, it extracts the most representative words for each cluster to define topics. This allows for more nuanced grouping, even with subtle vocabulary differences.
NMF (Non-negative Matrix Factorization)
NMF uses a term-frequency matrix (TF-IDF weighted) and factorizes it into two smaller matrices:
One representing topics as weighted combinations of words
One representing pages as weighted combinations of topics
Because all values are non-negative, each page’s topic weights are easy to interpret. The dominant topic for a page is the one with the highest weight.
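To make the two-matrix picture concrete, here is a small pure-Python sketch of NMF using multiplicative updates. This is an assumption about the method: the post does not say which solver or library was used, and the TF-IDF weighting is taken as already applied to `X`. The names `W` (pages by topics) and `H` (topics by words) and the toy matrix are hypothetical:

```python
import random


def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def frob_error(X, W, H):
    """Squared Frobenius distance between X and the product W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def nmf(X, n_topics, n_iter=500, seed=0, eps=1e-9):
    """Factor X (pages x words) into W (pages x topics) and H (topics x words)."""
    rng = random.Random(seed)
    W = [[rng.random() + eps for _ in range(n_topics)] for _ in X]
    H = [[rng.random() + eps for _ in X[0]] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


def dominant_topic(weights):
    """Index of the largest topic weight for one page."""
    return max(range(len(weights)), key=weights.__getitem__)


# Toy matrix: pages 0-1 use only words 0-1, pages 2-3 use only words 2-3.
X = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
W, H = nmf(X, n_topics=2)
print([dominant_topic(row) for row in W])
```

Because the updates only multiply non-negative numbers, every entry of `W` and `H` stays non-negative, which is what makes the per-page weights directly comparable.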
RESULTS
Each model produces two complementary visualizations:
Timeline Plot (Top)
Horizontal axis (X) = Ordered folios of the Voynich manuscript.
Vertical axis (Y) = Dominant topic assigned to each folio (numbered according to the model).
Color = Illustrated section of the folio (Botanical, Astronomical, etc.).
Marker shape = Topic number.
How to interpret: Clusters of the same marker in the same color band indicate topic consistency within a section. Sudden changes of marker shape within a section may suggest variation or a change of topic within that section.
Heatmap (Bottom)
Rows (Y) = Illustrated sections of the manuscript.
Columns (X) = Detected topics from the model.
Cell value (and color) = Proportion of pages in that section assigned to each topic (normalized so each row sums to 1).
How to interpret: A bright yellow cell (value near 1) means that almost all folios in that section belong to a single topic → high homogeneity. A row with several colored cells means that section contains multiple topics → possible internal diversity or mixed content.
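The heatmap values described above are row-normalized counts. A minimal sketch of that calculation, assuming the input is a list of (section, dominant topic) pairs, one per page; the function name and the sample assignments are made up for illustration and are not the actual results:

```python
from collections import Counter, defaultdict


def section_topic_proportions(assignments):
    """Share of each section's pages assigned to each topic (rows sum to 1)."""
    counts = defaultdict(Counter)
    for section, topic in assignments:
        counts[section][topic] += 1
    return {
        section: {
            topic: n / sum(topic_counts.values())
            for topic, n in sorted(topic_counts.items())
        }
        for section, topic_counts in counts.items()
    }


# Hypothetical page assignments.
assignments = [
    ("Biological", 1), ("Biological", 1), ("Biological", 1),
    ("Botanical", 1), ("Botanical", 2), ("Botanical", 2), ("Botanical", 3),
]
table = section_topic_proportions(assignments)
print(table["Biological"])  # {1: 1.0}
print(table["Botanical"])   # {1: 0.25, 2: 0.5, 3: 0.25}
```

A section like the hypothetical "Biological" row here would show a single bright cell, while "Botanical" would spread its color over three cells.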
Note 1: the topic numbers 1, 2, 3, 4, 5 do not refer to the same topics across models. Each number is just a label for a topic within one model.
Note 2: the ordered folios in the timeline diagram should be read as "pages". E.g., page 48 is f25v and page 49 is the folio side that follows it (page 1 is f1r).
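The note does not spell out how pages were counted. One reading that reproduces the f25v example is to count only the folio sides that are present, skipping the missing f12. That reading is an assumption, as is the absence of foldouts in this range, so treat the following as a sketch only:

```python
def folio_sides(folio_numbers):
    """List folio sides (recto, then verso) in page order."""
    return [f"f{n}{side}" for n in folio_numbers for side in ("r", "v")]


# Assumption: folios 1-30 are present except f12, and none is a foldout.
present = [n for n in range(1, 31) if n != 12]
sides = folio_sides(present)


def page_to_folio(page):
    """Map a 1-based page number to its folio side."""
    return sides[page - 1]


print(page_to_folio(1))   # f1r
print(page_to_folio(48))  # f25v
```

For the full manuscript the list of present folios and foldout panels would have to come from the transcription actually used.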
LDA (5 Topics)
Botanical and Pharmacological sections include all 5 topics, suggesting vocabulary variety and perhaps multiple subthemes.
Astronomical section covers 4 topics (all except Topic 4; Topic 5 appears on only one page).
Biological and cosmological sections are entirely assigned to Topic 1 – extremely homogeneous.
Recipes section is mostly Topic 1, with some pages in Topic 3.
BERTopic (5 Topics)
Botanical is dominated by Topic 2 (but touches all other topics to some degree).
Astronomical is mostly Topic 4 with some Topic 1.
Biological is entirely Topic 3.
Cosmological uses Topics 3 and 1.
Pharmacological touches all 5 topics.
Recipes uses all topics except Topic 2 (striking, since Topic 2 dominates Botanical and Pharmacological) and leans toward Topics 3 and 1.
NMF (3 Topics)
Botanical spans all 3 topics, but is dominated by Topic 2 up to around page 48 (folio 24) before alternating among the three.
Astronomical is entirely Topic 3.
Biological is entirely Topic 1.
Cosmological is entirely Topic 3 (like Astronomical).
Pharmacological alternates between Topics 2 and 3.
Recipes alternates between Topics 1 and 3.
MY THOUGHTS
Across all three models, the Biological and Cosmological sections appear linguistically homogeneous (each model consistently assigns a single dominant topic to them, with at most two topics for the Cosmological section in the BERTopic model). This could reflect genuine stylistic uniformity or simply the models’ sensitivity to repeated patterns in the text. But what if the Cosmological section is in fact closely linked to the Biological section?
Botanical and Pharmacological sections consistently appear more heterogeneous:
LDA and BERTopic detect a wider spread of topics here, possibly due to multiple subsections or thematic variation within the illustrations.
Recipes are particularly interesting: they often share topics with Botanical or Pharmacological sections in LDA/BERTopic, but show different topic distributions in NMF.
A striking observation in BERTopic:
Topic 2 dominates Botanical and Pharmacological, but is absent from Recipes.
This might suggest a shift in terminology or a distinct textual purpose for the Recipes section despite visual similarity to Pharmacological folios.
In NMF:
Topic 3 covers Astronomical and Cosmological sections entirely.
This may mean that NMF sees these two illustrated sections as linguistically similar — perhaps due to formulaic text or repeated glyph patterns.
On August 14, Curator Dot Porter will bring out a facsimile of Beinecke Library MS 408, aka The Voynich Manuscript. This is a realistic facsimile that looks exactly like the real thing. Lisa Fagin Davis will also be there.
August 14, 2025, 12:00pm - 1:00pm (5pm GMT)
There is a series called “Coffee with a Codex” on YouTube. Perhaps this event will also be published there?
Hey, here is my take on the script. You can download the paper via the link, and please comment with feedback.
Here is a short description: This groundbreaking research proposes a novel decoding of the Voynich Manuscript using African-rooted linguistic models, with comparative analysis drawn from Nilotic, Berber, and West African languages. The paper constructs a plausible phonetic grammar, vocabulary, and morphological syntax based on visual glyph comparisons and contextual semantics from key folios (e.g., f1r, f26r, f34v). The authors integrate:
Cross-cultural glyph analysis with Tifinagh, Ge'ez, and Ajami scripts
Botanical and zodiacal references for semantic grounding
Translation of a full ritual from a zodiac folio (f72v2)
Paragraph-level translations of several pages
A full glossary, grammar rules, and semantic map of repeated glyphs
The paper challenges Eurocentric assumptions about authorship and origin, proposing instead a highly plausible North or West African scholar (possibly Tuareg or Moorish) trained in medicinal botany and astronomy as the original author. A new decoding method is introduced that fuses phonetic patterns, grammatical consistency, herbal-zodiac semantics, and African linguistic parallels. The Appendix provides a standalone linguistic model (Appendix A) and an extended glossary to support independent verification and further research.
I am creating a new thread, as my previous thread was more focused on the images in the zodiac pages. But as I noted there, I find it interesting that a lot of the labels have "o" as the first glyph. We know that this is a common prefix, but I still think it is more frequent than usual, so I did some analysis (which has probably been done before, but....)
I use the "H" transcription for this. There are 299 labels; this is because Gemini is missing 1 label.
According to my script, 76.9% (230) of the labels start with "o". But this varies among zodiac signs: Aries has 29 "o" and 1 "c", while Sagittarius has only 15 "o". I wondered whether this changes over time, but I see no clear indication of that, even though the share does drop more or less gradually.
The script I wrote output:
aries : 30 lines
Starting letters: c:1, o:29
------------------------------------------------------------
Total zodiacs: 10
Total lines : 299
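The script itself is not shown in the post. A minimal sketch of how such a first-glyph count could be done, assuming the labels are already grouped per zodiac sign as transliterated strings; the function names and the sample labels are hypothetical and not taken from the transcription:

```python
from collections import Counter


def first_glyph_counts(labels_by_sign):
    """Count the first transliteration character of each label, per sign."""
    return {
        sign: Counter(label[0] for label in labels if label)
        for sign, labels in labels_by_sign.items()
    }


def share_starting_with(counts_by_sign, glyph):
    """Fraction of all labels whose first character is `glyph`."""
    total = sum(sum(c.values()) for c in counts_by_sign.values())
    hits = sum(c[glyph] for c in counts_by_sign.values())
    return hits / total


# Made-up labels for illustration only.
labels_by_sign = {
    "aries": ["otaly", "okary", "chdy"],
    "taurus": ["otal", "sary"],
}
counts = first_glyph_counts(labels_by_sign)
for sign, c in counts.items():
    print(f"{sign} : {sum(c.values())} lines")
    print("Starting letters: " + ", ".join(f"{g}:{n}" for g, n in sorted(c.items())))
print(f"{share_starting_with(counts, 'o'):.1%} start with o")

# Check of the figure quoted in the post: 230 of 299 labels.
print(f"{230 / 299:.1%}")  # 76.9%
```

Note that this counts the first character of the transliteration, so a two-character glyph group is counted under its first letter, as in the "c:1" line of the output above.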
I have also created a graph to show this visually.
As I said in my previous post, I am just looking at stuff. Feel free to guide me in any direction if anyone wants me to look at something specific. Right now I am still at the stage where this is curious, but I have no "theory" of why it is the way it is.
New here, but I have been interested in the VM for some time. Today f70v1 got my attention. It is an interesting piece of "art".
What I am wondering about is that the numbers in the "rings" are quite unusual(?): 10 in the outer ring and 5 in the inner ring. To my knowledge, 10 and 5 are not often used in mythological/mystic traditions. I know that this has probably been discussed before (I did a search but did not find anything related; sorry if I missed something I shouldn't have).
From Wikipedia:
"f70v–f73v: The astrological series of diagrams in the astronomical section has the names of ten of the months (from March to December) written in Latin script, with spelling suggestive of the medieval languages of France, northwest Italy, or the Iberian Peninsula."
I find this peculiar.
The original Roman calendar also had 10 months, but only 304 days in the year, and "these 304 days were followed by an unnamed 50-day winter period". That matches the March-to-December cycle.
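As a quick arithmetic check of the 304-day figure: the month lengths below are the ones traditionally given for the ten-month Roman calendar. They are supplied here for illustration and do not come from the quoted source.

```python
# Month lengths traditionally given for the ten-month Roman calendar.
months = {
    "Martius": 31, "Aprilis": 30, "Maius": 31, "Iunius": 30,
    "Quintilis": 31, "Sextilis": 30, "September": 30, "October": 31,
    "November": 30, "December": 30,
}

named_days = sum(months.values())
print(len(months), named_days)  # 10 304
print(named_days + 50)          # 354, adding the quoted 50-day winter period
```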
Much later, and so unrelated, during the French Revolution the year was divided into weeks of 10 days (but still 12 months of 30 days).
Has anyone found a good explanation for the 10/5 rings? I am no expert in astrology, and I guess the answer lies there, since as I understand it this is the zodiac sign Aries?
Also, I noticed that the labels beside the drawings all start with "o", very often "ot" but sometimes with other gallows glyphs. I have always thought of the prefix/suffix system as a case system, or something similar. My first thought was that the "o" could be something like an instrumental case or a locative case. Of course this is pure speculation, and the instrumental case is not very common in European languages to my knowledge. The locative existed in Latin, so it would probably have been known. That said, I don't think the case system necessarily follows known cases; it would not be difficult to create a new one. I have thought about a language that does not distinguish verbs from nouns, but differentiates words only by case:
a runner -> running
But again, this part is pure speculation. I'm just throwing everything out there.
I would appreciate any feedback, especially about the 10/5 stuff. But also about every word starting with "o".
Has there ever been any discussion of the fact that the astrology pages seem to have no depiction of the wandering stars (planets)? The planets were the most significant of the celestial objects for astrologers. Their unusual passage through the sky has always fascinated astrologers, who could, by interpreting the planets' positions, foretell events and the destinies of individual people. Such interpretation of the signs was a valued skill at that time, and rather than being a heretical subject it was actually lawfully permitted in the hands of authorised practitioners. So it does seem a bit odd that the VMS, which seems to present itself as a compendium of the secret sciences, should not mention them.
This suggests that the authors either were unfamiliar with the subject (in which case they might have been equally unfamiliar with the other topics, which would make the content of the VMS empty and valueless) or did not care much about making the VMS accurate (likely under the artificial construction hypothesis).