| Fractional word frequencies per section and type |
|
Posted by: Jorge_Stolfi - 06-12-2025, 09:11 AM - Forum: Analysis of the text
- No Replies
|
 |
I prepared a set of files with the fractional word counts per section and text type. These files list all words that would appear under any interpretation of the dubious space markers (commas, ","). See more below. The files are in the attached file st_files.zip.
st_files.zip (Size: 161.57 KB / Downloads: 6)
The files are named "{SEC}.{TYP}.evt" and "{SEC}.{TYP}.wff".
{SEC} is a major VMS section: "hea" (Herbal A), "heb" (Herbal B), "bio", "cos", "zod", "pha", "str" (Starred Parags). And also "unk" for pages of unknown nature, such as [link] and f86v6.
{TYP} is a type of text: "parags", "labels", "trings" (text in rings), "titles" (short phrases next to parags), "radios" (radial lines in circular diagrams), and "glyphs" (isolated characters). Note that this classification is somewhat different from the one used by Rene and others; for instance, the short paragraphs in the sectors of f67r2 are here classified as "parags" too.
The file {SEC}.{TYP}.evt contains all the lines of section {SEC} and type {TYP}, in a simplified IVTFF/EVMT format, like "<f75r.47;U> sal.okeedy". The transcription used is based on a recent one of my own, from the Beinecke 2014 scans (4162 lines, code ";U"), completed with a version derived from release "RF1b-e.txt" of Rene's IVT (1226 lines, code ";Z"). I removed all inline comments, page headers, and parag markers, and mapped figure breaks to ".". All letters were mapped to lowercase. A few common weirdos were turned into their best approximations, like Rene's "&152;" turned into d and "&222;" into y. All other weirdos were mapped to "?". All ligature braces were removed, so some information may have been lost in rare ligatures.
The file "{SEC}.{TYP}.wff" has one line "{COUNT} {WORD}" for each word type (lexeme) {WORD} that occurs in "{SEC}.{TYP}.evt". The {COUNT} is a fractional number, obtained by assuming that each comma (",") in a line of the transcription may be independently either a "word space" or "no space", with equal probabilities, in all possible combinations. For each combination, each word is counted, not as 1 but as the probability of that combination.
For instance, in the line "chedy.cho,ke,or,ol.daiin.dal,dy", the words chedy and daiin are counted as 1 each, while dal, dy, and daldy have a count of 0.5 each (corresponding to the two interpretations for the comma between them). Also cho and ol have a count of 0.5 each; choke, ke, or, and orol have a count of 0.25 each; and chokeor, keor, keorol, and chokeorol have a count of 0.125 each. Note that the total count for each glyph of the input is still 1.
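The counting rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the script that produced the .wff files; the function name is made up, and the sketch assumes a line that contains only glyphs, "." and ",". Instead of enumerating every combination, it uses the equivalent closed form: a candidate word needs every comma inside it to be "no space" and each comma at its ends to be "space", each with probability 1/2.

```python
from collections import defaultdict


def fractional_counts(line):
    """Return {word: fractional count} for one simplified transcription line.

    "." is a certain word space. Each "," is a word space with probability
    1/2, independently of the other commas.
    """
    counts = defaultdict(float)
    for chunk in line.split("."):
        pieces = chunk.split(",")
        n = len(pieces)
        for i in range(n):
            for j in range(i, n):
                prob = 0.5 ** (j - i)  # commas inside the word: "no space"
                if i > 0:
                    prob *= 0.5        # comma just before the word: "space"
                if j < n - 1:
                    prob *= 0.5        # comma just after the word: "space"
                word = "".join(pieces[i:j + 1])
                if word:
                    counts[word] += prob
    return dict(counts)


counts = fractional_counts("chedy.cho,ke,or,ol.daiin.dal,dy")
for word, count in sorted(counts.items()):
    print(f"{count:.3f} {word}")
```

Summing len(word) * count over all words gives back the number of glyphs in the line, which is the "total count for each glyph is still 1" property.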
Using these fractional counts for word-related statistics may reduce biases that may result from either treating all commas as word spaces or ignoring all commas. For instance, dubious spaces often occur after r and s, or after a word-initial y. But this is still a far from perfect solution to that problem. The Scribe himself may have improperly joined or split words, and the transcribers may have omitted many dubious spaces, or entered them as ".".
Please let me know if you find any errors in those files. Also if you would like the (somewhat messy) scripts that I used to create them.
All the best, --stolfi
|
|
|
| Phonetic notation experiment |
|
Posted by: Rafal - 04-12-2025, 10:02 PM - Forum: Theories & Solutions
- Replies (21)
|
 |
We are discussing phonetic notations quite a lot recently. Chinese phonetic, Irish phonetic etc.
I wonder: if someone really used a phonetic notation of some language with an invented alphabet, would people be able to decipher it at all?
I made a test. I have written down a text in modern English using the orthography of another language X. Something like writing "hani" instead of "honey".
The text is encoded with a simple substitution cipher: one sign in the cipher is one sign in the alphabet of language X.
Language X has more letters than English, but not many more.
Would anybody be willing to try to break it with available computer tools, online solvers, etc.? And when you break it, are you able to read and understand the English?
Would it be easy, hard or impossible?
I guess existing computer tools use language dictionaries and here words are "bastardized" so dictionaries may not help.
Would anybody be interested?
If it is very hard with English, then we can forget about reading phonetic notations of Asian languages.
Here is the coded text:
Code: cbwmj@ inb@mc sbt fnob#c zh@ hpcmwhenj l$nr$#znb mr $c lh @fo o$@ ft lbe bc #$wfo lhc inb@m zhpb#e b i$obwtrn $nj tfw $c shbw f@ bwt b nf@y
m$#e byfo bc ofc pnhbwnj whlhn% z$# hmc iwb^b@c h@ @b$@%bwm$n ywb#lc %hcp$lbwb% h@ lb eb%hmbwb@h#b@ z$ch@ whifwmb%nj %b#mh@ zbp bw$o@% chpcmj
m$o^b@% #bwc #$wo hc cmhib% h@ ejt b@% nb%gb@% hm hc b inb@m lbm eb@h p$n&^bwc ft lb ofw% sbt o$#%nj #r^% b@% wblhwb% hm hc @fo@ h@ n$mh@ bc
$shnb$ ehnnbtnfwhre b@% ofc @b#e% h@ f@bw ft lb ywhp yf% $shnbc sr bpfw%h@y mr nb%gb@% sb% pfwc mf o$#%nj heinf# lhc of@% cm$@&^h@y sbwz f@
lb z$mbnthn% $@%$zmnj b cflbwb#@ wheb%j ft $obw sbwz$n eb%hch@ p$zh@bm bc #r ohn cr@ ch #$wfo w$#mnj wheb#@c b tb#lbwhm ft iw$pmhc^hf@bwc
ofwph@y ohl inb@m eb%hch@c bnf@yc$#% %$@%bnhf@c b@% inb@mb#@c #$wfo hc b@$lbw ft $obw ynfz$nh blb#b#znb sbwz$n tbwcm b#% inb@mc ohcij
tb%bwj tfnh%g oh&^ cribwthc^h$nj whcbeznbc lb ohn% p$wfm #$wfoc n$ch@hb#m nhlc ohl lbhw th@ b@% t$#@nj %hl$#%b% nfozc yb#l w$#c mf hmc $%bw
pfef@ @b#ec ehntf#n b@% m$r^b@% nht @hr ywfot ohn whbebw%g twfe hmc pwhih@y b@% cmb%hnj ciwb%h@y wh^fec h@ bwnh ciwh@y lhc wrm cjcmbe eh@c
oh wbyrn$wnh t$#@% lb inb@m ywfoh@y bc %b@c e$mc lb zb#^$n nhlc $w c$em$#ec po$#m n$w%g b@% ciw$onh@y fnob#c f@ nf@y ibc^hfnc b@% h@hm#$nj
ywfo h@ b wf^bm ob@ p$eh@y h@mr tn$obw lb cmbe nhlc zhp$e c^fwmbw cbc$#n b@% $nmbw@b#mnj cib#c% #$wfo znrec twfe %gr@ ohl t$wfo% tnfobwh@y
cmbec mjihpnj wh&^h@y s$#mc ft chpcmh cblb@mh cb@mhehmbwc ftb@ whtbwb% mr bc $ezbn n$#ph lb r@mwb#@% $# pr% h@hc^h$nh ehcmb#p #$wfo tnfobwh@y
cmw$p&^bw tfw b@ $ezbn b@% inb#c #$wfo h@ lb p$wfm tbehnh sfoblbw nrp pnfo^nj twfe zhnfo b@% #r ohn fz^bwl @rebwfoc tn$obw cmfpc pf@%b@c% mryb%bw
s$# $i lb cmbe b@% #r ohn ch s$o lb# %r @fm fn fwh%gh@b#m twfe b cb@mw$n if#@m f@ lb cmbe bc ibw $ezbnhtbwfoc inb@mc lb pfeif^hm tn$obwc mb#cm
zhmbw b@% sbl b p$w$pmbwhcmhp eb%hch@$n f%bw #rgobnj #$wfo sbc pwheh o$#m wb# tnfwbmc %bnhpbmnj twb#eh@y lb fwj@%g mh@mb% cb@mw$n %hcp tnfwbmc
z$m ih@p cmwb#@c ft #$wfo ohn twhpob@mnj zh ch@ t$#l fw chpc tnfwbmc $w mjihpnj t$o@% h@ h&^ h@%hlh%r$n tn$obw sb%
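For anyone who wants to build similar test texts, the construction described in this post (respell the English, then substitute one sign for one letter) can be sketched in Python. Everything here is made up for illustration: the respelling table holds only the "honey" example, the key is a random shuffle of the 26 Latin letters, and none of it is the actual key or orthography behind the coded text above.

```python
import random
import string

# Hypothetical respelling table standing in for the orthography of "language X".
RESPELL = {"honey": "hani"}


def make_key(seed=0):
    """Build a random one-to-one map from plain letters to cipher signs."""
    rng = random.Random(seed)
    signs = list(string.ascii_lowercase)
    rng.shuffle(signs)
    return dict(zip(string.ascii_lowercase, signs))


def encode(text, key):
    """Respell each word where the table knows it, then substitute sign by sign."""
    words = [RESPELL.get(w, w) for w in text.lower().split()]
    return " ".join("".join(key.get(ch, ch) for ch in w) for w in words)


def decode(cipher, key):
    """Undo the substitution; the result is still in the respelled orthography."""
    inverse = {sign: letter for letter, sign in key.items()}
    return "".join(inverse.get(ch, ch) for ch in cipher)


key = make_key()
secret = encode("honey is sweet", key)
print(secret)
print(decode(secret, key))
```

Note that decoding only recovers the respelled text ("hani"), which is exactly where dictionary-based solvers may struggle.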
|
|
|
| f57v figures |
|
Posted by: anejati - 04-12-2025, 09:08 PM - Forum: Imagery
- Replies (18)
|
 |
In f57v the characters in the middle ring repeat in a 4x17 pattern, and there are also 4 figures inside the circle. Going around the circle, the faces alternate towards and away from the viewer (towards-away-towards-away). The faces looking away have two outstretched hands and the faces looking towards have one raised hand which appears to be pointing at something.
It seems like an obvious pattern but I haven't seen any discussion on this. Is there any idea on the alternating towards-away faces? Do we know what this is supposed to symbolize? Are there parallels in other medieval scripts?
One interpretation is that the figures are supposed to represent four people in a circle in 3d (imagine children playing a game), but then it should have been towards-towards-away-away, not towards-away-towards-away.
|
|
|
| A structural hypothesis: Voynichese as a polysynthetic-like morphological system |
|
Posted by: Astra Lumen - 03-12-2025, 11:54 PM - Forum: ChatGPTPrison
- Replies (3)
|
 |
Hello everyone,
I’m new here. I’m not a linguist or cryptographer, my background is not academic.
This is not a decipherment attempt. What follows is just a structural observation based on pattern-recognition and basic morphological reasoning.
I’ve been looking at Voynichese from a pattern-recognition perspective: not as encoded phonemes, but as possible morphological units.
What caught my attention is that several recurring sequences behave more like morphemes (in the typological sense) than like components of a substitution cipher.
Many analyses have noted that Voynichese has:
- lower entropy than a typical monoalphabetic substitution,
- very stable internal word patterns,
- recurring sequences such as qo-, che-, -dy, -iin,
- and clear positional constraints on certain glyph groups.
Instead of treating these as “oddities” of an underlying phonemic text, I wondered what happens if we flip the model:
Core idea (structural, not semantic)
What if Voynichese doesn't encode letters or phonemes at all, but behaves more like a morphological system, where each "word" is a bundle of morpheme-like units (Prefix + stem + suffix), somewhat analogous to polysynthetic or strongly agglutinative languages?
Voynichese “words” may function as semantic/morphological bundles, similar to polysynthetic or highly agglutinative systems, rather than representing a letter-based encoding.
This idea might help explain some well-known features:
• very stable internal structure in many tokens
• frequent recurring sequences (qo-, che-, -dy, -iin)
• strong positional constraints
• low entropy inconsistent with simple substitution
• vocabulary shifts across sections
To illustrate the structural idea (not the semantics), here are a few examples in EVA:
1. The qoke- family: qokedy, qokeedy, qokain, qokaiin, qokal
These share:
- Initial element: qo-
- Stem-like core: k(e/a)
- Variable endings: -edy / -dy / -ain / -aiin / -al
This pattern resembles a fixed stem with multiple aspect/state suffixes, common in polysynthetic morphology, where endings encode nuances like iteration, completion, plurality, state, etc.
Even without knowing the semantics, the structure is consistent.
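As a rough illustration, the split proposed for this family can be written as a regular expression. This is just one possible reading: it treats the stem as "k" or "ke" and puts the "a" into the ending, since the endings listed above include -ain, -aiin and -al. The pattern and the function name are assumptions made for this sketch.

```python
import re

# Assumed split: prefix "qo", stem "k" or "ke", then one of the listed endings.
QOKE = re.compile(r"(qo)(ke?)(edy|dy|ain|aiin|al)")


def segment(word):
    """Return (prefix, stem, suffix) if the word fits the pattern, else None."""
    m = QOKE.fullmatch(word)
    return m.groups() if m else None


for w in ["qokedy", "qokeedy", "qokain", "qokaiin", "qokal", "daiin"]:
    print(w, segment(w))
```

A word outside the family, such as daiin, simply returns None, so the same function can be run over a whole transcription to see how many tokens the pattern covers.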
2. The -hedy cluster: shedy, chedy, ychedy, lchedy, okedy (overlapping pattern)
These share:
- a stable -hed- / -ched- / -ked- type stem
- variable onsets (s-, ch-, y-, l-, o-)
- a highly stable final element: -dy
In morphology, this is classic behavior of a productive suffix attaching to multiple stems.
- Position: final
- Stability: invariant
- Distribution: high frequency
- Combination: attaches widely
These are foundational criteria for morpheme identification in unknown languages.
3. The ol–olol–olkeeody family
These show:
- ol
- olol (reduplication-like extension)
- olkeeody (stem expansion + final suffix)
Reduplication and recursive stem-building are common in polysynthetic and agglutinative languages but are unusual in substitution ciphers unless artificially engineered.
The recurrence and structure again suggest morphological productivity.
Why polysynthetic-like?
Not because Voynichese is one of those languages, but because:
- tokens behave like semantic bundles rather than phoneme sequences
- many stems appear to be non-phonotactic but internally consistent
- affix-like sequences have clear positional rigidity
- the writing flow looks natural for whole morphological units
This shifts the analytic model away from:
encoded alphabet → encoded syllables → encoded phonemes
and toward:
prefix (class/process marker) + stem (core process/state) + suffix (aspect/iteration/state)
A pseudo-polysynthetic system could be invented, constructed, or hybrid; the origin doesn’t affect the structural behavior.
What this is NOT:
- Not an argument about semantics
- Not claiming the text is natural language
- Not claiming decipherment
- Not pushing a specific meaning system
Just a structural possibility that might be testable through:
- morpheme segmentation algorithms
- co-occurrence analysis
- positional modeling
- affix-grammar approaches
- typological comparison (Eskimo-Aleut, Algonquian, Chukotko-Kamchatkan, etc.)
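As a starting point for the positional modeling item, here is a small Python sketch that counts where a candidate unit occurs inside words (word-initial, medial, or word-final). The word list is just the set of examples quoted in this post, and the function name is made up; a real test would need a full transcription.

```python
from collections import Counter


def positional_profile(words, unit):
    """Count occurrences of `unit` at the start, inside, and at the end of words."""
    profile = Counter()
    for w in words:
        start = w.find(unit)
        while start != -1:
            if start == 0:
                profile["initial"] += 1
            elif start + len(unit) == len(w):
                profile["final"] += 1
            else:
                profile["medial"] += 1
            start = w.find(unit, start + 1)
    return profile


words = ["qokedy", "qokeedy", "qokain", "qokaiin", "qokal",
         "shedy", "chedy", "ychedy", "lchedy", "okedy",
         "ol", "olol", "olkeeody"]
for unit in ["qo", "dy", "ol"]:
    print(unit, dict(positional_profile(words, unit)))
```

A unit that shows up almost only in one position is a candidate affix; one spread over all positions is less likely to be one.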
If such analysis has already been attempted, I’d appreciate pointers.
If not, maybe this model offers another angle for people working on statistical or computational methods.
Thanks for reading and for any thoughts.
Astra Lumen
|
|
|
| Glyphs as Joined/Connected text |
|
Posted by: Mark Knowles - 03-12-2025, 06:54 PM - Forum: Voynich Talk
- Replies (3)
|
 |
I have been thinking about how to distinguish distinct glyphs in the Voynich manuscript, and the way that makes most sense to me is to count interconnected or joined symbols as one glyph unit. So "aiiin" would count as one glyph. This way of defining glyphs increases their number, but seems more logical; otherwise one has to decide when to disentangle connected symbols and treat them as separate glyphs, which would seem very arbitrary and confusing. I know that someone might point to some complex interconnected benched gallows, but even in their case I am inclined to treat them as one glyph.
|
|
|
| F70v2 and autocitation |
|
Posted by: Rafal - 02-12-2025, 11:50 AM - Forum: Analysis of the text
- Replies (7)
|
 |
Something weird (or maybe not weird at all) is going on at f70v2 (Pisces zodiac sign).
otar am otaral otalar otalam otolal
For me it is a very strong case for the autocitation hypothesis suggested by Torsten Timm. The scribe is altering the same word and producing gibberish.
In such a case, attempts to identify the star (like Alrescha from the Pisces constellation) will be futile.
Would you have other explanations?
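To put a number on how close these labels are to each other, one can compute the edit (Levenshtein) distance between consecutive words. The Python sketch below is a standard textbook implementation; small distances show the similarity, but they do not by themselves say why the words are similar.

```python
def levenshtein(a, b):
    """Minimum number of single-glyph insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = curr
    return prev[-1]


labels = "otar am otaral otalar otalam otolal".split()
for first, second in zip(labels, labels[1:]):
    print(first, second, levenshtein(first, second))
```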
|
|
|
| [split] Did the VM go straight from cerebellum to vellum? |
|
Posted by: Jorge_Stolfi - 02-12-2025, 06:15 AM - Forum: Voynich Talk
- Replies (32)
|
 |
(01-12-2025, 10:54 PM) qoltedy wrote: There are other additions or exceptions you could add onto the theory of multiple scribes (it's copying from an earlier text, it's a phonetic transcription, it's oral knowledge passed down) but each of these requires its own leaps in logic and speculative assumptions. If the text had multiple scribes to copy a previous text, who wrote THAT text? Was it one person? Meaning one person wrote the entirety of a Proto-Voynich Manuscript, and then later paid 5 scribes to write it again? For what purpose would someone do it this way, instead of just writing it themselves?
It would be insane to write anything on vellum straight from one's head. It would be like writing a document today with the keyboard connected directly to the laser printer.
Vellum was expensive and difficult to erase from. Moreover that task required an experienced hand capable of writing tiny letters neatly; something that not everyone would have.
Thus I bet that practically every manuscript on vellum, including the VMS, including encrypted letters, was written on (much cheaper) paper first, with all the correcting and crossing-out that may have been necessary. And only then would this paper draft be copied to vellum.
And this last step was a boring mechanical task that required good "quill driving" skill but no understanding of the text. Thus it must have been usually delegated to a secretary or more-or-less professional scribe, or to a "scribal shop" (like a monastery).
Then the VMS Author would be the person who wrote the draft, not the person(s) who put quill to vellum. Most likely, he was only one person for the whole book.
The Author would have to teach the Voynichese alphabet to each scribe, and have the scribe practice until he could copy it satisfactorily well. This point argues against multiple scribes working at the same time. But it would allow for a different scribe for each section (counting Herbal-A and Herbal-B as two sections), if they were composed by the Author in separate epochs, separated by substantial time intervals.
All the best, --stolfi
|
|
|
| Various Graphs and Analyses |
|
Posted by: srjskam - 30-11-2025, 06:11 PM - Forum: Analysis of the text
- Replies (17)
|
 |
I've had two stints of Voynich fever, once in 2015 and the second time a year ago. On the second round I produced some graphs and analyses and thought about posting them here, but never got to it. It keeps bugging me, so I'd better post this stuff to get some peace of mind...
Everything is very unpolished, not nice enough to write blog posts about, but maybe this could inspire someone. Most of these analyses examine a different aspect, but there isn't really enough substance to merit a new thread for each. Mistakes are to be expected. I've mostly done the analyses with Python in JupyterLab (Pandas, MatPlotLib etc), using various transcriptions, mostly Takahashi for older stuff and ZL. I'm usually mostly interested in paragraph type text and omit labelese.
I don't claim anything here is new. Sometimes I went out to replicate old results, and I'm well aware of the fact that in Voynich research 99.9% of all results have been thought of by a dozen people before.
|
|
|
| How many people penned the main Voynichese text? |
|
Posted by: Koen G - 29-11-2025, 05:01 PM - Forum: Voynich Talk
- Replies (32)
|
 |
I'm curious to see how the opinions are divided on the matter of "how many scribes". I like to know this, since for example in a video I might say "the majority view is...", but I need to know whether my impression of the majority view is correct.
This is an anonymous poll, vote for the answer you think is most likely. No need to take the opinions of others into account, nor to be absolutely certain. Just vote how you feel. Discussion in the thread is allowed, of course.
It is only about the main text of the manuscript, so ignore all marginalia, month names, page numbers etc etc. Also ignore any Authors or Masterminds or other background figures. Count only the people who put Voynichese to parchment.
|
|
|
|