RE: An attempt at extracting grammar from vord order statistics. - davidd - 19-05-2025
(19-05-2025, 10:50 PM)Ruby Novacna Wrote: Thank you, Davidd, for your explanation. I understand the first step: you are looking for the words before and after the word qokeedy. There are 306 occurrences of qokeedy. Do you have a list of words that go with them?
Yes, that is correct, although I don't store those lists; I generate them in full each time, for every vord.
Code:
===== vord qokeedy =====
vord: qokeedy
count: 302
before: {'shedy': 16, 'sokolytedy': 1, 'qokedy': 14, 'tochey': 1, 'a[g:j]': 1, 'dchdaldyykedy': 1, 'lkedylkeeody': 1, 'qotedy': 6, 'okeedchsy': 1, 'sheedy': 9, 'okedy': 2, 'shey': 4, 'chcthy': 2, 'lchedy': 4, 'shckhy': 1, 'olchey': 1, 'keedy': 1, 'lol': 1, 'yshedy': 2, 'sokeedy': 1, 'solkedy': 1, 'ykeechy': 1, 'solchedy': 2, 'okeedy': 9, 'checkhy': 1, 'yksheol': 1, 'chedy': 10, 'dchedy': 2, 'chey': 6, 'da': 1, 'lteedy': 2, 'lo': 1, 'lshedy': 2, 'cheedy': 4, 'shckhey': 1, 'alchedy': 1, 'shcthey': 1, 'ched': 1, 'qolcheedy': 2, 'sheety': 1, 'qotaiin': 1, 'lsheey': 1, 'sheal': 2, 'olshedy': 2, 'qoteedy': 2, 'cheeol': 2, 'd[o:a]lchl': 1, 'arol': 1, 'chdy': 3, 'sheekey': 1, 'sheol': 2, 'oky': 1, 'deedy': 1, 'ol': 3, 'dshedy': 1, 'oteedy': 4, 'ory': 1, 'qokshedy': 1, 'otaram': 1, 'qokshey': 1, 'kedy': 2, 'sheey': 1, 'rolchey': 1, 'yteedy': 2, 'olkshey': 1, 'ykeedy': 2, 'dag': 1, 'sheky': 1, 'okchdy': 1, 'shecthy': 2, 'chety': 1, 'okaly': 2, 'dolfchedy': 1, 'polshdy': 1, 'chckhy': 1, 'lkchedy': 1, 'olchedy': 2, 'dy': 1, 'okain': 1, 'lar': 1, 'tshey': 1, 'daldy': 1, 'olfchedy': 1, 'okai': 1, 'olkeedy': 4, 'qokchedy': 1, 'otchedy': 3, 'start': 3, 'olkshedy': 1, 'qoal': 1, 'qolteedy': 1, 'psholpchcfhdy': 1, 'schdy': 1, 'olain': 1, 'teedy': 1, 'dsheey': 1, 'qokchey': 1, 'tshy': 1, 'ody': 1, 'tchedy': 2, 'cholkaiin': 1, 'shdal': 1, 'shedain': 3, 'qokshy': 1, 'olldy': 1, 'sho': 1, 'ockhedy': 1, 'pchedy': 1, 'chol': 1, 'qoeeey': 1, 'otaiin': 1, 'otedy': 5, 'qoeey': 1, 'orchsey': 1, 'dsheol': 2, 'shaiin': 1, 'odain': 1, 'ksheody': 1, 'okeey': 4, 'okcheey': 1, 'qockhy': 1, 'okeody': 1, 'qokey': 2, 'kedydy': 1, 'qokeey': 9, 'sheedain': 1, 'lkedy': 1, 'loety': 1, 'chedeey': 1, 'shed': 1, 'sholkeedy': 1, 'sheeal': 1, 'keear': 1, 'cheol': 1, 'dsheeo': 1, 'oteey': 2, 'dchedshey': 1, 'teey': 1, 'okeeol': 1, 'ar': 1, 'chody': 1, 'qokeokedy': 1, 'odys': 1, 'sheokeedy': 1, 'sheeol': 1, 'ykeedain': 1, 'lkeedy': 1, 'oteo': 1, 'tshedy': 1, 'oarorold': 1, 'dchedain': 1, 'qoteeedy': 1, 'kody': 
1, 'ykeeochody': 1, 'otedain': 1, 'lchey': 1, 'rshey': 1})
after: {'qoteedar': 1, 'qokedy': 11, 'dar': 3, 'qokaiin': 5, 'okedy': 3, 'okeey': 5, 'qokody': 1, 'chetedar': 1, 'cheteyoteeod': 1, 'qokal': 5, 'shey': 1, 'qoky': 4, 'qokair': 1, 'shy': 1, 'lolchedy': 1, 'chedy': 7, 'qokey': 3, 'qokechdy': 1, 'ldy': 1, 'olkeedy': 4, 'oteedy': 4, 'qoteey': 2, 'qotar': 1, 'qok[o:a]l': 1, 'daiin': 2, 'qopchedy': 2, 'checthy': 1, 'qotey': 2, 'okain': 1, 'oty': 1, 'olyshey': 1, 'olkedy': 1, 'ochedy': 1, 'okeedy': 4, 'qotal': 2, 'qol': 4, 'cheedy': 2, 'shl': 1, 'qotchy': 1, 'saiin': 2, 'qolkey': 1, 'lshedy': 1, 'lol': 4, 'qotedy': 6, 'shedy': 6, 'lchey': 3, 'qokar': 6, 'qokol': 1, 'qokeey': 9, 'qodykey': 1, 'dal': 2, '[r:s]al': 1, 'qokam': 1, 'otedy': 3, 'kedy': 1, 'kchol': 1, 'sheey': 1, 'otaram': 1, 'qotain': 3, 'qokain': 5, 'qokechey': 1, 'ror': 1, 'rolchey': 1, 'yteedy': 2, 'ykeey': 1, 'qopcheololkoiin': 1, 'checkhy': 2, 'saltar': 1, 'oly': 1, 'qokail': 1, 'qoker': 1, 'shky': 1, 'tchdy': 1, 'chcphey': 1, 'qokalcthol': 1, 'lchedy': 4, 'cheey': 1, 'ralchey': 1, 'lchy': 1, 'lcheedy': 1, 'cheal': 1, 'rshedy': 1, 'lochedy': 1, 'qotaiin': 1, 'rag': 1, 'qokedal': 1, 'rcheey': 1, 'oky': 1, 'chedain': 1, 'qolchey': 1, 'shckhedy': 1, 'qokedar': 1, 'okaiin': 1, 'olkar': 1, 'qolkeedy': 1, 'dy': 1, 'olkey': 1, 'dkedy': 1, 'qoeedy': 1, 'ykedy': 1, 'doltshdy': 1, 'kain': 1, 'pchedy': 1, 'qoedy': 1, 'lor': 1, 'ykeedy': 1, 'qokchdy': 1, 'shol': 1, 'qoteor': 1, 'cheol': 2, 'olkeeshy': 1, 'chedal': 1, 'oteey': 2, 'qotair': 1, 'dcheol': 1, 'laiiin': 1, 'chdor': 1, 'ched': 2, 'otol': 1, 'sail': 1, 'opchor': 1, 'qotokody': 1, 'chotchedy': 1, 'okeol': 1, 'qokokil': 1, 'rary': 1, 'chody': 1, 'chokedy': 1, 'sheoky': 1, 'cholcheey': 1, 'oteor': 1, 'qokoy': 1, 'okeeom': 1, 'chokeedy': 1, 'lchdy': 1, 'qokeo': 1, 'lky': 1, 'raraiin': 1, 'shok': 1, 'okedain': 1, 'chea[?:m]': 1, 'chey': 2, 'qoteedy': 1, 'teedy': 1, 'otchedey': 1, 'oteolair': 1, 'okeodain': 1, 'qoteedaiin': 1, 'chkal': 1, 'dl': 1, 'key': 1, 'oteeolkeey': 1, 'chckhy': 1, 'qokeol': 1, 'lchsl': 1, 
'chokain': 1, 'qochaiin': 1, 'chedaiin': 2, 'qokeeey': 1, 'eeed[ee:a][g:d]': 1, 'qoteosam': 1, 'chkaiin': 1, 'cheky': 1, 'shdy': 2, 'otedar': 1, 'olar': 1, 'otchey': 1, 'chdodaiin': 1, 'okaly': 1, 'olkeechdy': 1, 'qoteo': 1, 'qoaiin': 1, 'chols': 1, 'lxor': 1, 'chodain': 1, 'lkeedas': 1, 'shckhy': 1, 'chtain': 1, 'raiin': 1})
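For readers who want to reproduce lists like the ones above, here is a minimal Python sketch of counting the vords directly before and after each vord. It is not davidd's actual program, which is not shown in the thread. The function name `neighbour_counts` and the sample lines are made up for illustration. The `start` marker mirrors the `'start'` entry visible in the `before` list; how line ends are treated is not shown in the output, so this sketch records nothing after the last vord of a line.

```python
from collections import Counter, defaultdict

def neighbour_counts(lines):
    """Return (before, after): for each vord, a Counter of the vords seen
    immediately before it and immediately after it on the same line."""
    before = defaultdict(Counter)
    after = defaultdict(Counter)
    for line in lines:
        prev = "start"  # assumed marker for "first vord on the line"
        for cur in line.split():
            before[cur][prev] += 1
            if prev != "start":
                after[prev][cur] += 1
            prev = cur
    return before, after

# Made-up sample lines, one string per line of text, vords separated by spaces.
sample = [
    "shedy qokeedy qokedy",
    "qokeedy qokeey",
    "chedy qokeedy qokeey",
]
before, after = neighbour_counts(sample)
print("before:", dict(before["qokeedy"]))
print("after:", dict(after["qokeedy"]))
```

Counting per line means vords on different lines are never treated as neighbours; whether that is desirable depends on how the transliteration marks line and paragraph breaks.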
RE: An attempt at extracting grammar from vord order statistics. - davidd - 20-05-2025
Apologies, still learning.
![[Image: quireM2.svg]](https://www.stack.nl/~davidd/quireM2.svg)
![[Image: quireT2.svg]](https://www.stack.nl/~davidd/quireT2.svg)
RE: An attempt at extracting grammar from vord order statistics. - Ruby Novacna - 20-05-2025
I thought your program was supposed to output a list of 306 groups of two or three words, with qokeedy in the middle or at the beginning.
You also combined some words separated by the drawings, which resulted in several unusual words. I also couldn't find the word sokolytedy.
RE: An attempt at extracting grammar from vord order statistics. - MarcoP - 20-05-2025
Hi Davidd, I am happy to see new research on this subject. I believe there are patterns that can be discovered with the POS-approach (though I haven't been very successful).
I have a few suggestions and a question:
- submit your research to the Voynich Day presentation on August 4; it doesn't have to be perfect, you could just describe in more detail what you have done so far
- add a few more words as labels of word classes, e.g. I believe that "shedy" includes "chedy" and vice-versa - it would be nice if this was clear from the graphs
- try your method at least on a simple English text with the same word count as the Voynich samples you are processing; this will tell you what are the results you can expect in a best-case scenario. A more complex English text (e.g. Shakespeare) would also be interesting, for comparison.
- Question: why the double arcs between "aiin" and "otaiin"? The "aiin" loop is also doubled, but in that case the % is the same, so it's less of a problem [EDIT: I guess it could be because you are plotting an arc for "aiin" as a class following "otaiin" and another arc for "otaiin" as a class preceding "aiin"; a possible solution could be only plotting "following" or "preceding" arcs. IIRC my approach was computing % as a share of all word transitions in the whole text, rather than the rate of "aiin" following "otaiin" on the total of "aiin.X" couples]
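The two ways of computing a percentage mentioned in the last point can be compared with a small Python sketch. This is only an illustration with a made-up token sequence. The function name `transition_rates` is hypothetical, and the conditional rate here is taken relative to all pairs that start with the first vord, which is one possible reading of the description above.

```python
from collections import Counter

def transition_rates(tokens, first, second):
    """Return (global_rate, conditional_rate) for the pair first -> second.

    global_rate:      count(first, second) / count of all adjacent pairs
    conditional_rate: count(first, second) / count of pairs starting with first
    """
    pairs = Counter(zip(tokens, tokens[1:]))
    total = sum(pairs.values())
    n_pair = pairs[(first, second)]
    from_first = sum(c for (a, _), c in pairs.items() if a == first)
    global_rate = n_pair / total if total else 0.0
    conditional_rate = n_pair / from_first if from_first else 0.0
    return global_rate, conditional_rate

# Made-up sequence, only to show that the two rates differ.
tokens = "otaiin aiin chedy otaiin aiin aiin otaiin chedy".split()
global_rate, conditional_rate = transition_rates(tokens, "otaiin", "aiin")
print(f"share of all transitions: {global_rate:.3f}")
print(f"share of transitions out of 'otaiin': {conditional_rate:.3f}")
```

With the global rate, one pair gets a single number regardless of direction of view, so only one arc needs to be drawn per ordered pair.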
RE: An attempt at extracting grammar from vord order statistics. - dashstofsk - 20-05-2025
(19-05-2025, 11:33 PM)davidd Wrote: 'a[g:j]': 1, 'dchdaldyykedy': 1, 'lkedylkeeody': 1,
Your approach is very intriguing. But note that a[g:j] is not a word. The writing is probably unclear in the manuscript, and it might be ag or aj. Also, dchdaldyykedy and lkedylkeeody are very rare. In fact they are so rare that they don't seem to appear in ZL3a-n. (I suspect your program is not handling '<->' correctly. Have you considered working with GC2a-n? This is a simpler transliteration that doesn't have '<->'.) But is your program really looking at all words? Might it be better to exclude unknowns and rare words? Perhaps then it would not take 'one hour on my poor old pc'.
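A minimal Python sketch of the clean-up suggested here, under stated assumptions: '.' and ',' separate vords, '<->' should end a vord rather than join its neighbours, and readings with bracketed alternatives such as a[g:j] count as uncertain. The names `split_vords`, `mask_rare_and_uncertain` and the `UNK` placeholder are hypothetical and not part of any transliteration tool.

```python
import re
from collections import Counter

UNKNOWN = "UNK"  # hypothetical placeholder for vords we do not want to count

def split_vords(line):
    """Split one transliterated line into vords.

    Assumption: '.' and ',' separate vords, and '<->' (text interrupted by
    a drawing) also ends a vord instead of gluing its neighbours together.
    """
    line = line.replace("<->", ".")
    return [t for t in re.split(r"[.,\s]+", line) if t]

def is_uncertain(vord):
    """True for readings like a[g:j] or chea[?:m] that contain alternatives."""
    return "[" in vord or "?" in vord

def mask_rare_and_uncertain(vords, min_count=2):
    """Replace uncertain vords and vords seen fewer than min_count times."""
    counts = Counter(vords)
    return [
        v if counts[v] >= min_count and not is_uncertain(v) else UNKNOWN
        for v in vords
    ]

# Made-up line, only to exercise the functions.
line = "shedy.qokeedy<->qokedy.a[g:j].qokeedy"
vords = split_vords(line)
masked = mask_rare_and_uncertain(vords, min_count=1)
print(vords)
print(masked)
```

Masking with a placeholder instead of deleting keeps the positions intact, so two vords that were separated by a rare or unreadable vord do not become false neighbours.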
RE: An attempt at extracting grammar from vord order statistics. - dashstofsk - 20-05-2025
Are the incoming and outgoing lines supposed to sum to 100%? It doesn't seem to be so. Otherwise what do the values mean?
RE: An attempt at extracting grammar from vord order statistics. - davidd - 20-05-2025
I am learning how to make better images, the spaces between vords seem to disappear
The percentages are either percent incoming or percent outgoing from the other block. To reduce the number of arrows, I only draw arrows that have a higher score than expected on the basis of frequency alone. Drawing all arrows is pointless; a transition table would be better for that.
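One common way to express "higher than expected on the basis of frequency" is to compare the observed count of a pair with the count expected if the two vords were independent. The Python sketch below shows that baseline; the exact score used for the images is not given in the thread, so the function name `overrepresented_pairs`, the `min_ratio` parameter and the sample sequence are all assumptions.

```python
from collections import Counter

def overrepresented_pairs(tokens, min_ratio=1.0):
    """Keep pairs whose observed count exceeds min_ratio * expected count,
    where expected assumes the two vords occur independently."""
    unigrams = Counter(tokens)
    pairs = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_pairs = sum(pairs.values())
    kept = {}
    for (a, b), observed in pairs.items():
        # Expected pair count under independence of the two positions.
        expected = n_pairs * (unigrams[a] / n_tokens) * (unigrams[b] / n_tokens)
        if observed > min_ratio * expected:
            kept[(a, b)] = (observed, expected)
    return kept

# Made-up sequence; real text would have far more tokens.
tokens = "qokeedy qokeey qokeedy qokeey chedy chedy qokeedy qokeey".split()
kept = overrepresented_pairs(tokens, min_ratio=2.0)
for (a, b), (observed, expected) in kept.items():
    print(f"{a} -> {b}: observed {observed}, expected {expected:.2f}")
```

On very small samples almost every observed pair beats its expectation, so a threshold above 1 or a minimum count is usually needed before the filter removes anything.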
I have these hovering blocks on the left side, i guess i need to look into how ReneZ cleans his transliteration with his tool
![[Image: quireT4.svg]](https://www.stack.nl/~davidd/quireT4.svg)
^^ this is quire 20
RE: An attempt at extracting grammar from vord order statistics. - Rafal - 20-05-2025
My few thoughts about these results.
They seem solid and rational. But they definitely need some filtering for interesting stuff and interpretation.
In the past we have seen similar statistics with some words going together or making some clusters. But nobody made a step further - giving these results some meaning.
Will you be that guy Davidd?
RE: An attempt at extracting grammar from vord order statistics. - davidd - 20-05-2025
(20-05-2025, 03:45 PM)Rafal Wrote: My few thoughts about these results.
They seem solid and rational. But they definitely need some filtering for interesting stuff and interpretation.
In the past we have seen similar statistics with some words going together or making some clusters. But nobody made a step further - giving these results some meaning.
Will you be that guy Davidd? 
I tried to look for previous similar work but the information about actual work is very disorganised.
What is currently easy to find is a list of wrong solutions.
Letter frequency and vord frequency are covered by voynichese.com.
Some background info is on voynich.nu.
If I had known that similar work had been done on Quires 13 and 20, I would probably have pushed publishing back a little.
todo:
improve images/output
I intend to clean up the results, and do some chi square on the grammar.
make a score function
finding out how many groups/clusters i would need
compare grammars of different scribes, quires, sections.
test grammars with all vords, including singlets.
extend vordscore to include 2 vords before and 2 vords after instead of just the ones right beside it.
extend groupscore to include measuring vords that are not labeled as belonging to one of the groups (the less frequent vords)
refactor the code into more classes to make it more readable.
make the reports in html or phpbb for easier sharing.
possibly: test grammar findings against other 15th c texts in known languages.
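As an illustration of the todo item about looking at 2 vords before and 2 vords after, here is a minimal Python sketch that counts context vords together with their offset. The function name `context_counts` and the sample sequence are hypothetical; how such counts would feed into the vordscore is not described in the thread.

```python
from collections import Counter, defaultdict

def context_counts(tokens, window=2):
    """For each vord, count (offset, neighbour) pairs within +/- window,
    e.g. (-2, 'shedy') means 'shedy' occurred two positions earlier."""
    contexts = defaultdict(Counter)
    for i, vord in enumerate(tokens):
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                contexts[vord][(offset, tokens[j])] += 1
    return contexts

# Made-up sequence, only to show the shape of the result.
tokens = "shedy qokeedy qokeey chedy qokeedy".split()
contexts = context_counts(tokens, window=2)
for (offset, neighbour), n in sorted(contexts["qokeedy"].items()):
    print(offset, neighbour, n)
```

Keeping the offset in the key lets the same table answer both the "directly beside" question (offsets -1 and +1) and the wider-window question without recounting.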
RE: An attempt at extracting grammar from vord order statistics. - davidd - 20-05-2025
(20-05-2025, 02:55 PM)davidd Wrote: I am learning how to make better images, the spaces between vords seem to disappear
The percentages are either percent incoming or percent outgoing from the other block. To reduce the number of arrows, I only draw arrows that have a higher score than expected on the basis of frequency alone. Drawing all arrows is pointless; a transition table would be better for that.
I have these hovering blocks on the left side, i guess i need to look into how ReneZ cleans his transliteration with his tool
![[Image: quireT4.svg]](https://www.stack.nl/~davidd/quireT4.svg)
^^ this is quire 20
Made a lot of improvements to the image, fixed a bug with the floating boxes, and removed most of the arrows.