(10-07-2020, 08:31 PM)RobGea Wrote: Maybe the get_vectors.py script creates the model ?
Thank you. I looked at that script also and it seems to create files with a "-" in the name. It could have created models/voynich-2.vec and models/voynich-3.vec but I don't believe it can create models/voynich.bin. I am not a software wizard either: if I decide to look more into these methods, I will search for something with more documentation.
(10-07-2020, 12:26 AM)MichelleL11 Wrote: I would have liked to have seen data similar to Table 1 from a text we knew the meaning.
The following table is based on the models/james1.vec file found at the link posted by Torsten. The vectors are based on texts/james1.txt, which should be The Political Works of James I by Charles Howard McIlwain. The text is much longer than the VMS (243677 words).
I simply computed the cosine distance between all pairs of vectors and picked the 100 lowest results. Each row below shows the two words followed by their cosine distance.
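For anyone who wants to repeat this, here is a minimal sketch of that computation in Python. It is not the script I actually ran. It assumes the .vec file is in the usual word2vec/fastText text layout (a header line with the word count and dimension, then one word per line followed by its vector components), and the names load_vec and closest_pairs are just illustrative.

```python
import numpy as np


def load_vec(path):
    """Read a text vector file (ASSUMED layout: 'count dim' header,
    then one 'word v1 v2 ...' line per word)."""
    words, rows = [], []
    with open(path, encoding="utf-8") as fh:
        next(fh)  # skip the assumed header line
        for line in fh:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # ignore blank or malformed lines
            words.append(parts[0])
            rows.append([float(x) for x in parts[1:]])
    return words, np.array(rows)


def closest_pairs(words, vectors, k=100):
    """Return the k word pairs with the smallest cosine distance."""
    # Normalise each vector so the dot product equals the cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    dist = 1.0 - unit @ unit.T
    # Keep only the upper triangle so each unordered pair is counted once.
    i, j = np.triu_indices(len(words), k=1)
    d = dist[i, j]
    order = np.argsort(d)[:k]
    return [(words[i[n]], words[j[n]], float(d[n])) for n in order]
```

Calling closest_pairs(*load_vec("models/james1.vec")) should then give a list comparable to the one below. Note that this builds the full distance matrix in memory, so for a large vocabulary it may be necessary to process the rows in blocks, and an all-zero vector would produce NaN distances.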
Jacobi Jacobo 9.362794918033046e-05
1600 1602 0.00012912299097533886
FREE PARLIAMENT 0.0001889613950498692
1533 1530 0.00035496688566682977
1609 1608 0.0003943003806157197
1690 1655 0.0004360146351951588
1530 1587 0.00045856802441224254
1681 1684 0.00046081128273234295
Libri Librum 0.00046549843612331276
Moguntiae Librum 0.0005934398930368401
HIS OR 0.0005939489233323103
1648 1655 0.0006160982305256635
EXAMINED FIRST 0.0006207951848548054
Apologia Apologiam 0.0006242717651190333
1613 1617 0.0006285223862874112
Tortura Torturae 0.0006477660046458888
1681 1690 0.0006702024484808167
omnium Regno 0.0006920672551634643
Jacobi Summi 0.0007025262205233584
1614 1617 0.0007323946037528506
1608 1600 0.0007425335851969361
1584 1682 0.0007664663931468141
1681 1584 0.0007680752187415596
1584 1587 0.0007741263426224165
Triplici Dissidium 0.0007796541644139454
v. 1914 0.0007835241408088445
97 37 0.0007885219442038682
1533 1587 0.0008061611665365342
1530 1682 0.0008074178851948943
Regiae Regium 0.0008198226243809614
Catholics Catholic 0.0008240904283860484
mysterie Mysterie 0.0008342033390920101
1558 1530 0.0008363943947071739
1608 1602 0.000839226805137816
1684 1682 0.0008524493236341524
C. 1587 0.0008682539507900433
1681 1682 0.0008950008751288374
1533 1682 0.0008980736412399493
Summi Jacobo 0.0009086594125757852
1610 1617 0.0009392437979307555
Edinburgh 158 0.0009575415342892857
63 65 0.0009582107947709861
Bellarmino Bellarmin 0.0009582491968167517
1612 1613 0.0009683327995170243
1533 158 0.0009701879699102189
1558 1533 0.0009715741598360639
1530 1584 0.0009879434874886517
Omnia omnium 0.0009955304345795613
1615 1617 0.0009972922496109815
v. Syr 0.0009999278965419078
Regno Regni 0.0010038510237211362
1587 1682 0.001028003627235874
1558 1587 0.0010310940219205866
etiam Regno 0.0010363190025481916
PARLIAMENT OR 0.0010377423635691274
1615 1614 0.0010454307414756725
A. 1587 0.001050023109272047
Omnia Regni 0.0010582006286237178
W. 1530 0.001062366005533888
B. Jacobi 0.0010663906430071757
1681 1655 0.0010735820899224757
excommunication Excommunication 0.00107471381451818
FREE OR 0.0010794008820456114
1690 1584 0.0010873438254065393
1914 1588 0.0010933555001633177
Omnia Magnae 0.0010959532312743159
1648 1690 0.0011099579917831504
omnium Regnum 0.0011198874366192824
W. 1587 0.0011265060059496568
guiltie crueltie 0.0011284163198155284
Edinburgh 1533 0.0011433225551614745
1611 1612 0.0011545363446050505
1610 1613 0.0011556526850372562
1612 1617 0.0011630775589848152
C. A. 0.0011679450585275752
1603 1608 0.001176599715703719
attack attacks 0.001181162071998032
feeling medling 0.001184893114971608
1609 1600 0.0011936765166816743
1688 1682 0.0011942271729868947
crueltie Noueltie 0.001202438546565987
1609 1602 0.0012119389179859885
1606 1688 0.0012162574773751933
1584 1655 0.0012284433156166674
D. N. 0.001233932699232776
omnium Regni 0.0012372783075712546
1530 1684 0.0012439587433001886
1558 1682 0.001262306694848725
OF AND 0.0012691735440278906
21 142 0.0012770557684180783
1558 1584 0.0012807002924257738
1613 1614 0.001282499927374725
Omnia Regno 0.0012844189069226575
1681 1530 0.0012870482865621202
omnia omnium 0.0012923522978909308
B. Summi 0.0013121368925527177
Regis Regnum 0.0013262982591054628
1584 1684 0.0013296208585277247
1558 1684 0.0013346804000435863
B. Jacobo 0.0013354074560100182
Though the main text is in English, it appears to contain some Latin. Most of the pairs appear to pick out the minority "languages" in the text: Arabic numerals and Latin.
The following table corresponds to the Spanish models/picatrix.vec, which is also based on a text much longer than the VMS (136127 words).
encontré encontró 0.0004956167930829647
Cfr 35 0.0006585021287637272
danic danics 0.0006607751842919729
8° 25° 0.000696572187019795
Aquí ¿No 0.0008916732323611676
18 8° 0.0009855960313587264
esencia potencia 0.0010129389304190939
21° 25° 0.0010297147930239392
4° 25° 0.0010449547092218348
organización participación 0.0010857306882536832
17° 51 0.0011836149713667643
inclinación organización 0.0012811370687462187
fósforo toro 0.0013125712261826683
recibe recibir 0.0013596247402983819
21° 8° 0.0014157185404739536
4° 8° 0.0014639031144604298
Un ¿No 0.0014652052435615293
18 21° 0.0014656913772951308
potencias esencias 0.001484891476752792
20 117 0.0015072518398858703
18 25° 0.0015075266523900677
17 34 0.001517155035947515
sábelo Sábelo 0.0015329058682377328
observación organización 0.001625448645835137
alcanfor almizcle 0.00164734566652891
adivinación organización 0.0016481153407539306
4° 17° 0.0016713394945511162
52 8° 0.0016747763340585475
aceptación realización 0.001678686035215482
orín bórax 0.0016869466122287902
21° 4° 0.0017533713445619936
N W 0.0018014616317068022
adivinación participación 0.0018297915531364506
22 23 0.0018577010368092672
Un Aquí 0.0018685626245712461
ámbar gacela 0.0018773552580928499
26 4° 0.001894998960850769
curso individuo 0.0019051051905544236
tamaño brujería 0.0019072090811412812
arcilla olla 0.001911753304834729
maravilloso maravillosa 0.001971683250179934
sésamo olivo 0.0019729420381220386
26 25° 0.0019770138341108634
estómago trapo 0.001985188597385612
diferencia subsistencia 0.001997784103592637
murciélago sándalo 0.00200596358925087
claridad sagacidad 0.0020071368315101035
procede profeta 0.002020651016352848
Te voy 0.002046132045719684
¿Qué Cuzami 0.0020728343233525903
débil útil 0.002085269310519444
templos ejemplos 0.002104497609965561
intimidad facilidad 0.002113999829754354
W B 0.0021221883347225523
17 25 0.0021321485703214016
arrayán lechuga 0.002159064671947597
escribe Dicha 0.0021649488172448272
depende feliz 0.002186018894097641
9 13 0.0021903056682133215
18 52 0.002209399680285329
gemas patas 0.0022280727011942947
autoridad sagacidad 0.0022355253163921507
52 25° 0.0022390752076199005
20 23 0.0022492291268656484
B F 0.0022585023261663117
almáciga orín 0.002262757357885503
11 13 0.0022643686009261588
Q M 0.0022804654334078744
huertos muertos 0.002280886272855387
expande inductor 0.0022856270853731653
vitriolo hinojo 0.0022871587786683634
visiones pasiones 0.002308235579373985
patas uñas 0.0023112112382086547
11 22 0.002330213881865162
pp 35 0.0023500532560534193
objetivo través 0.002351160479116654
inclinación participación 0.002354135089171927
líneas distintas 0.0023608010709631477
murciélago toro 0.0023822515490338203
N B 0.0024091774657688525
magnesio abubilla 0.0024092049755259914
realización preparación 0.002423702096240765
capacidad sagacidad 0.0024240215295200374
arre arroz 0.0024288430315090315
aquélla feliz 0.0024424531694202667
cacharro lechuga 0.002472357046448548
sumisa traza 0.0024737210695034983
máximas dudas 0.0024873313566698974
magnesio bórax 0.0025006619151032305
4° 51 0.0025027445209809818
diferencia prudencia 0.0025073546701376292
altos coptos 0.002515610254106293
aquélla familia 0.00251862958602056
M B 0.002531032454618387
voy uso 0.0025336848733218398
orín pulpa 0.002534521827910252
paja tortuga 0.0025417650767731725
creación duración 0.0025596741484590346
sándalo toro 0.0025630507348739506
ejemplo templo 0.0025665030067522077
It seems to me that the results are dominated by similar-looking pairs where the relationship appears to be morphological/grammatical rather than semantic ("magnesio bórax" near the bottom could be an interesting exception).
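One rough way to put a number on that impression would be to count how many pairs share a word ending. The sketch below is only a crude proxy and not something I ran on the full tables: the helper names and the 3-letter threshold are arbitrary choices of mine, a shared ending also catches mere rhymes like "huertos muertos", and it misses pairs like "recibe recibir" that share a stem instead.

```python
def common_suffix_len(a, b):
    """Number of trailing characters two words share (case-insensitive)."""
    n = 0
    for x, y in zip(reversed(a.lower()), reversed(b.lower())):
        if x != y:
            break
        n += 1
    return n


def share_ending(pairs, min_len=3):
    """Return the pairs sharing at least min_len final letters,
    together with their fraction of all pairs."""
    hits = [(a, b) for a, b in pairs if common_suffix_len(a, b) >= min_len]
    fraction = len(hits) / len(pairs) if pairs else 0.0
    return hits, fraction


# A few pairs copied from the picatrix table above.
sample = [
    ("organización", "participación"),
    ("magnesio", "bórax"),
    ("templos", "ejemplos"),
    ("alcanfor", "almizcle"),
]
hits, fraction = share_ending(sample)
```

On this small sample, the two pairs with a shared ending are kept and the two substance-name pairs are not, which is the distinction I had in mind.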