01-01-2017, 08:14 PM
Let us assume a set of rules for the VMS such that applying a rule to one word of the manuscript yields another word of the manuscript:
- The most basic rule is that a glyph can be replaced with a similarly shaped glyph. For instance, for a word containing t there most probably also exists a word containing k instead of t. Because of this rule you can expect the word qokol beside the word qotol.
- A second rule is that nearly every glyph can be removed. Because of this rule you can expect, beside the word qokol, also the words okol, kol and ol.
- A third, more complex rule is that you can add a glyph at the beginning of a word. For instance, it is possible to add an o in front of a common word. In this case the glyph next to the o is replaced with a gallows glyph in roughly eight out of ten cases. Because of this rule you can find, beside chedy, also words like okedy, otedy, qokedy and qotedy. But since o or o+gallows is sometimes added without removing the next glyph, the words ochedy and okchedy also exist.
Besides these three rules, some more rules exist. For instance, a word can be duplicated. Because of this rule the VMS contains, beside the word dy, also the words dydy and dydydy.
But in this post I only want to demonstrate that the first three rules are a good way to describe the words in the VMS. For this purpose I will use the most frequent words daiin, ol and chedy as examples. (For calculating the word frequencies I use the transcription by Takeshi Takahashi.)
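To make the three rules concrete, here is a minimal Python sketch of how they could be applied to a transcribed word. It is only an illustration, not the method used to produce the counts in this post. The function names are made up for this example, and treating ch and sh as single glyphs is an assumption about the transcription alphabet.

```python
# Hypothetical helpers illustrating the three rules; all names are invented here.
SIMILAR = {"t": "k", "k": "t"}   # rule 1: similarly shaped glyphs (only t/k shown)
GALLOWS = ("k", "t")             # gallows glyphs used in the examples


def split_glyphs(word):
    """Split a transcribed word into glyphs, assuming ch and sh are single glyphs."""
    glyphs, i = [], 0
    while i < len(word):
        step = 2 if word[i:i + 2] in ("ch", "sh") else 1
        glyphs.append(word[i:i + step])
        i += step
    return glyphs


def swap_similar(word):
    """Rule 1: replace one glyph with a similarly shaped glyph."""
    g = split_glyphs(word)
    return {"".join(g[:i] + [SIMILAR[c]] + g[i + 1:])
            for i, c in enumerate(g) if c in SIMILAR}


def drop_glyph(word):
    """Rule 2: remove one glyph."""
    g = split_glyphs(word)
    return {"".join(g[:i] + g[i + 1:]) for i in range(len(g))} - {""}


def add_prefix(word):
    """Rule 3: add o- or qo-, with or without a gallows glyph."""
    rest = "".join(split_glyphs(word)[1:])
    variants = set()
    for prefix in ("o", "qo"):
        variants.add(prefix + word)                 # first glyph kept, no gallows
        for gallows in GALLOWS:
            variants.add(prefix + gallows + rest)   # first glyph replaced by gallows
            variants.add(prefix + gallows + word)   # gallows added, first glyph kept
    return variants


print(sorted(swap_similar("qotol")))
print(sorted(add_prefix("chedy")))
```

The sketch generates candidate words only. Whether a candidate really occurs in the manuscript still has to be checked against a transcription.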
For the most frequently used word, daiin, it is possible to build a tree of similar words starting with o- or qo-. The frequencies of these words are very characteristic:
Level 1: daiin (863 times)
Level 2: okaiin (212 times) | otaiin (154) | odaiin (60)
Level 3: qokaiin (262 times), kaiin (65) | qotaiin ( 79), taiin (42) | qodaiin (42)
There are more words starting with ok- or ot- than with od-, and the word starting with qok- is even more frequent than the word starting with ok-.
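A tree like the one for daiin can be assembled from a table of word counts. The following Python sketch assumes the counts are already available as a Counter. The function variant_tree and its stem parameter are hypothetical names, and the sample counts are simply the figures quoted above. With a real transcription the Counter would have to be built from the parsed text first.

```python
from collections import Counter


def variant_tree(word, stem, counts):
    """Return the three tree levels as lists of (word, count) pairs.

    word  -- the level 1 word, e.g. "daiin"
    stem  -- what follows the added gallows glyph, e.g. "aiin" for daiin
             (for ol the whole word is kept, so stem is "ol")
    """
    levels = (
        [word],
        ["ok" + stem, "ot" + stem, "o" + word],
        ["qok" + stem, "k" + stem, "qot" + stem, "t" + stem, "qo" + word],
    )
    # A Counter returns 0 for words that never occur.
    return [[(w, counts[w]) for w in level] for level in levels]


# Sample data: the counts quoted in this post for the daiin tree.
counts = Counter({
    "daiin": 863,
    "okaiin": 212, "otaiin": 154, "odaiin": 60,
    "qokaiin": 262, "kaiin": 65, "qotaiin": 79, "taiin": 42, "qodaiin": 42,
})

tree = variant_tree("daiin", "aiin", counts)
for number, level in enumerate(tree, start=1):
    print(f"Level {number}: " + " | ".join(f"{w} ({c})" for w, c in level))
```

Passing the stem separately keeps the sketch free of any guess about which glyph gets replaced. For chedy the call would be variant_tree("chedy", "edy", counts), and for ol it would be variant_tree("ol", "ol", counts).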
Surprisingly, the same characteristic pattern can also be found for words similar to chedy:
Level 1: chedy (510 times)
Level 2: okedy (118 times) | otedy (155) | ochedy (8)
Level 3: qokedy (272 times), kedy (44) | qotedy ( 91), tedy (42) | qochedy (2)
For the word ol the o is not replaced with a gallows glyph. Instead, ok- or ot- was added in these cases.
Apart from this specific feature, the tree for ol shares the same characteristic:
Level 1: ol (537 times)
Level 2: okol ( 82 times) | otol (86) | ool (-)
Level 3: qokol (104 times), kol (37) | qotol (47), tol (48) | qool (4)
It is possible to find this pattern for other words as well, provided they are frequent enough that variants starting with qo- and o- also exist for them. Here is a list of words I have checked so far:
Code:
Words similar to daiin, dain, dair:
Level 1: daiin (863 times)
Level 2: okaiin (212 times) | otaiin (154) | odaiin (60)
Level 3: qokaiin (262 times), kaiin (65) | qotaiin ( 79), taiin (42) | qodaiin (42)
Level 1: dain (211 times)
Level 2: okain (144 times) | otain (96) | odain (18)
Level 3: qokain (279 times), kain (48) | qotain (64), tain (16) | qodain (11)
Level 1: dair (106 times)
Level 2: okair (22 times) | otair (21) | odair (-)
Level 3: qokair ( 17 times), kair (14) | qotair ( 6), tair (13) | qodair (3)
Words similar to chedy, chey, cheey, cheedy:
Level 1: chedy (510 times)
Level 2: okedy (118 times) | otedy (155) | ochedy (8)
Level 3: qokedy (272 times), kedy (44) | qotedy ( 91), tedy (42) | qochedy (2)
Level 1: chey (344 times)
Level 2: okey (63 times) | otey (57) | ochey (8)
Level 3: qokey (107 times), key (14) | qotey (24), tey (11) | qochey (6)
Level 1: cheey (174 times)
Level 2: okeey (177 times) | oteey (140) | ocheey (3)
Level 3: qokeey (308 times), keey (44) | qoteey ( 42), teey (20) | qocheey (2)
Level 1: cheedy ( 59 times)
Level 2: okeedy (105 times) | oteedy (100) | ocheedy (1)
Level 3: qokeedy (305 times), keedy (53) | qoteedy ( 74), teedy (13) | qocheedy (-)
Words similar to dy and chdy:
Level 1: dy (270 times)
Level 2: oky (102 times) | oty (115) | ody (47)
Level 3: qoky (147 times), ky (25) | qoty ( 87), ty (16) | qody (17)
Level 1: chdy (150 times)
Level 2: okchdy (21 times) | otchdy (30) | ochdy (1)
Level 3: qokchdy ( 56 times), kchdy (20) | qotchdy (23), tchdy (15) | qochdy (-)
Words similar to ol, dol, or, dor, ar, dar, al and dal:
Level 1: ol (537) | dol (117)
Level 2: okol ( 82 times) | otol (86) | ool ( -) | odol ( 2)
Level 3: qokol (104 times), kol (37) | qotol (47), tol (48) | qool ( 4) | qodol ( 1)
Level 1: or (363) | dor (73)
Level 2: okor ( 34 times) | otor (46) | oor ( 3) | odor ( 8)
Level 3: qokor ( 36 times), kor (26) | qotor (29), tor (23) | qoor ( 8) | qodor ( 2)
Level 1: ar (350) | dar (318)
Level 2: okar (129 times) | otar (141) | oar ( 16) | odar ( 24)
Level 3: qokar (152 times), kar (52) | qotar ( 63), tar (43) | qoar ( 12) | qodar ( 11)
Level 1: al (260) | dal (253)
Level 2: okal (138 times) | otal (143) | oal ( 3) | odal ( 13)
Level 3: qokal (191 times), kal (23) | qotal ( 59), tal (20) | qoal ( 4) | qodal ( 7)
Words similar to dam, dar, chol and char:
Level 1: dam (98 times)
Level 2: okam (26 times) | otam (47) | odam (6)
Level 3: qokam (25 times), kam (9) | qotam (12), tam (5) | qodam (3)
Level 1: dar (318 times)
Level 2: otar (141 times) | okar (129) | odar (24)
Level 3: qokar (152 times), kar (52) | qotar ( 63), tar (43) | qodar (11)
Level 1: chol (72 times)
Level 2: okchol (15 times) | otchol (28) | ochol (5)
Level 3: qokchol (18 times), kchol (21) | qotchol (13), tchol (13) | qochol (2)
Level 1: char (396 times)
Level 2: okchar (4 times) | otchar ( 6) | ochar (2)
Level 3: qokchar (1 time), kchar (2) | qotchar ( 3), tchar (4) | qochar (1)
What does this mean? It is possible to describe the words in the VMS by their relations. With the rules found it is possible not only to predict the existence of a word but also to predict its frequency.
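The frequency pattern can be spot-checked mechanically. The Python sketch below tests the two observations made for the daiin tree (ok- and ot- both more frequent than the plain o- variant, and qok- more frequent than ok-) against the three main trees, using only counts quoted in this post. The name follows_pattern is made up for this example, and reading the "-" entry for ool as 0 is an assumption.

```python
# Counts quoted in this post for the three main trees.
# "o" is the variant with a plain o- prefix (odaiin, ochedy, ool).
TREES = {
    "daiin": {"ok": 212, "ot": 154, "o": 60, "qok": 262},
    "chedy": {"ok": 118, "ot": 155, "o": 8, "qok": 272},
    "ol": {"ok": 82, "ot": 86, "o": 0, "qok": 104},  # ool is listed as "-", taken as 0
}


def follows_pattern(tree):
    """True if ok- and ot- both beat the plain o- variant and qok- beats ok-."""
    return min(tree["ok"], tree["ot"]) > tree["o"] and tree["qok"] > tree["ok"]


results = {word: follows_pattern(tree) for word, tree in TREES.items()}
for word, ok in results.items():
    print(word, "follows the pattern" if ok else "deviates from the pattern")
```

The same check could be run over the other trees in the list to see how often the pattern holds for less frequent words.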