The Voynich Ninja

Full Version: sh_ and ch_ compose the same words
Based on EVA. 

1. 
I assume that sh_ and ch_ words are the same, because they behave the same and have the same contacts

2.
If sh... words are always compacter than ch..words in the entire text,

3.
could this strongly signal that sh...words are compacter because the diacritic mark above signals an abbreviation?

4.
If the answer is yes, can we assume that sh... words are abbreviated versions of the same ch... words, and what letters could be compacted there?
Those are a lot of questions.

I will answer just two of them:

1) The ch and sh do not behave the same in the VMS. This is based on my own research.

Could the cap on sh be an abbreviation symbol?

2) In Latin this is usually an abbreviation symbol, quite a common one. But is it an abbreviation symbol in the VMS? I don't know.

In the VMS, it is sometimes written like a curved macron and sometimes like a superscripted Latin 9 abbreviation (-us/-um), but this might not have anything to do with its meaning. It might only mean that the scribes were accustomed to writing in Latin and sometimes reverted to familiar movements of the hand, so the distinction has to be studied on a statistical level for now.
Thank you. My points have a logical flow, as shown by using the word "this" in 3)  and "if" in 4).

If one disagrees with 1) and 2) one can still agree with 3) 
because that concerns the conclusion that compactness can only signal abbreviation and nothing else.  

I am interested in the method of drawing such conclusions about letters and syllables. It is also interesting to see how people react to compact statements, and whether one is willing to draw conclusions, follow a certain path, and see where it leads even if one disagrees with parts of it.
(05-11-2019, 08:56 PM)Davidsch Wrote: 1.
I assume that sh_ and ch_ words are the same, because they behave the same and have the same contacts

Sh and ch are generally very [link]. By "contact" do you mean the next glyph?

Quote:2.
If sh... words are always compacter than ch..words in the entire text,

If by compacter you mean shorter, by average EVA length of word type or word token, yes.

Quote:3.
could this strongly signal that sh...words are compacter because the diacritic mark above signals an abbreviation?

The difference in average length (less than 0.3 by my count) would only be evidence of something always missing in sh... words if both sets of words had a reason to have the same average length. In the case of an abbreviation, sh... and ch... words are not the same so there is nothing that suggests that the two sets of words should have the same average length. Even if such a reason existed, it could be argued that an abbreviation would offset the average length by at least one.
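The two averages mentioned above (by word type and by word token) can be sketched as follows. This is a minimal illustration, not the script behind the figures in this thread: the function name `average_lengths` and the token list are hypothetical, and length is simply counted in EVA characters.

```python
from collections import Counter

def average_lengths(tokens, prefix):
    """Return (mean length over word types, mean length over word tokens)
    for the tokens that start with `prefix`."""
    matching = [t for t in tokens if t.startswith(prefix)]
    if not matching:
        raise ValueError(f"no tokens start with {prefix!r}")
    counts = Counter(matching)
    # Type average: every distinct word counts once.
    type_mean = sum(len(w) for w in counts) / len(counts)
    # Token average: every occurrence counts.
    token_mean = sum(len(w) * n for w, n in counts.items()) / len(matching)
    return type_mean, token_mean

# Toy input only, NOT transcription data.
toy_tokens = ["chedy", "chedy", "chol", "shedy", "shy", "shy"]
ch_type, ch_token = average_lengths(toy_tokens, "ch")
sh_type, sh_token = average_lengths(toy_tokens, "sh")
print(ch_type - sh_type, ch_token - sh_token)
```

Run on a full transcription, the two differences would show whether the gap depends on counting types or tokens.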
This graph is based on Takahashi's transcription. All ch- and Sh- words with at least 50 occurrences are represented, together with their Sh-/ch- counterpart (which sometimes has less than 50 occurrences). I think that these numbers confirm the "replaceability" pointed out by nablator, and that they suggest that the two members of each couple are related.
E.g. chedy (501 occurrences) appears to be related to Shedy (427 occurrences). The two words tend to occur on the same pages. For instance [link], we find reduplicating and quasi-reduplicating sequences like shedy.shedy.chedy and chedy.shedy.

[attachment=3658]

I believe this is somehow similar to what happens with ok- vs qok-. Like qok- adds something to ok-, Sh- adds something to ch-, but in this case the addition is not a whole character like q, but a single stroke, the superscript-curl above the bench. If the script is phonetic, the curl could be equivalent to a whole character, or a few characters, and be an abbreviation; or it could be some kind of lesser modification, like an accent or an umlaut.
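The pairing used for the graph (every ch- or Sh- word with at least 50 occurrences, shown with its counterpart even when the counterpart is rarer) could be reproduced roughly like this. It is a sketch under assumptions: `paired_counts` is a hypothetical helper, the input is a plain word-to-count mapping, and Sh- is assumed to be written in lowercase as `sh`.

```python
def paired_counts(word_counts, min_count=50):
    """List (ch_word, ch_count, sh_word, sh_count) for every pair in which
    at least one member reaches min_count occurrences."""
    ch_forms = set()
    for word, n in word_counts.items():
        if n < min_count:
            continue
        if word.startswith("ch"):
            ch_forms.add(word)
        elif word.startswith("sh"):
            # Store the pair under its ch- form so each pair appears once.
            ch_forms.add("ch" + word[2:])
    rows = []
    for ch in sorted(ch_forms):
        sh = "sh" + ch[2:]
        rows.append((ch, word_counts.get(ch, 0), sh, word_counts.get(sh, 0)))
    return rows

# The only pair whose counts are quoted in the post above.
print(paired_counts({"chedy": 501, "shedy": 427}))
```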
The gallows have a specific different behavior and that is the main reason I would like to focus on these two. 
If the replies here deviate much from my research, conclusions and assumptions there is no point in going towards more complex ones.

Notes:

by "words" I meant Voynich words.
by "compact" I meant shorter
by "behave" I mean anything that you can assign to it. For example: the letters inside the word, the words attached to the left and right, the prefix, the suffix, etc.

Of course there are some differences, but the number of differences between CH and SH is very small.
I have a partly written blog about this based on notes I jotted down in 2008. I am never going to have time to finish it or to create the graphs. I have 41 other unfinished blogs I need to get done, so instead I'll just post a very short textual summary of what I observed.


Frequency and Similarities:
  • The ch patterns are about twice as frequent as the sh patterns (it's slightly more than twice, but twice is close enough to observe that the percentage balance remains about the same even when you look at the text that follows these glyphs).

  • When you look at the statistics for common patterns, like chy, chey, chol, chor, chod, cheey, etc., the percentage frequency and order is very similar to shey, shy, shol, sheey, shor, etc.

  • When you look at the statistics for less common patterns, like chos, chaiin, choT, choy, etc., the percentage frequency and order is somewhat similar to shos, shaiin, shoT, shoy, etc.
So, except for ch being more frequent, the most common sequences do look like they might be interchangeable if one looks only at the patterns that follow ch or sh.
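One way to make the "percentage frequency and order" comparison concrete is to compute, for each prefix, the share taken by each continuation. The sketch below assumes word-initial ch/sh only and uses a hypothetical helper `following_shares`; the counts are toy values, not figures from a transcription.

```python
from collections import Counter

def following_shares(word_counts, prefix):
    """Map each remainder after `prefix` to its share of all prefix-initial tokens."""
    rest = Counter()
    for word, n in word_counts.items():
        if word.startswith(prefix):
            rest[word[len(prefix):]] += n
    total = sum(rest.values())
    return {r: n / total for r, n in rest.items()} if total else {}

# Toy counts only, chosen so that ch is twice as frequent as sh.
toy_counts = {"chy": 2, "chol": 2, "shy": 1, "shol": 1}
ch_shares = following_shares(toy_counts, "ch")
sh_shares = following_shares(toy_counts, "sh")
# Identical shares despite different raw counts.
print(ch_shares, sh_shares)
```

Comparing shares rather than raw counts removes the effect of ch being roughly twice as frequent overall.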

Beginning position is somewhat similar in overall numbers, but a bit higher for sh in terms of percentages:
  • About 53% of ch sequences are at the beginning of tokens (close to 6,000 instances).
  • About 70% of sh sequences are at the beginnings of tokens (a little more than 3,000 instances).
This is probably not a big enough difference to be significant, but I mention it for the record.
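The token-initial percentages above could be obtained with a count along these lines. This is only a sketch: `initial_share` is a hypothetical name, the sequence is matched as a plain EVA substring, and the four tokens are a toy sample.

```python
def initial_share(tokens, seq):
    """Return (occurrences of seq at token start, all occurrences, share at start)."""
    total = sum(t.count(seq) for t in tokens)
    initial = sum(1 for t in tokens if t.startswith(seq))
    share = initial / total if total else 0.0
    return initial, total, share

# Toy sample, not a transcription.
toy_tokens = ["chedy", "qokchy", "shedy", "chkshy"]
print(initial_share(toy_tokens, "ch"))
print(initial_share(toy_tokens, "sh"))
```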



But... when you look at what precedes ch or sh, here is something to consider, which is why I stated upthread that ch and sh behave differently...


Preceding and following sequences differ in their general patterns. Here are some examples:

  • k precedes shy less than half as often as one would expect if one uses the same general frequency parameters as for chy.

  • […] precedes shy only 1/4 as often as one would expect if using the same general frequency parameters for chy.
Similarly, when ch or sh are preceded by d, dsh is almost as numerous as dch (83% rather than approx. 48% if it were similar to "following" patterns).

When chol and shol follow k, a difference is noted in the other direction. If the percentages were similar to sequences following sh, one would expect about 38 instances of kshol, but there are only about 9 (depending on transcript).
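The "expected" figures above are not derived in the post. One plausible reading, and it is only an assumption, is that the count of the ch form after a given glyph is scaled by the overall sh/ch ratio for the same ending. The helper name and the numbers below are hypothetical.

```python
def expected_sh_count(preceded_ch, bare_ch, bare_sh):
    """Scale the count of <glyph>+ch... by the overall sh.../ch... ratio.

    preceded_ch: occurrences of the ch form after the glyph (e.g. kchol)
    bare_ch, bare_sh: occurrences of the bare forms (e.g. chol, shol)
    """
    if bare_ch <= 0:
        raise ValueError("bare_ch must be positive")
    return preceded_ch * bare_sh / bare_ch

# Arbitrary illustrative numbers, not taken from any transcription.
expected = expected_sh_count(preceded_ch=40, bare_ch=200, bare_sh=100)
observed = 5
print(expected, observed / expected)
```

An observed-to-expected ratio well below 1 would correspond to the "less often than one would expect" cases, and a ratio above 1 to the d case.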


Summary
  • sh sequences are slightly more likely to be at the beginning of tokens than ch sequences in terms of percentages.
  • What follows ch or sh is more-or-less similar in both composition and frequency if one takes a broad view of the patterns.
  • What precedes ch or sh diverges from the percentages one sees for sequences that follow ch or sh, and they can diverge in either direction.


These observations do not negate the possibility that ch and sh are interchangeable (it depends on how one parses Voynichese), but they are worth keeping in mind.


P.S., I also looked into whether ee was analogous to ch or sh, but it appears to have its own patterns and is not as similar to either one as ch and sh are to each other.
JKP's observations are correct. The glyphs 'ch' and 'sh' must be compatible with the shape of the previous glyph and are also influenced by their position within a word. Here are some detailed examples:

Examples for 'ch' / 'sh' in initial position:
chy (155 times) chey (344) chdy (150) chedy (501)
shy (104 times) shey (283) shdy ( 46) shedy (426)

Examples for 'ch' / 'sh ' within a word:
qokchy (69) okchy (39) kchy (29) tchy (24) otchy (48) qotchy (63)
qokshy (10) okshy (10) kshy ( 5) tshy ( 5) otshy ( 4) qokshy (10)

dchedy (27) kchedy (22) tchedy (33) pchedy (34) fchedy (11)
dshedy (36) kshedy ( 6) tshedy ( 8) pshedy ( 3) fshedy ( 2)

Examples for words with two instances of 'ch' / 'sh':
chchy ( 2) chkchy ( 6) chtchy ( 2)
chshy ( 2) chkshy ( 1) chtshy ( 1)
shchy ( 5) shkchy ( 2) shtchy ( 3)
shshy ( 1) shkshy ( 1) shtshy ( 2)

It is also interesting to check the occurrences within the VMS. For instance <chkshy> occurs together with <shkchedy> and <chckhey> on page [link] and <shkchy> occurs together with <chpchy>, <shetsho>, and <chepchy> on page [link].
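The counts listed above can be turned into sh/ch ratios per preceding glyph, which makes the dependence on the previous glyph easier to see. The numbers are copied from the listing. The qotchy column is left out because its sh counterpart is printed as qokshy there. The table and function names are hypothetical.

```python
# (ch count, sh count), keyed by what precedes ch/sh; "" means word-initial.
CHY = {"": (155, 104), "qok": (69, 10), "ok": (39, 10),
       "k": (29, 5), "t": (24, 5), "ot": (48, 4)}
CHEDY = {"": (501, 426), "d": (27, 36), "k": (22, 6),
         "t": (33, 8), "p": (34, 3), "f": (11, 2)}

def sh_to_ch_ratios(table):
    """Return the sh/ch count ratio for each preceding sequence, rounded to 2 places."""
    return {prefix: round(sh / ch, 2) for prefix, (ch, sh) in table.items()}

for name, table in (("-chy/-shy", CHY), ("-chedy/-shedy", CHEDY)):
    print(name, sh_to_ch_ratios(table))
```

On these counts the word-initial ratios are 0.67 and 0.85, the ratios after k, t, p and f all fall below 0.3, and d is the only case above 1.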
Of course I do *not* have any observations other than the facts obtained by counting things in the transcriptions!

It is clear after all these years that the difference is in the perception, not in the numbers.

The number of votes in the poll shows that interest is low, and the typical answers have convinced me that, although intentions might be good, the only way to elaborate on perception is by exchanging data, and for that I do not find this forum the best way to do so. Sorry, I could publish everything I have done and fill the entire forum, but I think it is better to work on it silently for some more years and spend my hours like that.
I can't really answer the poll. You asked if ch and sh were similar.

  • Yes, in terms of shape they are similar. They are more similar to each other than they are to the other shapes.
  • Yes, in terms of what follows, they are similar (based on the numbers).
  • No, they are not similar in terms of how often they fall at the beginnings of tokens, but is the difference enough to be statistically significant?
  • No, in terms of what precedes them, they are dissimilar (based on the numbers).
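The question about token-initial position can be put to a standard two-proportion z-test. The sketch below uses only the rounded figures quoted earlier in the thread (about 53% of ch with close to 6,000 initial instances, about 70% of sh with a little more than 3,000). The totals are back-calculated from those roundings, so they are approximations, and the test assumes independent tokens, which is a strong assumption for this text.

```python
import math

def two_proportion_z(hits_a, total_a, hits_b, total_b):
    """z statistic for the difference between two proportions (pooled variance)."""
    p_a = hits_a / total_a
    p_b = hits_b / total_b
    pooled = (hits_a + hits_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (p_a - p_b) / se

# Rounded figures from the thread; totals are back-calculated approximations.
ch_initial, sh_initial = 6000, 3000
ch_total = round(ch_initial / 0.53)
sh_total = round(sh_initial / 0.70)

z = two_proportion_z(sh_initial, sh_total, ch_initial, ch_total)
print(round(z, 1))
```

With these inputs z comes out far above the usual 1.96 threshold. That would count as significant only if tokens were independent draws, and the clustering of chedy and shedy on the same pages noted earlier in the thread suggests they are not, so the result is best read as a rough indication.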