(28-09-2021, 07:18 PM)nablator Wrote: Hello Patrick,
I'm puzzled by your notations:
ch>cKh (243 tokens): If it is ch before cKh in the same word, there should be more.
n.ch>cKh (77 tokens): Why n? It never precedes cKh. I don't understand.
I'm sorry for not explaining my notations clearly. Let me try this again.
The ">" is a symbol I've been using to distinguish transitional probabilities from bigrams -- i.e., "A>B" stands for the probability of A (the part before the ">") being followed by B (the part after it).
By "ch>cKh" I mean the likelihood of any [ch] being followed immediately by [cKh], and not just that [cKh] occurs somewhere later in the same vord. That is, of all 10177 tokens of [ch], 243 (or 2.39%) are followed immediately by [cKh]; hence, after any given token of [ch] there's a 2.39% chance the next glyph(s) will be [cKh]. This is based on ZL_ivtff_1r.txt and limited to text in paragraphs (no labels, radii, circles, etc.).
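For anyone who wants to reproduce this kind of count, here is a minimal Python sketch of the idea. It is not the script used for the figures above. It assumes the transliteration has already been split into a list of glyph tokens (so that [ch] and [cKh] are single items and "." marks a space); the function name and the toy data are hypothetical.

```python
def transition_probability(glyphs, before, after):
    """Count how often the glyph sequence `before` is immediately followed by `after`.

    Returns (total occurrences of `before`, occurrences followed by `after`, ratio).
    """
    nb, na = len(before), len(after)
    total = followed = 0
    for i in range(len(glyphs) - nb + 1):
        if glyphs[i:i + nb] == before:
            total += 1
            # Look only at the glyph(s) directly after the match.
            if glyphs[i + nb:i + nb + na] == after:
                followed += 1
    ratio = followed / total if total else 0.0
    return total, followed, ratio


# Toy data only (hypothetical, not taken from the transliteration file).
toy = ["ch", "cKh", "y", ".", "ch", "e", "y", ".", "ch", "o", "l"]
total, followed, ratio = transition_probability(toy, ["ch"], ["cKh"])
print(total, followed, round(100 * ratio, 2))  # 3 1 33.33
```

With the real counts the same ratio is 243 / 10177, which rounds to 2.39%.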
By "n.ch," I mean the sequence [n] followed by [ch], as an n-gram. I inserted the period to reflect the fact that when [n] is followed by [ch] there's almost always a space inserted between, but maybe it would have been better to write "n,ch" to acknowledge the handful of exceptions. In any case, I count 1287 total tokens of [n.ch], [nch], and [n,ch].
By "n.ch>cKh" I then mean the condition or likelihood of a token of [n,ch] being followed in turn by [cKh]. I count 96 instances like this. So given any token of the sequence [n,ch], the probability of the next glyph(s) being [cKh] is 7.46%.
So what I meant to point out was that, if we disregard spaces, the likelihood of [ch] or [Sh] being followed by a benched gallows appears to go up if the [ch] or [Sh] is preceded in turn by [n] or [y].
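One way to check a pattern like this is to group every [ch] or [Sh] by the glyph that precedes it, with spaces dropped, and compare how often a benched gallows follows in each group. The sketch below is only an illustration under assumptions: the input is a pre-tokenized glyph list, "." and "," are the space markers, only [cKh] and [cTh] are treated as benched gallows (the two discussed here), and the function name and toy data are hypothetical.

```python
from collections import defaultdict

# Assumption: only the two benched gallows discussed in this post.
BENCHED_GALLOWS = {"cKh", "cTh"}


def rates_by_preceding_glyph(glyphs, targets=("ch", "Sh"), separators=(".", ",")):
    """For each glyph preceding a target, return (total, followed by benched gallows, ratio)."""
    # Disregard spaces, so that [n.ch], [n,ch] and [nch] are treated alike.
    seq = [g for g in glyphs if g not in separators]
    counts = defaultdict(lambda: [0, 0])
    for i in range(1, len(seq)):
        if seq[i] in targets:
            prev = seq[i - 1]
            counts[prev][0] += 1
            if i + 1 < len(seq) and seq[i + 1] in BENCHED_GALLOWS:
                counts[prev][1] += 1
    return {prev: (total, hits, hits / total) for prev, (total, hits) in counts.items()}


# Toy data only (hypothetical, not real counts).
toy = ["d", "a", "i", "n", ".", "ch", "cKh", "y", ".", "o", "l", ".",
       "ch", "e", "y", ".", "Sh", "cTh", "y", ".", "a", "i", "n", ".", "Sh", "o", "l"]
for prev, (total, hits, ratio) in sorted(rates_by_preceding_glyph(toy).items()):
    print(prev, total, hits, round(100 * ratio, 2))
```

On a real transliteration the rows for [n] and [y] would be the ones to compare against the overall rate.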
I hadn't checked the specific probabilities of [.ch] and [.Sh] being followed by benched gallows (i.e., cases in which [ch] or [Sh] is vord-initial but not line-initial). Easy enough to do, however. I count 5063 tokens of [.ch], 226 tokens (4.46%) of [.chcKh], and 119 tokens (2.35%) of [.chcTh]; and 2606 tokens of [.Sh], 86 tokens (3.30%) of [.ShcKh], and 44 tokens (1.69%) of [.ShcTh]. These are all fairly close to the percentages associated with a preceding [y].
But the percentages associated with a preceding [n] are still noticeably higher. So it looks as though a vord beginning [ch] or [Sh] is significantly more likely to continue with [cKh] or [cTh] if it's preceded by a vord ending in [n] than it is otherwise. Again, apologies if this is old news.