In this thread, the terms 'normal' and 'binomial' have been used more or less interchangeably, which isn't a big problem because we're talking about an approximation of the real word length distribution.
However, it is worthwhile to think more in terms of 'binomial' because of an interesting property of the binomial distribution. I guess everyone is familiar with the simple trick to compute the binomial distribution for 'N' using a pyramid (Pascal's triangle) with the number 1 at the top, and moving down to row N by adding the two numbers above, one to the left and one to the right.
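For those who prefer code to pictures, here is a minimal sketch of the pyramid rule in Python. The function name pascal_row is my own choice for illustration; row 0 is the single 1 at the top.

```python
def pascal_row(n):
    """Return row n of the pyramid (row 0 is [1])."""
    row = [1]
    for _ in range(n):
        # Each new entry is the sum of the two numbers above it,
        # one to the left and one to the right (0 outside the pyramid).
        row = [left + right for left, right in zip([0] + row, row + [0])]
    return row

for n in range(5):
    print(pascal_row(n))
```

Dividing row N by 2^N turns the counts into the probabilities of a binomial distribution with p = 50%.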
What does this have to do with this thread?
Let me go in small steps.
Let's assume that we have a vocabulary with a binomial word length distribution.
Now we want to create a new vocabulary out of that, and we do this by optionally prefixing the words of the old vocabulary with the letter 'a', so that the prefixed and unprefixed forms are equally likely (50% each). (*)
The new vocabulary, which contains every old word both with and without the prefix, is twice the size of the old one, and its word length distribution is again binomial.
(Note *: this is true under some conditions; for ease of understanding, let's assume that the old vocabulary had no words starting with 'a'.)
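A small sketch of this step, assuming a toy vocabulary that I invented purely for illustration (four made-up words whose lengths have the counts 1 2 1, none starting with 'a'):

```python
from collections import Counter

# Toy vocabulary (invented for illustration); lengths 1, 2, 2, 3 give counts 1 2 1.
old_vocab = {"o", "ol", "or", "dol"}

# Every old word appears both without and with the prefix 'a'.
new_vocab = old_vocab | {"a" + w for w in old_vocab}

def length_counts(vocab):
    """Map word length -> number of words of that length."""
    return dict(sorted(Counter(len(w) for w in vocab).items()))

print(length_counts(old_vocab))  # {1: 1, 2: 2, 3: 1}
print(length_counts(new_vocab))  # {1: 1, 2: 3, 3: 3, 4: 1}
```

The new counts are the old counts plus the same counts shifted by one position, which is exactly the rule for stepping down one row in the pyramid.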
The rule to build the pyramid can also be generalised, in a way that is not really practically useful, but makes the binomial property a bit more 'interesting'.
Rather than computing row 'N' from row 'N-1' by adding the numbers above with coefficients "1 1", one can also compute it from row 'N-2' by adding the numbers from that row with coefficients "1 2 1", or from row 'N-3' by adding the numbers from that row with coefficients "1 3 3 1".
These sequences of coefficients are themselves binomial distributions.
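The generalised rule is what is usually called a convolution. A minimal sketch, with convolve as my own helper name, starting from row 3 of the pyramid:

```python
def convolve(xs, ys):
    """Combine two rows of counts: out[i + j] accumulates xs[i] * ys[j]."""
    out = [0] * (len(xs) + len(ys) - 1)
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            out[i + j] += x * y
    return out

row3 = [1, 3, 3, 1]

# Two rows down in one step, using coefficients "1 2 1".
print(convolve(row3, [1, 2, 1]))     # [1, 5, 10, 10, 5, 1]

# Three rows down in one step, using coefficients "1 3 3 1".
print(convolve(row3, [1, 3, 3, 1]))  # [1, 6, 15, 20, 15, 6, 1]
```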
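This means for our vocabulary example that, if one has two separate 'short' vocabularies each with a binomial distribution, one can make a new vocabulary in which each word consists of a prefix from one and a base from the other, using all combinations. The length counts of the new vocabulary are then the counts of one vocabulary, added up with the counts of the other as coefficients, just as in the generalised pyramid rule. This new vocabulary is therefore also binomial.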
Since the 'base' in this example could have been the result of combining a binomial 'stem' with a binomial 'suffix', we have now found one way to create a vocabulary of binomial distribution, namely by (arbitrarily) combining a prefix, a stem and a suffix which are each also binomial.
If we wanted to do this, and end up with a distribution where the shortest length is '1', then we could for example impose that the stem has to have at least one character, while the prefix and suffix can also be empty (i.e. their length distributions start at length 0).
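To make the prefix, stem and suffix construction concrete, here is a sketch with three tiny vocabularies that I made up for illustration. The prefixes have length counts 1 2 1 (starting at length 0), the stems 1 1 (starting at length 1) and the suffixes 1 1 (starting at length 0). The strings are chosen so that no two combinations collide, which is one of the 'conditions' that a real vocabulary would not automatically satisfy.

```python
from collections import Counter
from itertools import product

# Invented components; "" is the empty prefix or suffix.
prefixes = ["", "o", "q", "qo"]   # lengths 0, 1, 1, 2 -> counts 1 2 1
stems = ["k", "ke"]               # lengths 1, 2       -> counts 1 1
suffixes = ["", "y"]              # lengths 0, 1       -> counts 1 1

# Combine every prefix with every stem and every suffix.
words = [p + s + x for p, s, x in product(prefixes, stems, suffixes)]

counts = dict(sorted(Counter(len(w) for w in words).items()))
print(len(words), len(set(words)))  # 16 16
print(counts)                       # {1: 1, 2: 4, 3: 6, 4: 4, 5: 1}
```

The result is the row 1 4 6 4 1, with the shortest word having length 1 because the stem is never empty.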
This can also be taken to the extreme, by having N components, in a fixed order, each of which may appear with 50% probability.
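A sketch of this extreme case, under the assumption (mine, for simplicity) that each of the N components is a single, distinct letter, so that the word length is simply the number of components present. The six letters are arbitrary placeholders.

```python
from collections import Counter
from itertools import product
from math import comb

# Hypothetical single-letter components in a fixed order (N = 6).
components = ["q", "o", "k", "e", "d", "y"]
n = len(components)

# Each component is either present or absent: 2**N words in total.
words = [
    "".join(c for c, present in zip(components, mask) if present)
    for mask in product([False, True], repeat=n)
]

counts = Counter(len(w) for w in words)
row = [counts[k] for k in range(n + 1)]
print(row)                                      # [1, 6, 15, 20, 15, 6, 1]
print(row == [comb(n, k) for k in range(n + 1)])  # True
```

Note that this version includes one empty word of length 0. Making one component mandatory, as with the stem above, would shift the shortest length to 1.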