I'd like to toss in Guanche, the native language of the Canary Islands, suggested by another commenter. It was a proto-Berber language, long isolated, then subject to foreign influence, before the islands' conquest just after 1400. The language then went extinct.
(13-01-2026, 08:12 PM)Jorge_Stolfi Wrote: I think it is more useful to consider how a writing system scores in two respects. For a person with basic command of the language,
(a) reading determinism: how reliably can they correctly pronounce an unfamiliar written word.
(b) writing determinism: how reliably can they write down the correct spelling of an unfamiliar spoken word.
Italian gets maybe a 90% score on (a); not 100% because stress is phonemic but is not marked, and because it has two "e" sounds and two "o" sounds that are not distinguished in writing. ("botte" can have either "o", meaning either "hits" or "barrel"; and "pesca" may be either "peach" or "fishing" depending on the "e"). Offhand I think that it has 99.9% on (b), but I may be wrong.
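One crude way to put numbers on (a) and (b) is sketched below in Python. It is only a rough proxy, not a method anyone in this thread has proposed: it takes a word list of (spelling, pronunciation) pairs and counts how many spellings have a single pronunciation and how many pronunciations have a single spelling. The function name `determinism` and the toy lexicon are assumptions made up for the example, and a real measurement would need a proper pronouncing dictionary and some handling of unfamiliar words.

```python
from collections import defaultdict

# Toy lexicon of (spelling, pronunciation) pairs. Illustrative only:
# it reuses the "botte" and "pesca" examples from the post above.
LEXICON = [
    ("botte", "ˈbotte"),  # barrel
    ("botte", "ˈbɔtte"),  # hits
    ("pesca", "ˈpeska"),  # fishing
    ("pesca", "ˈpɛska"),  # peach
    ("sole", "ˈsole"),
    ("luna", "ˈluna"),
    ("mare", "ˈmare"),
]


def determinism(pairs):
    """Return (reading, writing) scores for a list of (spelling, sound) pairs.

    reading: share of distinct spellings that map to exactly one sound.
    writing: share of distinct sounds that map to exactly one spelling.
    """
    spelling_to_sounds = defaultdict(set)
    sound_to_spellings = defaultdict(set)
    for spelling, sound in pairs:
        spelling_to_sounds[spelling].add(sound)
        sound_to_spellings[sound].add(spelling)

    reading = sum(len(s) == 1 for s in spelling_to_sounds.values()) / len(spelling_to_sounds)
    writing = sum(len(s) == 1 for s in sound_to_spellings.values()) / len(sound_to_spellings)
    return reading, writing


reading_score, writing_score = determinism(LEXICON)
print(f"reading determinism: {reading_score:.2f}")
print(f"writing determinism: {writing_score:.2f}")
```

On this toy list, "botte" and "pesca" pull the reading score down while the writing score stays perfect, which is the same asymmetry described for Italian above. The numbers themselves mean nothing beyond the seven hand-picked entries.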
It's always a pleasure to read your posts on linguistic matters (I also liked your post [link] a lot).
I can confirm that Italian certainly does not get 100% on score (a). Beyond "e" and "o", "s" too represents two different sounds ([s,z]) and so does "z" ([ts,dz]), and the approximants [j] and [w] reuse the same symbols as [i] and [u] (even if [kw] has its own symbol: "qu"). There are also cases where one cannot be sure whether an "i" must be pronounced or not, and rarer cases involving "gl", so it's actually difficult to pronounce Italian correctly from the orthography alone. Years ago I wrote a program to check how phonetic Italian is and, surprisingly, only about 31% of word tokens in an Italian text can be pronounced exactly relying only on the orthography. This does not matter to an Italian: in most cases a word will be understood with either, say, an [s] or a [z]. But it can be confusing for a foreigner. My wife had a colleague from Kashmir who asked her how "s" must be pronounced, when it's [s] and when it's [z], and my wife could not help her; indeed, she had never realized before that "s" stands for two different sounds.
On score (b) we'll get a high percentage, but less than 99.9%. For example, in a word such as "scienza" (science) it's by no means clear why we need an 'i', which is superfluous and is pronounced by almost nobody (there are historical reasons, of course). And I'd argue for a spelling reform of 'scuola' and 'cuore' (school and heart) to 'squola' and 'quore', which is how they're actually pronounced by almost everybody, with [wɔ] instead of the diphthong [u.o]. Teachers in primary school would live a better life (not to speak of the kids!).
(13-01-2026, 10:48 PM)Mauro Wrote: "s" too represents two different sounds ([s,z])
But isn't there an algorithm that tells which is which, at least most of the time? Like, single "s" sounds like [z] between vowels, like [ʃ] when combined with "ci" or "ce", like [s] otherwise?
Quote:and so does "z" ([ts,dz])
Isn't it [ts] only when doubled? (Did I tell you about the import of "pizza" into Portuguese?)
As I told you before, the language we spoke at home was Venetian, and I learned Italian mostly by reading. So for most of my life I had the wrong stress on several words, like "ubriaco" and "eravamo"...
All the best, --stolfi
I suspect most languages do better on (a) (can guess how to pronounce) than (b) (can guess how to spell).
Even when the rules are clear (as in Dutch with verb conjugations ending with -d, -t or -dt) many native speakers have trouble with spelling.
English of course is notoriously bad on (a) as well as (b).
Thai has to be the worst when it comes to (b).
These two words sound exactly the same: ทำ ธรรม
(14-01-2026, 01:16 AM)ReneZ Wrote: I suspect most languages do better on (a) (can guess how to pronounce) than (b) (can guess how to spell).
What about German? For (a) there is the "ch" ambiguity. Is "ss"/"ß" a problem for (b)?
All the best, --stolfi
(14-01-2026, 01:44 AM)Jorge_Stolfi Wrote: For (a) there is the "ch" ambiguity.
Yes, /ç/ and /x/: [link]
For other special combinations of letters like "ei", "sch", "tz", "äu", I don't know if there is any reading ambiguity.
/ç/ can be written "ch" as in "mich" or "g" as in "schmutzig", so it also has a writing ambiguity problem (b). Then there are "v" and "f", also (b)...
(14-01-2026, 01:44 AM)Jorge_Stolfi Wrote: What about German? For (a) there is the "ch" ambiguity. Is "ss"/"ß" a problem for (b)?
There are certainly ambiguities, and they depend on the local variety / dialect.
And then in Switzerland everything is different. (No ß for example).
While German uses "au" for the sound in "How now brown cow", Dutch uses any of "ou", "ouw" or "auw".
"Kou", "kouw" and "kauw" all exist, sound the same, and mean something different.
German "kau" means the same as Dutch "kauw".
But we have gone way OT by now....
(14-01-2026, 12:20 AM)Jorge_Stolfi Wrote: (13-01-2026, 10:48 PM)Mauro Wrote: "s" too represents two different sounds ([s,z])
But isn't there an algorithm that tells which is which, at least most of the time? Like, single "s" sounds like [z] between vowels, like [ʃ] when combined with "ci" or "ce", like [s] otherwise?
Quote:and so does "z" ([ts,dz])
Isn't it [ts] only when doubled? (Did I tell you about the import of "pizza" into Portuguese?)
There are some rules, but they do not cover all the cases (and nobody knows them anyway). After putting into my software all the [s,z] rules I could find, there are still words where [s] cannot be told apart from [z] orthographically: in a typical text, about 4% of all word tokens. Same for "z", even when doubled: e.g. 'razza' (race) is [rattsa] but 'razzo' (rocket) is usually [raddzo].
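As an illustration only, the rule of thumb from the quote above can be written as a short Python sketch. The function name `guess_s` and the vowel set are assumptions made for this example. It is not the program mentioned in this post, and it encodes just the three cases from the quote: "sci"/"sce" gives [ʃ], a single "s" between vowels gives [z], anything else gives [s].

```python
# Vowel letters considered by this sketch (an assumption, accents included).
VOWELS = set("aeiouàèéìòù")


def guess_s(word: str, i: int) -> str:
    """Guess the sound of the letter 's' at index i of word.

    Implements only the three-case rule of thumb quoted above,
    not the full set of Italian pronunciation rules.
    """
    w = word.lower()
    if w[i] != "s":
        raise ValueError("no 's' at that position")

    # Case 1: "sci" / "sce" -> the s belongs to the [ʃ] sound.
    if w[i + 1:i + 3] in ("ci", "ce"):
        return "ʃ"

    # Case 2: a single s with a vowel on both sides -> [z].
    # A doubled "ss" never matches, because one neighbour is another s.
    prev_is_vowel = i > 0 and w[i - 1] in VOWELS
    next_is_vowel = i + 1 < len(w) and w[i + 1] in VOWELS
    if prev_is_vowel and next_is_vowel:
        return "z"

    # Case 3: everything else -> [s].
    return "s"


for word, index in [("casa", 2), ("sole", 0), ("rosso", 2), ("scienza", 0)]:
    print(word, index, guess_s(word, index))
```

A rule like this commits to a single answer per spelling, so it cannot account for words whose pronunciation is not fixed by the letters alone.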
We do not differentiate much between the two sounds [s,z] (and [ts,dz], [e,ɛ], [o,ɔ], in many cases at least), and the orthography reflects this, i.e. by collapsing [s,z] into the single glyph "s". Same for "z", "e" and "o". The problem is rather fundamental because the pronunciation changes from place to place, and I suspect it's always been so. E.g. 'casa' (house) is ['ka.za] in the North but ['ka.sa] in the South, and 'neve' (snow) is ['ne.ve] in the North but ['nɛ.ve] in the South. For an Italian it's hard to tell the difference: any Italian who has not studied a little phonetics is completely unaware that there are two "e" sounds, [e] and [ɛ]. In the single case where this is relevant (distinguishing 'and' from 'is'), what an Italian thinks is that [ɛ] is the stressed form of [e] (sic!). Ask a typical Italian and he'll tell you the language has five vowels, and he won't believe you if you tell him there are seven instead.