[Comments-kannada-oriya-telugu-08aug18] A quick review of the Odia proposal

Mon Oct 8 10:38:03 UTC 2018

- 2, “oḍiā”: Not an accurate transliteration as it doesn’t distinguish ଓଡିଆ and ଓଡ଼ିଆ.

- 3, “known in Unicode as Oriya”: It’s also known as “Oriya” everywhere else, not only in Unicode.

- 3, “Oriya script seems to be a variant of Devanāgarī … mahājani (trader's) script.”: Seriously? Is this copied from the Gujarati proposal? Odia doesn’t look like a variant of Devanagari to anyone who knows of the Bangla script.

- 3.1, “The diagram belowshows the major stages in the evolution of Oriya attesting its late divergence from Devanāgarī.”: Are the authors trying to ignore/deny Odia’s relationship with Bangla?

- 3.4: Why is the IPA of ଯ missing? The whole set of IPA transcriptions is apparently inaccurate, as it doesn’t reflect even some of the Odia language’s typical features (eg, the rounded schwa). Actually the proposal doesn’t need IPA transcriptions for every letter, because it’s text encoding that is being discussed. Lossless transliterations are much more useful. The whole Table 1 can be removed.

- 3.6: Authors are not using accurate transliterations.

- 3.7, “Half form of consonants (pre-base form)”: Pre-base forms don’t seem relevant to Odia discussions.

- 3.8, “… to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style.”: Inaccurate. At least the usage of nukta on dda and ddha is not related to Perso-Arabic words.

- 3.10, “/ãala/”: Either use a phonetic transcription (then the first syllable’s vowel is probably not /a/ and the consonant is not /l/), or use transliteration: am̐ḷā. An inaccurate transcription/transliteration is not helpful and is only confusing. Drop it or correct it.

- 3.11: Rendering failure of the second example (saṁkhyā).

- 3.12, Table 3: Clean up the duplicated dotted circles. Why is vocalic rr excluded but vocalic l and vocalic ll are included? Be consistent with the discussions in later sections.

- 5.2 and 5.2.1: Doesn’t U+0B35 ORIYA LETTER VA fall into “4.1.2.4 No Rare and Obsolete Characters“? Why is U+0B57 ORIYA AU LENGTH MARK excluded but U+0B56 ORIYA AI LENGTH MARK is included?

- 6.1, “… there are no characters/character sequences which can be created by using the Oriya characters permitted as per the [MSR] and look identical.”: **FALSE**. Odia has a serious problem with confusables because there are multiple ways of encoding the signs of ba and ya. Many fonts (eg, Nirmala UI) allow both <virama, U+0B2F ya> and <virama, U+0B5F yya> to form the post-base form of ya; and allow all of <virama, U+0B2C ba>, <virama, U+0B35 va>, and <virama, U+0B71 wa> to form the below-base form of ba. At the very least, this is a problem the proposal should’ve captured, and the NBGP failed to do so. These variants probably need to be proposed as “allocatable”. Also, Odia does have other natural ambiguities (ie, not caused by technical issues like the problem above) that need to be addressed. Note that many of them are stylistic and depend on what font is used to render the text; see [https://en.wikipedia.org/wiki/Odia_alphabet#Ambiguities](https://en.wikipedia.org/wiki/Odia_alphabet#Ambiguities)

- 7: For other reviewers’ reference: `C[N][M][B|D|X] | V[B|D|X] | C[N]H`
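
To make the 6.1 point concrete, here is a rough Python sketch that builds the sequences listed there and folds them to a single key, roughly the way a variant rule might. The base letter ka, the folding table, and the helper names `variant_key` and `describe` are my own illustrative assumptions, not anything taken from the proposal or the [MSR]; the grouping only mirrors the font behavior described above.

```python
import unicodedata

VIRAMA = "\u0B4D"  # ORIYA SIGN VIRAMA

# Hypothetical folding table: letters that, per the 6.1 comment, many fonts
# render identically when they follow a virama. Not an official variant set.
FOLD_AFTER_VIRAMA = {
    "\u0B2F": "\u0B2F",  # ya  -> ya
    "\u0B5F": "\u0B2F",  # yya -> ya
    "\u0B2C": "\u0B2C",  # ba  -> ba
    "\u0B35": "\u0B2C",  # va  -> ba
    "\u0B71": "\u0B2C",  # wa  -> ba
}


def variant_key(label: str) -> str:
    """Fold a letter that directly follows a virama to its group's representative."""
    out = []
    for i, ch in enumerate(label):
        if i > 0 and label[i - 1] == VIRAMA and ch in FOLD_AFTER_VIRAMA:
            out.append(FOLD_AFTER_VIRAMA[ch])
        else:
            out.append(ch)
    return "".join(out)


def describe(label: str) -> list[str]:
    """List the code points of a label with their Unicode character names."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in label]


KA = "\u0B15"  # arbitrary base consonant, chosen only for illustration
ya_forms = [KA + VIRAMA + c for c in ("\u0B2F", "\u0B5F")]
ba_forms = [KA + VIRAMA + c for c in ("\u0B2C", "\u0B35", "\u0B71")]

if __name__ == "__main__":
    for seq in ya_forms + ba_forms:
        print(describe(seq), "->", describe(variant_key(seq)))
```

The folding is applied only right after a virama, so standalone ya/yya or ba/va/wa stay distinct. Whether the variants should be blocked or allocatable is a policy decision the sketch doesn't try to answer.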

Best,
梁海 Liang Hai
https://lianghai.github.io
