[Comments-kannada-oriya-telugu-08aug18] A quick review of the Telugu proposal

梁海 Liang Hai lianghai at gmail.com
Mon Oct 8 12:17:05 UTC 2018

- 2, “telɯgɯ”: This is probably a phonetic transcription, not the accurate transliteration that should be used in this document.

- 3.5, “… and 16 dependent signs”: 15.

- 3.5.1: Vocalic l should be categorized with vocalic rr and vocalic ll. Transliteration of vocalic ll is wrong.

- 3.5.1, R1, “ca= a consonant with an inherent ‘a’”: When discussing text encoding, Indic consonants naturally carry an inherent vowel. Try to distinguish among phonetic sequences, written forms, and encoded character sequences. The 3 lines under R1 are not helpful.

- 3.5.3: The introduction of arasunna usage is unclear. Is it commonly used today or not?

- 4.1: Good to see the usage of ZWNJ to be clearly introduced here.

- 4.2: Unclear how the common case of ZWNJ usage is to be dealt with by domain name applicants. Will the applicant be allowed to use ZWNJ? If not, then it’s not particularly clear how it was decided to forbid ZWNJ, given the strong and unambiguous usage of it.

- 5: Is it appropriate to exclude U+0C58 tsa and U+0C59 dza?

- 5.2: Apparently U+0C44 TELUGU VOWEL SIGN VOCALIC RR should be excluded.

- 5.3, Various signs: The description doesn’t make sense. These two characters should be excluded because they’re part of vowel signs that are already atomically encoded and included. U+0C55 is not meant for encoding hā (unless it’s decided other irregularly written consonant–vowel structures need to be encoded visually too).

- 5.3, Historic phonetic variants: Unclear why “Phonological variants shall not be permitted”.

- 6, “There are no characters in the Telugu Unicode chart that either in simple form or in combined form are deemed similar by NBGP.”: Should mention the precondition of WLE.

- 6.1: There shouldn’t be a disposition of “blocked” in the table (or anywhere), because these aren’t even proposed as variants. The examples i and ii as well as the table are all very confusing. Vowel signs o and oo need to be discussed separately as they have different behavior. See [https://www.unicode.org/L2/L2014/14005-telugu-kannada-vs-o-oo--UTN.pdf](https://www.unicode.org/L2/L2014/14005-telugu-kannada-vs-o-oo--UTN.pdf) for a better introduction. Also, హా should be introduced in this section too, although it’s not proposed either (because of an excluded character).

- 6.2: Inappropriate restriction. This is like restricting one between “colour” and “color” because they’re alternative spellings. “This can be disallowed by the WLE rule: H cannot follow a nasal consonant.” — Inappropriate rule, as the authors didn’t even consider geminated nasal consonants (eg, in కన్నడ kannaḍa).

- 6.4.1: Inappropriate analysis. Confusable standalone letters don’t necessarily mean confusable contextual forms (eg, vattu forms can have different ascending behavior and different positioning behavior and different letterforms). Also it’s unclear if the authors have examined contextual forms independently from the similarity of standalone letters. Further, many contextual forms are predictable from a small context (eg, one can tell a vattu exists in well-formed text as long as a consonant is preceded by virama), thus it’s not necessary to enumerate akshars and overload the variant set. (Table 16 is a duplication of Table 10.)

- 6.4.2, “NBGP concludes …”: Given the weak analysis above, it’s hard to believe what NBGP concludes now.

- 7: A comprehensible pattern for other reviewers to refer to: `C[M][B|X] | V[B|X] | CH` (consonant clusters analyzed as a consonant preceded by one or more `CH` occurrences).

- 7, Rule 5: Inappropriate and over-restrictive rule. See my comment above for §6.2.

- 7, Rule 6: Unnecessarily restrictive rule. “… perceptually dissimilar but phonetically and semantically similarity between the two labels” is reason enough to allow such usage. NBGP doesn’t have the right to force the public to abandon preferred spelling conventions.
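
To make the §7 pattern and the §6.2 / Rule 5 objection concrete, here is a rough Python sketch. It reads `C[M][B|X] | V[B|X] | CH` as a repeatable unit and separately tests the quoted rule “H cannot follow a nasal consonant”. The character classes are my assumption: they are taken broadly from the Unicode Telugu block and are not the proposal’s exact repertoire, and the function names are made up for illustration only.

```python
import re

# Assumed character classes (broad Unicode Telugu ranges, NOT the
# proposal's actual repertoire, which excludes some of these).
C = "\u0C15-\u0C28\u0C2A-\u0C39"               # consonants KA..HA
V = "\u0C05-\u0C0C\u0C0E-\u0C10\u0C12-\u0C14"  # independent vowels
M = "\u0C3E-\u0C44\u0C46-\u0C48\u0C4A-\u0C4C"  # dependent vowel signs
B = "\u0C02"                                   # anusvara
X = "\u0C03"                                   # visarga
H = "\u0C4D"                                   # virama
NASALS = "\u0C19\u0C1E\u0C23\u0C28\u0C2E"      # nga, nya, nna, na, ma

# One unit of the pattern: CH | C[M][B|X] | V[B|X]
UNIT = f"(?:[{C}]{H}|[{C}][{M}]?[{B}{X}]?|[{V}][{B}{X}]?)"
LABEL = re.compile(f"{UNIT}+")

# The rule quoted in 6.2: virama directly after a nasal consonant.
NASAL_H = re.compile(f"[{NASALS}]{H}")


def matches_pattern(label: str) -> bool:
    """True if the whole label is a sequence of pattern units."""
    return LABEL.fullmatch(label) is not None


def violates_nasal_h_rule(label: str) -> bool:
    """True if a nasal consonant is immediately followed by virama."""
    return NASAL_H.search(label) is not None


kannada = "\u0C15\u0C28\u0C4D\u0C28\u0C21"  # కన్నడ kannaḍa
print(matches_pattern(kannada))        # well-formed under the pattern
print(violates_nasal_h_rule(kannada))  # yet caught by the nasal + H rule
```

Under these assumptions కన్నడ is well-formed by the general pattern but is rejected by the nasal + H rule, because the geminate న్న is encoded as na, virama, na.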

