From lianghai at gmail.com Mon Oct 8 05:18:30 2018
From: lianghai at gmail.com (梁海 Liang Hai)
Date: Mon, 8 Oct 2018 13:18:30 +0800
Subject: [Comments-devanagari-gurmukhi-gujarati-scripts-lgr-27jul18] A quick review of the Devanagari proposal
Message-ID:

- §2, “Latin transliteration of native script name: dēvanāgarī”: Use a consistent transliteration scheme throughout the document.
- §3.3.1, footnote 5: “/a/ would be misunderstood” only because the authors don’t try to use consistent transliterations.
- §3.3.2, “However, the notion of maximum number of consonants joining to form one akshar is empirical”: Good. Such sensible statements are rarely seen.
- §3.3.3, Table 5: The vowel set seems sketchy. It doesn’t make sense to include the letter and sign of vocalic rr but exclude vocalic l and ll. It doesn’t make sense to include the letters and signs of oe, ooe, aw, ue, and uue (presumably all for Kashmiri), but exclude short e and short o (which are also required by Kashmiri).
- §3.3.4: A typical confusion between the grapheme bindu and the phoneme anusvara (note the grapheme bindu/anusvara often represents a phonetic nasalization/anunasika in Hindi, but is encoded as bindu). It arises when seemingly well-understood orthography is introduced without regard to the context of discussing text encoding. The section also over-emphasizes the orthographic features of certain languages and writing systems. For this document’s purposes, bindu/anusvara is just a sign representing a certain nasal feature.
- §3.3.6, “… to represent sounds found only in words borrowed from Perso-Arabic”: Not true. Nukta is used for sounds (both languages’ native sounds and loanword sounds of Perso-Arabic, English, and other origins) that can’t be represented by the original set of graphemes in Devanagari. If the authors can’t figure out a good summary to open a section with, the section should simply start with an introductory sentence such as “Something has the following functions:”.
- §3.3.6, “??? 
/b?dh/”: Use a decent transliteration or phonetic transcription.
- §3.3.8, “Earlier the ZWJ was recommended … However, with the new recommendations in place, this usage of ZWJ is now not encouraged.”: Unclear where this observation comes from. The Unicode Standard Core Specification currently doesn’t state a preference between the two encodings.
- §4.1.2.4: Make §3.3.3, Table 5 consistent with this consideration and §5.2. The authors seem to have a hard time figuring out how to deal with the duplicated information between §3.3 and §4/§5. I suggest §3.3 should only include encoding-agnostic information.
- §5.2, Table 6: Should note that the “Indic syllabic category” column is not about the Unicode character property of the same name.
- §5.2, Table 6, row 67: Wrong glyph and name.
- §5.5, “… in the form of variables”: These are not variables but notation.
- §6, “There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike.”: Not true. First, WLE is also required to prevent confusables (eg, vowel letter aa vs ). Also, even with the WLE, the case of anusvara following a candra shape (part of vowel letters candra e, candra a, and candra o, as well as vowel signs candra e and candra o) should be examined, eg, Marathi ??? (bank) and Hindi ???? ???? (Hong Kong) can be encoded with either candrabindu or and rendered the same in major fonts (and actually the latter encoding might be semantically preferred by many users, thus might even lead to an “allocatable” disposition).
- §6.1, Table 16: Glyphs should be manually drawn to better illustrate the proper rendering.
- §6.4: Just a feeling, the disposition of “blocked” might be too restrictive.
- §6.5, Table 19: Variants between Devanagari and Bengali don’t seem even close to being as complete as the Gurmukhi ones. Where are Bengali candrabindu, nukta, vowel sign aa, vowel sign ii, vowel sign u, virama, and certain consonant letters?
- §7: A comprehensible pattern for other reviewers’ reference: `C[N][M[N]][B|D|X] | V[N][B|D|X] | C[N]H`
- §7, Case of Eyelash Reph: Unclear what reason 2 means.
- §7, Case of V preceded by H: This is too restrictive.

Best,
梁海 Liang Hai
https://lianghai.github.io

From lianghai at gmail.com Mon Oct 8 08:09:03 2018
From: lianghai at gmail.com (梁海 Liang Hai)
Date: Mon, 8 Oct 2018 16:09:03 +0800
Subject: [Comments-devanagari-gurmukhi-gujarati-scripts-lgr-27jul18] A quick review of the Gurmukhi proposal
Message-ID:

- §3, “… but it has now been established, on the basis of its name, that the Indians did have a system of writing which must have been borrowed freely from local script.”: How is this (and the following two paragraphs, and the whole of §3.1) even relevant to the LGR proposal? The authors should look for a proper place to publish their history research.
- §3.3, “… ligatures are formed only with following /h, r and v/ consonants.”: Has the well-known post-base form of ya already fallen out of use in common text? Probably should mention this.
- §3.3.2, “Unlike Devanagari, Gurmukhi consonants are also used to represent consonant sounds where / ? / is not included in them.”: Both Hindi and Punjabi-Gurmukhi orthographies allow implicit dead consonants. It’s just that Punjabi-Gurmukhi allows more. This level of spelling and reading rules is not really relevant to the proposal. An encoded pure killer (virama/halant) is only used when the mark or its conjunct-forming effect visually exists.
- §3.3.2, “In Gurmukhi, virama “੍” (U+0A4D) is used in place of halant "्" (U+094D)”: This sentence only brings confusion. U+094D as a Devanagari-specific character has nothing to do with Gurmukhi. Are the authors going to clarify such relationships between other cognate graphemes too?
- §3.3.2, “In Gurmukhi, virama is not used with any consonant that represents only the consonant sound instead of consonant plus vowel sound”: Rewrite to “The grapheme of virama is not used in Punjabi text to strip a consonant letter’s implicit vowel.”
- §3.3.4: “Suprasegmental” is not an appropriate term here, since at least gemination is segmental. Also, according to §3.3.4.1 and §3.3.4.2, the nasality is not purely nasalization of vowels but also covers segmental nasal consonants.
- §3.3.4.2, rule 1: The detailed phonetic spelling logic (eg, “… the forms of u, uu vowels after any other vowel …”) is not really relevant to text encoding.
- §3.3.4.3, “In these letters, NGA (ਙ) and NYA (ਞ) are nasal consonants so these are stressed or doubled by the nasal sign tippi.”: Suspicious explanation. What about na and ma then?
- §3.3.4.4, “But in Gurmukhi, these letters can also be written as a single unit …”: There’s a difference between writing and encoding.
- §3.3.5, “Some of the character combinations … are encoded using ZWJ and ZWNJ.”: How are multiple-vowel-sign clusters encoded using ZWJ/ZWNJ?
- §4.1.3: Visarga is used for marking abbreviations according to §3.3.4.5. Need to clarify this either in this section or in §4.1.3.
- §4.1.6, “These characters can occur as single character words, but in TLD, single character labels are not allowed, so these letters will not be added.”: Should introduce and better discuss their usage in “single character words”, as those words can presumably appear in multi-word labels too.
- §4.1.6: Also, since a/aira is also a vowel carrier, the section needs to be worded more accurately.
- §5.3, “It is very easy for a native language speaker to count the number of syllables in a sequence”: Don’t exaggerate. The split between phonetic syllables and orthographic syllables in Indic scripts often makes it confusing for native users to count a certain type of syllable.
- §5.3, “The definition is a combination of 2 rules”: Similar streamlined rules/patterns should be included in the corresponding sections of the other scripts’ LGR proposals. Also, the “{CH}” part in the pattern is worth considering by the authors of the other proposals.
- §5.3, 3rd table, row 2, “Zero or one Consonant + Virama/Addak sequence followed by consonant is a syllable”: `CA` is a preceding orthographic syllable and is not relevant to this rule. The rule above the table is not even consistent with the original introduction.
- §5.3, “Examples of combination of the rules”, “2. ਪਰਿੰਦਾ (parindā)”: The authors keep mixing up phonetic structures and written structures. There’s no V (already defined as independent vowel letters) in this word. It’s CCMDCM. Same problem in “3. ਅੰਦਰ (andar)”: it is VDCC, what are “Vm” and “CvC”?!
- §7: A comprehensible pattern for other reviewers to consider: `[ C[N]{HC}[M] | V ] [A|B|D]`
- §7.6: Probably too restrictive as this is about spelling conventions (note ? and ? are already special cases, and there can be more). It’s not future-proof to limit the usage when there are no confusability issues.

Best,
梁海 Liang Hai
https://lianghai.github.io

From lianghai at gmail.com Mon Oct 8 09:02:47 2018
From: lianghai at gmail.com (梁海 Liang Hai)
Date: Mon, 8 Oct 2018 17:02:47 +0800
Subject: [Comments-devanagari-gurmukhi-gujarati-scripts-lgr-27jul18] A quick review of the Gujarati proposal
Message-ID: <52F43128-46E5-4044-929F-546CCBB2219B@gmail.com>

- 2, “gujarātī”: Use a consistent transliteration scheme throughout the document.
- 3.4.4: The spelling alternation is not relevant. Both functions are representations of a nasal sound.
- 5.2: Why are U+0A8C GUJARATI LETTER VOCALIC L and U+0AC4 GUJARATI VOWEL SIGN VOCALIC RR included? Don’t they belong to the same category as the excluded letters vocalic rr and vocalic ll?
- 5.5: It’s actually just as simple as: `C[N][M][B|X] | V[B|X] | C[N]H` (consonant clusters can be broken down into multiple preceding occurrences of `C[N]H`, when the exact rendering of a cluster is not the discussion’s concern).
- 6, “There are no characters/character sequences in Gujarati, which can be created by using the characters permitted as per the [MSR] and look exactly alike.”: Should be MSR and WLE (which restricts the cluster structure, preventing sequences like `VM`).

Best,
梁海 Liang Hai
https://lianghai.github.io
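
For other reviewers who want to try the Gujarati pattern from the 5.5 comment above (`C[N][M][B|X] | V[B|X] | C[N]H`) against sample strings, here is a rough Python sketch. It is an illustration under assumptions, not the proposal's WLE: the class ranges are coarse guesses taken from the Unicode Gujarati block rather than from the proposal's repertoire table, a label is assumed to be one or more consecutive clusters, and the names `CLASS_RANGES`, `classify`, and `is_well_formed` are made up for this example.

```python
import re

# ASSUMPTION: coarse class ranges from the Unicode Gujarati block, for
# illustration only. The authoritative repertoire is whatever the LGR
# proposal's code point table lists, which may be narrower.
CLASS_RANGES = {
    "C": [(0x0A95, 0x0AB9)],  # consonant letters
    "V": [(0x0A85, 0x0A94)],  # independent vowel letters
    "M": [(0x0ABE, 0x0ACC)],  # dependent vowel signs (matras)
    "N": [(0x0ABC, 0x0ABC)],  # nukta
    "H": [(0x0ACD, 0x0ACD)],  # virama (halant)
    "B": [(0x0A82, 0x0A82)],  # anusvara
    "X": [(0x0A83, 0x0A83)],  # visarga
}

# The reviewer's pattern, written over class letters:
#   C[N][M][B|X] | V[B|X] | C[N]H
# ASSUMPTION: a label is one or more such clusters in a row.
CLUSTERS = re.compile(r"(?:CN?M?[BX]?|V[BX]?|CN?H)+")


def classify(label: str) -> str:
    """Map each code point to its class letter, or '?' if it has none."""
    letters = []
    for ch in label:
        cp = ord(ch)
        for name, ranges in CLASS_RANGES.items():
            if any(lo <= cp <= hi for lo, hi in ranges):
                letters.append(name)
                break
        else:
            letters.append("?")
    return "".join(letters)


def is_well_formed(label: str) -> bool:
    """True if the whole label is a sequence of clusters from the pattern."""
    return CLUSTERS.fullmatch(classify(label)) is not None


if __name__ == "__main__":
    samples = {
        "KA + vowel sign AA": "\u0A95\u0ABE",
        "KA + virama + SSA": "\u0A95\u0ACD\u0AB7",
        "letter A + vowel sign AA": "\u0A85\u0ABE",
    }
    for name, text in samples.items():
        print(name, classify(text), is_well_formed(text))
```

Mapping to class letters first keeps the regular expression identical in shape to the notation in the comment, so the same approach should carry over to the Devanagari and Gurmukhi patterns by swapping the ranges and the expression. Note that a vowel letter followed by a vowel sign classifies as `VM` and is rejected, which is the kind of sequence the comment on section 6 says the WLE prevents.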