[Neobrahmigp] Inputs from the Devanagari Nepali and Newar Languages for the Devanagari LGR report

Akshat Joshi akshatj at cdac.in
Mon Aug 7 08:16:33 UTC 2017


Hello Dr. Bal and our Nepal team,

Thanks for putting together the document for Nepali and Newar languages 
for Devanagari script.

Following are my observations:

1. Code point repertoire for the Devanagari Nepali and Newar languages

     - As far as I can see, the currently shared Devanagari LGR already
covers all the characters required by Nepali and Newar. Please let me
know if you find any discrepancy. Also, it would be great if you could
cite additional references in the last column, "References", showing
each character in everyday use. Different characters may cite
different references.

     - Also, as required, the 0931 (Ra Nukta) has been contextually 
permitted to form eyelash reph only.
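
As an illustration of what such a contextual restriction could look
like, here is a minimal Python sketch. It assumes that "eyelash reph
only" translates to "every U+0931 must be immediately followed by the
halant U+094D"; the function name rra_context_ok is hypothetical, and
the actual LGR rule may be worded differently.

```python
RRA = "\u0931"     # DEVANAGARI LETTER RRA (called "Ra Nukta" in this mail)
HALANT = "\u094D"  # DEVANAGARI SIGN VIRAMA


def rra_context_ok(label: str) -> bool:
    """Return True if every U+0931 is immediately followed by a halant.

    Assumption: this is a simplified stand-in for the LGR context rule,
    not the rule text itself.
    """
    for i, ch in enumerate(label):
        if ch == RRA:
            # U+0931 at the end of the label, or followed by anything
            # other than a halant, is rejected.
            if i + 1 >= len(label) or label[i + 1] != HALANT:
                return False
    return True
```

With this check, U+0931 U+094D U+092F passes, while U+0931 followed by
a vowel sign, or U+0931 at the end of a label, is rejected.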

2. Composite characters – Confusingly similar shapes

     - Similar-looking cases that are merely confusions on the part of
the user may not form part of the Devanagari variants. As per the LGR
procedure, such cases fall to the "String Similarity Assessment"
panel. As discussed in the Kathmandu meeting, I can definitely put them
in an Appendix, with a note asking the string similarity panel to take
them into account as an official recommendation from the NBGP.

     Having said that, I have contradicted myself by including some of
the "confusingly similar" cases in the variant recommendations for the
LGR. These are the Santhali combinations in which Nukta is expected to
appear with certain Vowels and Vowel signs. They are unique cases
because the non-Santhali user base of Devanagari (which is the major
part of it) may not expect a Nukta at those locations at all. Such
users may construe these instances as stylistic variants or rendering
problems, and so would not raise an alarm. The point is that these are
not mere visual similarity cases, as they involve a cognitive lapse.
This makes them worth citing explicitly as variants.

     - Regarding similarity based on fonts:

     I would request you to refer to our mail discussion of 28th July
'17 on this topic.

     - Regarding similarity between र +  ्  + इ and ई:

     This is already being barred by our context rules which are based 
on earlier work done by C-DAC for .bharat domain names.
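
The context rules themselves are not reproduced in this mail. As a
rough sketch only, the Python below assumes the relevant rule is "an
independent vowel may not directly follow a halant", which would
reject the sequence U+0930 U+094D U+0907. The names and the vowel
range used here are illustrative assumptions, not the .bharat rules.

```python
HALANT = "\u094D"  # DEVANAGARI SIGN VIRAMA


def is_independent_vowel(ch: str) -> bool:
    # Simplified range U+0905..U+0914, used here for illustration only.
    return "\u0905" <= ch <= "\u0914"


def halant_vowel_ok(label: str) -> bool:
    """Return False if any halant is directly followed by an independent vowel."""
    # Walk over adjacent code point pairs in the label.
    return not any(
        prev == HALANT and is_independent_vowel(nxt)
        for prev, nxt in zip(label, label[1:])
    )
```

Under this assumption the sequence Ra + halant + I is rejected, while
the single code point U+0908 (II) remains valid.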

3. Homophonic variants

     - As rightly pointed out in your document, these rules may not
apply uniformly to all the languages using Devanagari. Most of the
suggestions under this section are spelling norms, which are not what
we are aiming to capture through LGR creation. An analogy from English:
the norm that no three consonants can come together to form a
meaningful word does not stop fli*ckr* from being a domain name widely
accepted and used by the Internet community. In the same spirit, we
will refrain from going into "spelling norms". Also, not all spelling
norms are algorithmically predictable, and they vary a great deal
across the community.

     As far as the variant aspect of such words is concerned, there are
two points.

     - As per the classical approach of the domain name system, such
cases are not treated as variants, as their appearance is completely
different, e.g. color vs colour.

     - Also, even though it may appear that such cases can be
algorithmically predicted from the varga classification in Brahmi,
they differ across linguistic communities. The varga classification,
with its final nasal consonant, is a perfect system in itself for
predicting nasalization and conjunct behavior in words; however, that
is not how it has come down into popular usage across the communities.
The point is that these cases cannot be algorithmically predicted,
which is a basic requirement under the LGR procedure.

     Regarding Halant ending words:

     This we can accommodate, as the ending halant in many cases is not
clearly visible. Just like the Santhali variant cases, these can be
missed by users who do not expect them to be there. I request feedback
from all on this.
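
To make the halant proposal concrete for discussion, here is a small
Python sketch. It assumes that two labels would be variants of each
other when they differ only by a single trailing halant (U+094D). The
helper names are hypothetical, and this is not an agreed rule.

```python
HALANT = "\u094D"  # DEVANAGARI SIGN VIRAMA


def variant_key(label: str) -> str:
    """Strip one trailing halant so both spellings share the same key."""
    return label[:-1] if label.endswith(HALANT) else label


def are_halant_variants(a: str, b: str) -> bool:
    """True if a and b are distinct labels that differ only by a final halant."""
    return a != b and variant_key(a) == variant_key(b)
```

For example, U+0915 U+092E and U+0915 U+092E U+094D would be reported
as variants, while two unrelated labels would not.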


These are my views. Please feel free to discuss further on these points.

Regards,

Akshat Joshi



On 03-08-2017 12:06, Bal Krishna Bal wrote:
> Hello Akshat and All,
> Please find attached the inputs from the Devanagari Nepali and Newar 
> Languages for the Devanagari LGR Report.
> Regards,
> Bal Krishna
>

-- 
Regards,
Akshat Joshi
C-DAC GIST

