[arabic-vip] Day 1 proceedings summary

Raed Al-Fayez rfayez at citc.gov.sa
Sun Sep 18 09:59:43 UTC 2011


I agree with Siavash's point regarding U+06A0, U+06A4 and U+06A8. Even in the Tahoma font (the default in Linux OS for the Arabic language) they are confusingly similar: [cid:image002.jpg at 01CC7602.D52FE600]

Note: the letter U+075E may have the same issue.



I suggest that we either expand this and check the similarity between any character based on AIN (ع) with zero or more dots above and any character based on FEH (ف) with one or more dots above, or else limit ourselves to this one case.





For example:



[cid:image004.jpg at 01CC7602.D52FE600]



[cid:image009.jpg at 01CC7602.D52FE600]



etc.
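
As a rough starting point for such a check, the candidate code points can be listed from their Unicode character names. This is only a sketch: it uses Python's standard unicodedata module, the helper name letters_named is made up for illustration, and matching on the words AIN/GHAIN and FEH/VEH/QAF in the character name is my assumption about what counts as "similar to" each base letter. It says nothing about how a given font renders the medial forms, so the visual comparison still has to be done by eye.

```python
import unicodedata

# Blocks to scan: Arabic (0600-06FF) and Arabic Supplement (0750-077F).
BLOCKS = [(0x0600, 0x06FF), (0x0750, 0x077F)]

def letters_named(*words):
    """Return {code point: name} for Arabic letters whose name contains any of `words`."""
    wanted = set(words)
    found = {}
    for start, end in BLOCKS:
        for cp in range(start, end + 1):
            name = unicodedata.name(chr(cp), "")
            if not name.startswith("ARABIC LETTER "):
                continue
            # Compare whole words so that e.g. "AIN" does not match inside "GHAIN".
            if wanted & set(name.split()):
                found[cp] = name
    return found

# Assumed grouping: GHAIN counts as AIN with one dot; VEH and QAF share the FEH-like body.
ain_like = letters_named("AIN", "GHAIN")
feh_like = letters_named("FEH", "VEH", "QAF")

for title, group in (("AIN-like", ain_like), ("FEH/QAF-like", feh_like)):
    print(title)
    for cp, name in sorted(group.items()):
        print(f"  U+{cp:04X}  {name}")
```

Each pair drawn from the two lists would then be a candidate for the side-by-side rendering test shown in the images above.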





With best regards,

Raed I. Al-Fayez
------------------------------------------
Senior IT Projects Specialist, M.Sc, PMP
Saudi Network Information Center (SaudiNIC)
Communications and Information Technology Commission (CITC)
Tel: +966-1-2639235 - Fax: +966-1-2639393
http://www.nic.net.sa





-----Original Message-----
From: arabic-vip-bounces at icann.org [mailto:arabic-vip-bounces at icann.org] On Behalf Of Siavash Shahshahani
Sent: Friday, September 16, 2011 3:20 PM
To: Sarmad Hussain
Cc: arabic-vip at icann.org
Subject: Re: [arabic-vip] Day 1 proceedings summary



Hi Sarmad and All,

It seems to me that we missed one strange case in our variant character list: the character U+06A0 has a medial form 'almost' the same as the medial forms of U+06A4 and U+06A8. I say 'almost' because, if rendered large enough in DejaVu Sans, the slight difference is noticeable, but in the small print used to type in domain names they are almost certain to be mistaken for one another. BTW, both U+06A0 and U+06A4 are used in Jawi.

Regards,

Siavash





On Wed, 14 Sep 2011 07:36:40 -0700, "Sarmad Hussain" <sarmad.hussain at kics.edu.pk> wrote:

> Dear All,
>
> The following is a summary of the salient points from our face-to-face
> interaction yesterday (details will be captured in the document being
> revised):

>
> A.  General principles:
>
> 1.  Agreed to talk generally about the TLD space, without distinguishing
>     between ccTLDs and gTLDs (and to specify where our recommendations or
>     comments may diverge)
>
> 2.  Agreed to limit the scope to TLDs (not second- or other-level labels),
>     unless the recommendations apply to all levels (in which case this
>     should be made explicit)
>
> 3.  Though the committee is generally confident in the recommendations,
>     some issues may be discussed with representatives of language
>     communities not represented on the committee (e.g. use of Arabic
>     script in African languages)

>
> B.  The meeting started with a discussion of the character set allowed
>     for TLDs, and the following was agreed:
>
> 1.  Even though there may be some policy to restrict the use of ZWNJ in
>     TLDs, the committee felt that, due to its use in Arabic script, there
>     may be a need for ZWNJ by the community (even though there may be
>     limited use at this time)
> 2.  ZWJ is not needed in Arabic script
> 3.  0610-061A: an issue, as they are PVALID but should not be allowed
>     for TLDs
> 4.  0621-063F: OK, PVALID and needed for TLDs
> 5.  0641-064A: OK, PVALID and needed for TLDs
> 6.  064B-0659: an issue, as they are PVALID but should not be allowed
>     for TLDs
> 7.  065A-065F: an issue, as they are PVALID but should not be allowed
>     for TLDs
> 8.  A general rule may be extracted that combining marks are not allowed
>     for TLDs (but see A.3, regarding combining marks for African
>     languages, etc., if they limit the language in question)
> 9.  0660-0669: an issue, as they are PVALID but should not be allowed
>     for TLDs because they are digits
> 10. 066E-066F: an issue, as they are PVALID but should not be allowed
>     for TLDs because they are archaic
> 11. 0670: an issue, as it is PVALID but should not be allowed for TLDs
> 12. 0679-06D3: OK, PVALID and needed for TLDs
> 13. 06D5: OK, PVALID and needed for TLDs
> 14. 06D6-06DC: an issue, as they are PVALID but should not be allowed
>     for TLDs
> 15. 06DF-06E8: an issue, as they are PVALID but should not be allowed
>     for TLDs
> 16. 06EA-06ED: an issue, as they are PVALID but should not be allowed
>     for TLDs
> 17. 06EE-06EF: OK, PVALID and needed for TLDs
> 18. 06F0-06F9: an issue, as they are PVALID but should not be allowed
>     for TLDs because they are digits
> 19. 06FA-06FF: OK, PVALID and needed for TLDs
> 20. 0750-077F: OK, PVALID and needed for TLDs
> 21. FE73: an issue, as it is PVALID but should not be allowed in any
>     label (TLDs and other labels)
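
To make the list above easier to apply, here is a small sketch that encodes the ranges marked "OK" in B.4 to B.20 and tests a label against them. The names TLD_OK_RANGES and disallowed_code_points are made up for illustration. ZWNJ (B.1) is left out because the summary only says it may be needed, and the ranges are copied as written in the summary without re-checking the IDNA2008 status of each code point.

```python
# Ranges marked "OK, PVALID and needed for TLDs" in items B.4-B.20 of the summary.
TLD_OK_RANGES = [
    (0x0621, 0x063F),
    (0x0641, 0x064A),
    (0x0679, 0x06D3),
    (0x06D5, 0x06D5),
    (0x06EE, 0x06EF),
    (0x06FA, 0x06FF),
    (0x0750, 0x077F),
]

def disallowed_code_points(label):
    """Return the code points of `label` that fall outside TLD_OK_RANGES."""
    return [
        f"U+{ord(ch):04X}"
        for ch in label
        if not any(lo <= ord(ch) <= hi for lo, hi in TLD_OK_RANGES)
    ]

# U+0645 U+0635 U+0631: plain letters, all inside the OK ranges.
print(disallowed_code_points("\u0645\u0635\u0631"))        # []
# Same idea with FATHA (U+064E, a combining mark) and a digit (U+0661) added.
print(disallowed_code_points("\u0645\u064E\u0635\u0661"))  # ['U+064E', 'U+0661']
```

An empty result only means the label stays inside the agreed repertoire; it does not cover the variant questions in section C.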

>
> C.  The following was discussed regarding variants:
>
> 1.  There may be four categories: identical, confusingly similar,
>     optional and interchangeable.  Refer to the tables in the document
>     for the following additional observations:
>
> 2.  For identical
>     a.  Kaf set - limit to one at TLD level; all are possible for TLD
>         registration (none preferred over another; depends on registrant
>         request)
>     b.  Hay set - limit to one at TLD level; all are possible for TLD
>         registration (none preferred over another; depends on registrant
>         request)
>     c.  Yay set - limit to one at TLD level; all are possible for TLD
>         registration (none preferred over another; depends on registrant
>         request)
>     d.  Fay set - limit to one at TLD level; all are possible for TLD
>         registration (none preferred over another; depends on registrant
>         request)
>     e.  Tay marbuta - limit to one at TLD level; all are possible for TLD
>         registration (none preferred over another; depends on registrant
>         request)
>     f.  Hay hamza - limit to one at TLD level; all are possible for TLD
>         registration (none preferred over another; depends on registrant
>         request)
>     g.  Theh group - limit to one at TLD level; all are possible for TLD
>         registration (none preferred over another; depends on registrant
>         request) (confusable with pay, not Thay)
>
> 3.  For similar
>     a.  Kaf set - OK
>     b.  Yay set - OK
>     c.  Alif Hamza above set - OK
>     d.  Alif Hamza below set - OK
>     e.  Dot orientation: could be variants, so it is an issue, but it
>         should be investigated further with feedback from the relevant
>         language communities (not represented on the committee) for
>         further resolution.
>
> 4.  Interchangeable
>     a.  Alifs (simple, with hamza, with madda): not variants, though may
>         be confusable; issue to be raised
>     b.  Tay marbuta and hay: not variants, though may be confusable;
>         issue to be raised
>
> 5.  Other
>     a.  Digits have variants, though not relevant for TLDs
>     b.  The ZWNJ case causes variants in labels with the three characters
>         mentioned.  It should not be allowed with these three
>         characters, in addition to the existing rule
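
One possible reading of "limit to one at TLD level" in C.2 is that, once a label is registered, any label that differs from it only by members of the same variant set is blocked. The sketch below illustrates that reading only. The set contents are placeholders of my own (KAF U+0643 with KEHEH U+06A9, and YEH U+064A with FARSI YEH U+06CC), not the tables in the committee's document, and the names VARIANT_SETS, variant_key and conflicts are hypothetical.

```python
# Placeholder variant sets (assumed for illustration, NOT the committee's tables).
VARIANT_SETS = [
    {"\u0643", "\u06A9"},  # ARABIC LETTER KAF, ARABIC LETTER KEHEH
    {"\u064A", "\u06CC"},  # ARABIC LETTER YEH, ARABIC LETTER FARSI YEH
]

# Map every member of a set to one arbitrary but fixed representative.
# The representative is used only for comparison, never as a "preferred" form.
REPRESENTATIVE = {ch: min(group) for group in VARIANT_SETS for ch in group}

def variant_key(label):
    """Collapse each character of `label` to its set representative."""
    return "".join(REPRESENTATIVE.get(ch, ch) for ch in label)

def conflicts(candidate, registered):
    """Return the already registered labels that are variants of `candidate`."""
    key = variant_key(candidate)
    return [label for label in registered if variant_key(label) == key]

registered = ["\u0643\u062A\u0627\u0628"]                  # written with KAF
# The same string written with KEHEH collides with the registered label.
print(conflicts("\u06A9\u062A\u0627\u0628", registered) == registered)  # True
# An unrelated label does not collide.
print(conflicts("\u0628\u0627\u0628", registered))          # []
```

Because the comparison key is symmetric, whichever member of a set is requested first can be registered, which matches the "none preferred over another" note.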

>
> Regards,
> Sarmad
>
> ----
> سرمد حسین
>
> Sarmad Hussain
> Professor and Head
> Center for Language Engineering (www.cle.org.pk)
> Al-Khawarizmi Institute of Computer Science (www.kics.edu.pk)
> University of Engineering and Technology (www.uet.edu.pk)
> Lahore, PAKISTAN

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.jpg
Type: image/jpeg
Size: 19916 bytes
Desc: image002.jpg
Url : http://mm.icann.org/pipermail/arabic-vip/attachments/20110918/ff107ccb/image002-0001.jpg 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image004.jpg
Type: image/jpeg
Size: 14312 bytes
Desc: image004.jpg
Url : http://mm.icann.org/pipermail/arabic-vip/attachments/20110918/ff107ccb/image004-0001.jpg 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image009.jpg
Type: image/jpeg
Size: 14974 bytes
Desc: image009.jpg
Url : http://mm.icann.org/pipermail/arabic-vip/attachments/20110918/ff107ccb/image009-0001.jpg 

