[arabic-vip] Day 1 proceedings summary

Siavash Shahshahani shahshah at irnic.ir
Fri Sep 16 12:19:52 UTC 2011


Hi Sarmad and All,
It seems to me that we missed one strange case in our variant character
list: the medial form of U+06A0 is 'almost' the same as the medial forms
of U+06A4 and U+06A8. I say 'almost' because the slight difference is
noticeable when rendered large enough in DejaVu Sans, but in the small
print used to type domain names they are almost certain to be mistaken
for one another. BTW, both U+06A0 and U+06A4 are used in Jawi.
Regards,
Siavash
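
For anyone who wants to look at the three medial forms side by side, here
is a minimal Python sketch. It relies on the usual trick of wrapping a
letter in ZWJ (U+200D) on both sides so that a shaping engine renders its
medial form; the helper name medial_probe is made up for this example, and
how close the glyphs look depends entirely on the font and size used to
display the output.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# The three code points discussed in this message.
CANDIDATES = (0x06A0, 0x06A4, 0x06A8)


def medial_probe(cp: int) -> str:
    """Wrap the letter in ZWJ on both sides to request its medial form."""
    return ZWJ + chr(cp) + ZWJ


if __name__ == "__main__":
    for cp in CANDIDATES:
        # Print the code point, its Unicode name, and the medial-form probe.
        print(f"U+{cp:04X}  {unicodedata.name(chr(cp))}  {medial_probe(cp)}")
```

Pasting the printed probes into a browser address bar or a terminal at
normal text size is a quick way to judge how confusable they are.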


On Wed, 14 Sep 2011 07:36:40 -0700, "Sarmad Hussain"
<sarmad.hussain at kics.edu.pk> wrote:
> Dear All,
> 
>  
> 
> The following is the summary of the salient points from our face to face
> interaction yesterday (details will be captured in the document being
> revised):
> 
>  
> 
> A.      General principles:
> 
>  
> 
> 1.       Agreed to talk generally about the TLD space, without making a
> distinction between ccTLDs and gTLDs (and to specify where our
> recommendations or comments may diverge)
> 
> 2.       Agreed to limit the scope to TLDs (not second or other level
> labels), unless the recommendations apply to all levels (where it should
> be made explicit)
> 
> 3.       Though the committee is generally confident in the
> recommendations, some issues may be discussed with representatives of
> language communities not represented on the committee (e.g. use of Arabic
> script in African languages)
> 
>  
> 
> B.      The meeting started with a discussion of the character set
> allowed for TLDs, and the following was agreed:
> 
>  
> 
> 1.       Even though there may be some policy to restrict the use of ZWNJ
> in the TLDs, the committee felt that due to its use in Arabic script,
> there may be a need for ZWNJ by the community (even though there may be
> limited use at this time)
> 
> 2.       ZWJ is not needed in Arabic script
> 
> 3.       0610-061A: an issue as they are PVALID but should not be
> allowed for TLDs
> 
> 4.       0621-063F: OK, PVALID and needed for TLDs
> 
> 5.       0641-064A: OK, PVALID and needed for TLDs
> 
> 6.       064B-0659: an issue as they are PVALID but should not be
> allowed for TLDs
> 
> 7.       065A-065F: an issue as they are PVALID but should not be
> allowed for TLDs
> 
> 8.       General rule may be extracted that combining marks are not
> allowed for TLDs (but see A.3, regarding combining marks for African
> languages, etc., if they limit the language in question)
> 
> 9.       0660-0669: an issue as they are PVALID but should not be
> allowed for TLDs because they are digits
> 
> 10.   066E-066F: an issue as they are PVALID but should not be allowed
> for TLDs because they are archaic
> 
> 11.   0670: an issue as it is PVALID but should not be allowed for TLDs
> 
> 12.   0679-06D3: OK, PVALID and needed for TLDs
> 
> 13.   06D5: OK, PVALID and needed for TLDs
> 
> 14.   06D6-06DC: an issue as they are PVALID but should not be allowed
> for TLDs
> 
> 15.   06DF-06E8: an issue as they are PVALID but should not be allowed
> for TLDs
> 
> 16.   06EA-06ED: an issue as they are PVALID but should not be allowed
> for TLDs
> 
> 17.   06EE-06EF: OK, PVALID and needed for TLDs
> 
> 18.   06F0-06F9: an issue as they are PVALID but should not be allowed
> for TLDs because they are digits
> 
> 19.   06FA-06FF: OK, PVALID and needed for TLDs
> 
> 20.   0750-077F: OK, PVALID and needed for TLDs
> 
> 21.   FE73: an issue as it is PVALID but should not be allowed in any
> label (TLDs and other labels)
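
The ranges in section B can be turned into a simple repertoire check. The
Python sketch below is only an illustration under stated assumptions: it
accepts exactly the ranges marked "OK, PVALID and needed for TLDs" above
and rejects everything else, including ZWNJ, whose status item 1 leaves
open. The names TLD_OK_RANGES, is_tld_ok and disallowed_code_points are
made up for this example and do not come from any existing tool.

```python
# Inclusive code point ranges marked "OK, PVALID and needed for TLDs"
# in section B of the summary. Everything else is rejected by this sketch.
TLD_OK_RANGES = [
    (0x0621, 0x063F),
    (0x0641, 0x064A),
    (0x0679, 0x06D3),
    (0x06D5, 0x06D5),
    (0x06EE, 0x06EF),
    (0x06FA, 0x06FF),
    (0x0750, 0x077F),
]


def is_tld_ok(ch: str) -> bool:
    """Return True if the character falls in one of the OK ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in TLD_OK_RANGES)


def disallowed_code_points(label: str) -> list[str]:
    """List the code points in a label that are outside the OK ranges."""
    return [f"U+{ord(ch):04X}" for ch in label if not is_tld_ok(ch)]


if __name__ == "__main__":
    # A label made of MEEM, SAD, REH passes; adding an Arabic-Indic
    # digit (U+0661) is reported as disallowed.
    print(disallowed_code_points("\u0645\u0635\u0631"))
    print(disallowed_code_points("\u0645\u0635\u0631\u0661"))
```

An allow-list is used rather than a deny-list so that code points the
summary does not mention at all (for example U+0640) are rejected by
default instead of slipping through.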
> 
>  
> 
> C.      The following was discussed regarding variants:
> 
>  
> 
> 1.       There may be four categories: identical, confusingly similar,
> optional and interchangeable.  Refer to tables in the document for the
> following additional observations:
> 
> 2.       For identical
> 
> a.       Kaf set – limit to one at TLD level; all are possible for TLD
> registration (none preferred over the others; depends on registrant
> request)
> 
> b.      Hay set – limit to one at TLD level; all are possible for TLD
> registration (none preferred over the others; depends on registrant
> request)
> 
> c.       Yay set - limit to one at TLD level; all are possible for TLD
> registration (none preferred over the others; depends on registrant
> request)
> 
> d.      Fay set - limit to one at TLD level; all are possible for TLD
> registration (none preferred over the others; depends on registrant
> request)
> 
> e.      Tay marbuta - limit to one at TLD level; all are possible for
> TLD registration (none preferred over the others; depends on registrant
> request)
> 
> f.        Hay hamza - limit to one at TLD level; all are possible for
> TLD registration (none preferred over the others; depends on registrant
> request)
> 
> g.       Theh group - limit to one at TLD level; all are possible for
> TLD registration (none preferred over the others; depends on registrant
> request) (confusable with pay, not Thay)
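
The "limit to one at TLD level" rule for identical variants can be sketched
as a registry that maps every label to a variant key and refuses a second
label with the same key. This is only a rough Python illustration: the
members of the two sets below are assumptions chosen for the example (the
committee's tables are not reproduced in this message and remain the
authoritative source), and VARIANT_SETS, variant_key and TldRegistry are
hypothetical names.

```python
# ASSUMED set members, for illustration only; the real sets are defined
# in the committee's tables, which this message does not reproduce.
VARIANT_SETS = [
    {"\u0643", "\u06A9"},  # Kaf set (assumed: ARABIC KAF, KEHEH)
    {"\u064A", "\u06CC"},  # Yay set (assumed: ARABIC YEH, FARSI YEH)
]

# Map each member to one representative of its set. The representative is
# only an internal key; no form is preferred for registration.
CANONICAL = {ch: min(s) for s in VARIANT_SETS for ch in s}


def variant_key(label: str) -> str:
    """Replace each character by its set representative."""
    return "".join(CANONICAL.get(ch, ch) for ch in label)


class TldRegistry:
    """Accept any variant form, but only one label per variant key."""

    def __init__(self) -> None:
        self._by_key: dict[str, str] = {}

    def register(self, label: str) -> bool:
        key = variant_key(label)
        if key in self._by_key:
            return False  # a variant of this label is already registered
        self._by_key[key] = label  # keep the form the registrant asked for
        return True


if __name__ == "__main__":
    reg = TldRegistry()
    print(reg.register("\u0643\u062A\u0627\u0628"))  # first form: accepted
    print(reg.register("\u06A9\u062A\u0627\u0628"))  # its variant: refused
```

Whichever form is requested first is the one stored, which matches the
note that no member of a set is preferred over the others.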
> 
> 3.       For Similar
> 
> a.       Kaf set – OK 
> 
> b.      Yay set – OK
> 
> c.       Alif Hamza above set – OK
> 
> d.      Alif Hamza below set – OK
> 
> e.      Dot orientation: could be variants, so it is an issue, but should
> be investigated further with feedback from relevant language communities
> (not represented on the committee) for further resolution.
> 
> 4.       Interchangeable
> 
> a.       Alifs (simple, with hamza, with madda): not variants, though
> may be confusable; issue to be raised
> 
> b.      Tay marbuta and hay: not variants, though may be confusable;
> issue to be raised
> 
> 5.       Other
> 
> a.       Digits have variants, though not relevant for TLDs
> 
> b.      ZWNJ case causes variants in labels with the three characters
> mentioned.  It should not be allowed with these three characters, in
> addition to the existing rule
> 
>  
> 
>  
> 
>  
> 
>  
> 
>  
> 
> Regards,
> Sarmad
> 
>  
> 
>  
> 
>  
> 
> ----
> 
> سرمد حسین 
> 
>  
> 
> Sarmad Hussain 
> 
> Professor and Head
> 
> Center for Language Engineering (www.cle.org.pk)
> 
> Al-Khawarizmi Institute of Computer Science (www.kics.edu.pk)
> 
> University of Engineering and Technology (www.uet.edu.pk)
> 
> Lahore, PAKISTAN

