[arabic-vip] Review of Arabic Script Definitions - 22Sep11

Manal Ismail manal at tra.gov.eg
Sun Sep 25 12:40:32 UTC 2011


Dear Behnam ..
 
Thank you for your comments ..
I've tried to reflect them in the attached file, which is based on the last clean version circulated .. I won't accept changes till we hear more input on some of those changes, as in some cases they revert other suggestions ..
Please find my comments inline below in Blue ..
For the sake of time, it would be great if further comments are reflected directly in the document using track changes, as I'll be on my way to Nairobi for the IGF and might not have Internet access before tomorrow ..
 
Kind Regards
 
--Manal

________________________________

From: behnam at gmail.com on behalf of Behnam Esfahbod
Sent: Sun 25/09/2011 02:45 AM
To: Manal Ismail
Cc: arabic-vip at icann.org
Subject: Review of Arabic Script Definitions - 22Sep11



Dear Manal, all,

Following please find some points on the definitions document.

1. Joining and Non-Joining Letters

I think the definitions for Joining and Non-Joining letters
are not accurate enough. In fact, we should use Unicode's
definitions for these properties. In the Unicode book, Chapter
8, (table 8-3, page 248 of the latest edition) the following
categories are defined, based on Joining_Type character
property.

Thanks for the info, we definitely want to use the Unicode definitions .. please confirm that the attached reflects what you meant .. and that the reference in the footnote is accurate ..

1.1. Non-Joining Characters: Those characters that do not
connect to letters before or after them; i.e. U+0621 LETTER
HAMZA, U+0674 HIGH HAMZA, and U+200C ZWNJ.

1.2. Right-Joining Characters: Those characters that connect
to the letter before them; i.e. all letters based on Alef,
Reh, Dal, and Waw, and a few other letters.

1.3. Dual-Joining Characters: Those characters that connect
to the letters before and after them; i.e. all other Arabic
letters.

1.4. Join-Causing Characters: Those characters that connect
to the letters before and after them, but do not change
shape themselves; i.e. only U+200D ZWJ and U+0640 TATWEEL.

With respect to those categories, we can have the following
definitions:

1.5. Non-Joining Letters: The group of characters in 1.1
which are letters (by Unicode's definition); i.e. U+0621
LETTER HAMZA and U+0674 HIGH HAMZA.

1.6. Right-Joining Letters: The group of characters in 1.2
which are letters; i.e. all letters based on Alef, Reh, Dal,
and Waw, and a few other letters.

1.7. Dual-Joining Letters: The group of characters in 1.3
which are letters; i.e. all other Arabic letters.
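To make the relationship between the character categories (1.1 to 1.4) and the derived letter groups (1.5 to 1.7) concrete, here is a minimal Python sketch. It assumes a small hand-copied excerpt of Joining_Type values rather than the full Unicode data, and the names JOINING_TYPE, is_letter, and letters_of_type are hypothetical, chosen only for illustration. "Letter" is read here as a General_Category starting with "L".

```python
import unicodedata

# Hand-copied excerpt of Joining_Type values. This is an assumption made
# for illustration; it is not the complete Unicode table.
JOINING_TYPE = {
    "\u0621": "U",  # ARABIC LETTER HAMZA (non-joining)
    "\u0674": "U",  # ARABIC LETTER HIGH HAMZA (non-joining)
    "\u200C": "U",  # ZERO WIDTH NON-JOINER (non-joining, not a letter)
    "\u0627": "R",  # ARABIC LETTER ALEF (right-joining)
    "\u062F": "R",  # ARABIC LETTER DAL (right-joining)
    "\u0631": "R",  # ARABIC LETTER REH (right-joining)
    "\u0648": "R",  # ARABIC LETTER WAW (right-joining)
    "\u0628": "D",  # ARABIC LETTER BEH (dual-joining)
    "\u0633": "D",  # ARABIC LETTER SEEN (dual-joining)
    "\u200D": "C",  # ZERO WIDTH JOINER (join-causing)
    "\u0640": "C",  # ARABIC TATWEEL (join-causing)
}


def is_letter(ch):
    """True if the General_Category of ch is one of the letter classes (L*)."""
    return unicodedata.category(ch).startswith("L")


def letters_of_type(joining_type):
    """Characters in the excerpt with the given joining type that are letters."""
    return sorted(
        ch for ch, jt in JOINING_TYPE.items()
        if jt == joining_type and is_letter(ch)
    )


non_joining_letters = letters_of_type("U")    # 1.5
right_joining_letters = letters_of_type("R")  # 1.6
dual_joining_letters = letters_of_type("D")   # 1.7

for name, group in [
    ("non-joining", non_joining_letters),
    ("right-joining", right_joining_letters),
    ("dual-joining", dual_joining_letters),
]:
    print(name, [f"U+{ord(ch):04X}" for ch in group])
```

In this excerpt, U+200C ZWNJ is a non-joining character but drops out of the non-joining letters because its general category is not a letter class, which is the distinction 1.5 draws against 1.1.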

2. Ligature

Unicode's definition for "Ligature" says "a combination of
two or more characters". Why are we saying "one or more
Arabic Letters"?

If the idea is to simplify the definition for the Arabic
script, I don't see why we should use "one or more letters"
instead of "two or more letters".

If we are talking about any Arabic ligature in our report
that is made of only one Arabic letter and some other
combining marks, could you please point that out?


the definition originally read "two or more" but was changed by Sarmad to "one or more" .. I'll delay finalizing this till Sarmad is able to clarify his point ..

3. Forms of a Letter

3.1. In this section, the word "ligature" is misused in the
definitions for the four shaping forms. What you meant here
is "the group of letters that are joined together", which is
not the definition of "ligature". What we have been using in
technical contexts for this concept is "joining run". I
strongly recommend we agree on a term for this concept before
we deliver our report. Does anyone have another term in mind
that we can use here?

I'll wait for feedback here ..

3.2. In the definitions of "Initial form" and "Medial form",
instead of "joining letter" it should be "right-joining
letter".

done

3.3. In the definition of "Final form", "It is the form of
a *joining* letter" is correct. You have missed the
"joining" part.

done

3.4. Because of 3.3, we should first define "Joining Letter"
in the section "Joining and Non-Joining Letters" as the
union of "Right-Joining Letters" and "Dual-Joining Letters".

Is this meant to replace the existing definition, or is it in addition to the definitions suggested in (1) above? Please note that I have also deleted the footnote .. please confirm that the attached reflects what you meant ..
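As a rough illustration of how the four shaping forms in section 3 follow from the joining types in section 1, here is a minimal Python sketch. It is a simplification under stated assumptions: the joining-type table is a small hand-copied excerpt, characters missing from it are treated as non-joining, combining marks are not handled, and the function name positional_forms is hypothetical. It is not the wording proposed for the definitions document.

```python
# Hand-copied excerpt of Joining_Type values (assumption for illustration).
JOINING_TYPE = {
    "\u0621": "U",  # ARABIC LETTER HAMZA (non-joining)
    "\u0627": "R",  # ARABIC LETTER ALEF (right-joining)
    "\u062F": "R",  # ARABIC LETTER DAL (right-joining)
    "\u0628": "D",  # ARABIC LETTER BEH (dual-joining)
    "\u0633": "D",  # ARABIC LETTER SEEN (dual-joining)
    "\u0645": "D",  # ARABIC LETTER MEEM (dual-joining)
    "\u0640": "C",  # ARABIC TATWEEL (join-causing)
}


def joining_type(ch):
    """Joining type of ch; anything outside the excerpt counts as non-joining."""
    return JOINING_TYPE.get(ch, "U")


def positional_forms(text):
    """Return one form name per character of text (logical order)."""
    forms = []
    for i, ch in enumerate(text):
        jt = joining_type(ch)
        prev_jt = joining_type(text[i - 1]) if i > 0 else "U"
        next_jt = joining_type(text[i + 1]) if i + 1 < len(text) else "U"

        # A joining letter (R or D) connects to the preceding character
        # only if that character can connect forward (D or C).
        joins_prev = jt in ("R", "D") and prev_jt in ("D", "C")
        # Only a dual-joining letter connects to the following character,
        # and only if that character can connect backward (R, D or C).
        joins_next = jt == "D" and next_jt in ("R", "D", "C")

        if jt == "C":
            forms.append("unchanged")  # join-causing: shape does not change
        elif joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


# BEH ALEF BEH: the right-joining ALEF ends the joining run,
# so the second BEH stands alone.
print(positional_forms("\u0628\u0627\u0628"))
# SEEN MEEM SEEN: three dual-joining letters form a single joining run.
print(positional_forms("\u0633\u0645\u0633"))
```

Each maximal stretch of connected characters in the output corresponds to what 3.1 calls a "joining run".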


4. Writing Style

4.1. I think "both use the Arabic script" should be replaced
by something like "both are different styles of writing one
script, the Arabic script".

I just tried to stick to definitions that exist to the extent possible .. and this is a 'copy' & 'paste' of the definition of the term from http://www.rfc-editor.org/rfc/rfc6365.txt .. I'll await your confirmation to edit the definition as suggested and accordingly change the reference in the footnote to read "motivated from http://www.rfc-editor.org/rfc/rfc6365.txt" just to reflect that it is edited .. Awaiting your confirmation ..

5. Label Valid Character

5.1. Would you please help me understand this. We have "A
Label Valid Character is represented by a sequence of one or
more Label Valid Code Points." What do you mean by
more-than-one code-points for a character?


I got your point and agree with your remark .. I also could not trace back how we came up with this definition .. so I'll delete this last sentence, unless it was intended to cater for a combination of more than one code point resulting in a valid character (and frankly I'm not sure if this is technically accurate) or unless someone else has a better explanation ..
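One case where more than one code point corresponds to a single character, offered only as a possible reading and not as what the definition was actually meant to cover: ALEF WITH MADDA ABOVE can be encoded as the single code point U+0622 or as the sequence U+0627 U+0653, and Unicode normalization treats the two as equivalent. A minimal Python sketch, with variable names chosen for illustration:

```python
import unicodedata

precomposed = "\u0622"       # ARABIC LETTER ALEF WITH MADDA ABOVE
decomposed = "\u0627\u0653"  # ARABIC LETTER ALEF + ARABIC MADDAH ABOVE

# The two strings differ in length as code point sequences ...
print(len(precomposed), len(decomposed))

# ... but NFC normalization composes the sequence into the single code point,
# and NFD decomposes the single code point back into the sequence.
composed = unicodedata.normalize("NFC", decomposed)
expanded = unicodedata.normalize("NFD", precomposed)
print(composed == precomposed, expanded == decomposed)
```

Whether such sequences should count as a "Label Valid Character" is exactly the open question in 5.1; the sketch only shows that the one-to-many relationship exists at the encoding level.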

Thanks all for all the efforts on writing this document.
-Behnam


-------------- next part --------------
A non-text attachment was scrubbed...
Name: Arabic Script Definitions - 25Sep11 w TC.doc
Type: application/msword
Size: 81920 bytes
Desc: Arabic Script Definitions - 25Sep11 w TC.doc
Url : http://mm.icann.org/pipermail/arabic-vip/attachments/20110925/4c5b7e79/ArabicScriptDefinitions-25Sep11wTC-0001.doc 

