[arabic-vip] Digit mixing

ajs at anvilwalrusden.com ajs at anvilwalrusden.com
Wed Aug 24 14:22:06 UTC 2011


Re-sending from my proper address.  Apologies for any duplicates.

On Wed, Aug 24, 2011 at 11:05:52AM +0430, Alireza Saleh wrote:
> Dr. Shahshahani told me there was a discussion in the telephone conference about why Western Arabic digits cannot be mixed with Eastern Arabic and ASCII digits. Here is the answer:
> 
> The digit mixing is not allowed in IDNA2008 because of the BIDI properties of the three well known digit sets:
> 
> ASCII and Eastern Arabic digits (U+06F0..U+06F9) are EN (European Number), while Western Arabic digits (U+0660..U+0669) are AN (Arabic Number), so mixing AN and EN digits causes visual confusion within a label in some cases. Look at the attached example: I entered the same kind of sequence for both labels, but I mixed two digit sets in the first one, and you can see that the Eastern Arabic digit 6 jumps over the hyphen.
> 
> First example:
> Network order: [U+05E2, U+06F4, U+0665, U+002D, U+06F6]
> Visual order: [U+06F6, U+002D, U+06F4, U+0665, U+05E2]
> 
> Second example:
> Network order: [U+05E2, U+06F4, U+06F5, U+002D, U+06F6]
> Visual order: [U+06F4, U+06F5, U+002D, U+06F6, U+05E2]
> 
> Because it creates unstable labels, digit mixing is disallowed in IDNA2008.

That is part of the reasoning.
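
For anyone who wants to check the bidi classes involved, here is a
minimal sketch using Python's standard unicodedata module.  The helper
name bidi_classes is only illustrative.  Note that Unicode itself names
U+0660..U+0669 "ARABIC-INDIC DIGIT" and U+06F0..U+06F9 "EXTENDED
ARABIC-INDIC DIGIT".  The sketch only reports classes; it does not
run the bidi algorithm or reproduce the visual orders quoted above.

```python
import unicodedata


def bidi_classes(code_points):
    """Return the Unicode bidi class of each code point, in network order."""
    return [unicodedata.bidirectional(chr(cp)) for cp in code_points]


# The two labels from the quoted message, in network order.
first = [0x05E2, 0x06F4, 0x0665, 0x002D, 0x06F6]
second = [0x05E2, 0x06F4, 0x06F5, 0x002D, 0x06F6]

print(bidi_classes(first))   # ['R', 'EN', 'AN', 'ES', 'EN']
print(bidi_classes(second))  # ['R', 'EN', 'EN', 'ES', 'EN']
```

The only difference between the two labels is the AN digit (U+0665)
in the first one, where the second has an EN digit (U+06F5).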

Recall that there are two tests for code points under IDNA2008.
First, there is the "tables" test: does the code point meet the rules
set out in RFC 5892?  Those rules make all these digits (0660..0669
and 06F0..06F9) CONTEXTO.  The context rule registry explicitly
disallows the mixing of these sets (see RFC 5892 section 2.6 and
appendix A.8 and A.9).
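
As a rough illustration of what those two context rules amount to for
a single label, here is a small sketch.  It is my own paraphrase, not
the registry's rule text, and the function name is made up for the
example.  It checks only this one restriction, not the rest of the
tables test.

```python
# U+0660..U+0669 and U+06F0..U+06F9, per RFC 5892 appendix A.8 and A.9.
ARABIC_INDIC = range(0x0660, 0x066A)
EXTENDED_ARABIC_INDIC = range(0x06F0, 0x06FA)


def mixes_arabic_digit_sets(label):
    """True if the label contains digits from both sets (not allowed)."""
    has_arabic_indic = any(ord(ch) in ARABIC_INDIC for ch in label)
    has_extended = any(ord(ch) in EXTENDED_ARABIC_INDIC for ch in label)
    return has_arabic_indic and has_extended


# The two example labels from the quoted message.
first = "\u05E2\u06F4\u0665\u002D\u06F6"
second = "\u05E2\u06F4\u06F5\u002D\u06F6"

print(mixes_arabic_digit_sets(first))   # True
print(mixes_arabic_digit_sets(second))  # False
```

ASCII digits are not covered by these two rules; mixing them with
U+0660..U+0669 is a matter for the bidi rule instead.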

On top of that, the label needs to pass the bidi rule (RFC 5893),
which in a right-to-left label does not allow EN and AN digits to
appear together.

Note that the bidi rule (and indeed the protocol rule above) _does
not_ apply across labels.  So there is no inter-label test (for
reasons having to do with how the DNS protocol works -- I can explain
in detail if anyone cares).  This means that, while the labels Alireza
outlines above are not allowed, the protocol cannot rule out a name
like [some characters][U+0663].[U+06F7][some characters].  Such a name
will still cause display problems (and possibly confusion problems).
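
To make the per-label point concrete, here is a sketch with a
hypothetical name (U+0628 stands in for "some characters"; it is my
choice, not part of any real registration).  It looks only at which
digit sets appear and does not attempt the bidi rule or any other
check.

```python
ARABIC_INDIC = range(0x0660, 0x066A)           # U+0660..U+0669
EXTENDED_ARABIC_INDIC = range(0x06F0, 0x06FA)  # U+06F0..U+06F9


def digit_sets(text):
    """Return the names of the Arabic digit sets that occur in text."""
    found = set()
    for ch in text:
        if ord(ch) in ARABIC_INDIC:
            found.add("arabic-indic")
        elif ord(ch) in EXTENDED_ARABIC_INDIC:
            found.add("extended-arabic-indic")
    return found


# Hypothetical name of the shape [chars][U+0663].[U+06F7][chars].
name = "\u0628\u0663.\u06F7\u0628"

# Each label on its own uses a single digit set ...
per_label = [digit_sets(label) for label in name.split(".")]
print([len(s) for s in per_label])  # [1, 1]

# ... but the name as displayed contains both.
print(len(digit_sets(name)))        # 2
```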

A

-- 
Andrew Sullivan
ajs at anvilwalrusden.com

