[arabic-vip] WHOIS related query

Fahd Batayneh Fahd.Batayneh at NITC.gov.jo
Thu Aug 18 07:10:16 UTC 2011


At a ccTLD level, what you said is right Dr. Siavash, and it is the most feasible option. However, at a gTLD level where multiple languages exist, I think option number 1 is the best.

National Information Technology Center

Fahd A. Batayneh

Team Lead
National Domain Names Division
Data and Network Security Department

P.O.Box: 259  ▪  Amman 11941  ▪  Jordan
Tel: 962.6.5300225
Fax: 962.6.5300277
E-Mail: fahd.batayneh at nitc.gov.jo<mailto:fahd.batayneh at nitc.gov.jo>




-----Original Message-----
From: arabic-vip-bounces at icann.org [mailto:arabic-vip-bounces at icann.org] On Behalf Of Siavash Shahshahani
Sent: Thursday, August 18, 2011 6:59 AM
To: Manal Ismail
Cc: arabic-vip at icann.org
Subject: Re: [arabic-vip] WHOIS related query

Option 2 is what many ccTLDs practically do, i.e., limit themselves to part of the full table. This makes sense for a ccTLD as they are concerned with a limited community. I don't think 'we' should adopt a universal solution; it is the job of each registry to cope with this problem for itself according to the nature of the TLD it operates. Further note that this becomes a problem only if a bundling mechanism is used. A registry that uses indexing doesn't have to worry too much about this multiplicity of variants.
Siavash
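
The thread does not define "indexing", so the following is only one possible reading: each label is reduced to a canonical key, and two labels are treated as variants of each other when their keys match. Under that assumption the registry stores one key per registration and never has to enumerate the variants. The CANONICAL table, index_key and IndexedRegistry are hypothetical names made up for this sketch, not any registry's actual policy or software.

```python
# Hypothetical mapping from a code point to one canonical representative.
# Illustrative only; a real registry would use its own published table.
CANONICAL = {
    "\u06A9": "\u0643",  # ARABIC LETTER KEHEH     -> ARABIC LETTER KAF
    "\u06AA": "\u0643",  # ARABIC LETTER SWASH KAF -> ARABIC LETTER KAF
    "\u06CC": "\u064A",  # ARABIC LETTER FARSI YEH -> ARABIC LETTER YEH
}


def index_key(label):
    """Reduce a label to its canonical key, one code point at a time."""
    return "".join(CANONICAL.get(ch, ch) for ch in label)


class IndexedRegistry:
    """Stores one entry per canonical key instead of one per variant."""

    def __init__(self):
        self._by_key = {}

    def register(self, label):
        """Return True if registered, False if a variant is already taken."""
        key = index_key(label)
        if key in self._by_key:
            return False
        self._by_key[key] = label
        return True


registry = IndexedRegistry()
first = registry.register("\u0643\u064A")    # KAF + YEH
second = registry.register("\u06A9\u06CC")   # KEHEH + FARSI YEH, same key
print(first, second)
```

The cost of a lookup here depends on the label length, not on how many variants the label has, which is the property the paragraph above attributes to indexing.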

On Thu, 18 Aug 2011 01:49:39 +0200, "Manal Ismail" <manal at tra.gov.eg>
wrote:
> Thanks for all the clarifications ..
> Frankly I was talking practically (which is hard to accurately calculate)
> not theoretically ..
> But I fully agree with Sarmad that we should be catering for the worst
> case scenario or have criteria that guarantee that we'll never reach
> that point ..
>
> Having said that, I have to admit that I don't fully understand option 2
> below .. if I understand right, containing the language table won't limit
> the theoretical number of possible variants across the whole script, right?
> So how would this solve the problem?
>
> Kind Regards
>
> --Manal
>
> ________________________________
>
> From: sarmad.hussain at kics.edu.pk on behalf of Dr.Sarmad Hussain
> Sent: Wed 17/08/2011 07:58 PM
> To: Manal Ismail
> Cc: baher.esmat; Steve Sheng; arabic-vip at icann.org
> Subject: Re: [arabic-vip] WHOIS related query
>
>
> Dear Manal and All,
>
> Theoretically, if we have n letters in a label, and the letters have m
> variants each, then the total possibilities are m^n.  So for a 10 letter
> label, with say three variants per letter (e.g. kaf), we have 3^10 variants,
> i.e. about 59,000.  Now add an optional mark on each letter (two
> possibilities: with mark or without mark per letter; assuming these are
> considered equivalent); for a single sequence of n letters, there are 2^n
> possibilities, i.e. about 1,000 for n = 10.  Thus total possibilities with
> variants and marks would be 59K*1K, which is on the order of tens of
> millions, about 60 million (if my mathematics is correct).
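
The arithmetic in the paragraph above can be checked with a short sketch. It assumes, as the paragraph does, that every letter has the same number of variants and that each letter may or may not carry one optional mark; variant_count is a hypothetical helper name. Under those assumptions the product for a 10 letter label comes to roughly 60 million.

```python
def variant_count(n_letters, variants_per_letter, optional_marks=False):
    """Upper bound on equivalent labels for a label of n_letters letters.

    Assumes every letter has the same number of variants and, if
    optional_marks is set, that each letter is either marked or unmarked.
    """
    count = variants_per_letter ** n_letters      # m^n letter combinations
    if optional_marks:
        count *= 2 ** n_letters                   # 2^n mark combinations
    return count


letters_only = variant_count(10, 3)               # 3^10
with_marks = variant_count(10, 3, optional_marks=True)
print(letters_only)   # 59049
print(with_marks)     # 60466176
```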
>
> So Raed's estimates are without aerab/diacritical marks, just on letters.
>
> However, practically speaking, "real" words (if there is such a thing for
> a label definition) would yield fewer variants, and these make up most of
> the cases.
>
> Having said that, we must plan for boundary cases, not just "real" cases
> as the theoretical limits must also be catered for.
>
> Two possible solutions:
>
> 1. contain the variants by putting an upper limit
> 2. contain the language table to avoid generation of too many variants
> (harder to do, without significantly limiting linguistic expression)
>
> If we choose option 1, then we need terminology and mechanisms to
> articulate and enable this.
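
As a rough illustration of the two options, the sketch below generates variants lazily and stops at an upper limit (option 1), then repeats the count with a smaller table (option 2). The tables are made up for this example and are not any registry's language table; capped_variants and the limit values are likewise assumptions.

```python
from itertools import islice, product

# Hypothetical variant table: each code point maps to its interchangeable
# forms, with the code point itself listed first.
FULL_TABLE = {
    "\u0643": ["\u0643", "\u06A9", "\u06AA"],  # KAF, KEHEH, SWASH KAF
    "\u064A": ["\u064A", "\u06CC"],            # YEH, FARSI YEH
}

# Option 2: a restricted table that drops one of the KAF forms.
RESTRICTED_TABLE = {
    "\u0643": ["\u0643", "\u06A9"],
    "\u064A": ["\u064A", "\u06CC"],
}


def capped_variants(label, table, limit):
    """Return (variants, truncated): at most `limit` variants of `label`.

    The original label is included in the result. `truncated` is True when
    more variants exist than the cap allows.
    """
    choices = [table.get(ch, [ch]) for ch in label]
    generated = ("".join(combo) for combo in product(*choices))
    head = list(islice(generated, limit + 1))   # peek one past the cap
    return head[:limit], len(head) > limit


label = "\u0643\u064A\u0643"                     # KAF + YEH + KAF
capped, truncated = capped_variants(label, FULL_TABLE, 5)
full, _ = capped_variants(label, FULL_TABLE, 1000)
restricted, _ = capped_variants(label, RESTRICTED_TABLE, 1000)
print(len(capped), truncated)        # 5 True
print(len(full), len(restricted))    # 18 8
```

Restricting the table lowers the count for this label from 18 to 8, but the growth is still exponential in the label length, so a cap may be needed in either case.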
>
> I am not sure what is the best option at this time.
>
> regards,
> Sarmad
>
>
>
>
> On Wed, Aug 17, 2011 at 9:32 AM, Manal Ismail <manal at tra.gov.eg> wrote:
>
>
>       Does this have to do with using diacritics?
>
>       --Manal
>
>       ________________________________
>
>       From: arabic-vip-bounces at icann.org on behalf of baher.esmat
>       Sent: Wed 17/08/2011 02:30 PM
>       To: Steve Sheng; Sarmad Hussain
>       Cc: arabic-vip at icann.org
>       Subject: Re: [arabic-vip] WHOIS related query
>
>
>
>
>       On 8/16/11 9:15 PM, "Steve Sheng" <steve.sheng at icann.org> wrote:
>
>       > Another question is a stupid question from me, how many variants could
>       > an Arabic label have? Is it in the order of 10s, 100s or 1000s we are
>       > talking about? This has obvious implications for WHOIS output and
>       > registry WHOIS services.
>
>       If my memory serves me right, Raed Al-Fayez of (.sa), also a member of the
>       Arabic team, mentioned in a presentation at the ICANN meeting in Singapore
>       that there were cases, as per (.sa) policy, where the number of variants
>       for a single label could be as many as ~64,000.
>
>       Baher
