[UA-discuss] Programming Language Hacks - UA103

Stuart Stuple stuartst at microsoft.com
Wed Aug 9 20:46:27 UTC 2017


Yes, bidi is hard but fascinating.

From my work with text stacks, my understanding is that it is incorrect to assume that something of the form rtl.rtl.ltr has a predetermined rendering order. The result really depends on which character is seen as the first strongly typed character in the first domain name. The Arabic/Hebrew/N’ko scripts all have an RTL script order within the RTL text direction for each language. Arabic and Hebrew both commonly use characters (Unicode common) that the BiDi algorithm is required to treat as strongly typed with LTR script order. Because of that, I doubt it’s enough to specify just the text direction for each element.
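
To make the "first strongly typed character" point concrete, here is a minimal Python sketch of how a base direction can be derived from the first strong character, loosely following the paragraph-level rules of the Unicode Bidirectional Algorithm. The function name and the "ltr" fallback are assumptions made for illustration, and the sketch ignores isolates and embeddings, so treat it as a rough aid rather than a full implementation.

```python
import unicodedata

# Bidi classes that count as "strong": L is left-to-right,
# R (e.g. Hebrew) and AL (Arabic letters) are right-to-left.
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def first_strong_direction(text, default="ltr"):
    """Return 'ltr' or 'rtl' based on the first strong character in text.

    Hypothetical helper: digits, dots and other weak or neutral
    characters are skipped; if no strong character is found,
    the assumed default is returned.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            return "ltr"
        if bidi_class in STRONG_RTL:
            return "rtl"
    return default


if __name__ == "__main__":
    # Hebrew label first, then an ASCII label.
    print(first_strong_direction("\u05d0\u05d1.com"))
    # ASCII label first, then a Hebrew label.
    print(first_strong_direction("example.\u05d0\u05d1"))
```

The same two labels give a different base direction depending on which one comes first, which is why specifying only a per-element direction may not be enough.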

From: ua-discuss-bounces at icann.org [mailto:ua-discuss-bounces at icann.org] On Behalf Of Richard Merdinger
Sent: Wednesday, August 9, 2017 1:31 PM
To: Andrew Sullivan <ajs at anvilwalrusden.com>; ua-discuss at icann.org
Subject: Re: [UA-discuss] Programming Language Hacks - UA103

Makes sense to me; I like mentioning the major-use writing system to make the point, but it also makes it clear that it is broader than a single case.

--Rich

Richard Merdinger
VP, Domains - GoDaddy
rmerdinger at godaddy.com



From: <ua-discuss-bounces at icann.org> on behalf of Andrew Sullivan <ajs at anvilwalrusden.com>
Date: Wednesday, August 9, 2017 at 3:19 PM
To: "ua-discuss at icann.org" <ua-discuss at icann.org>
Subject: Re: [UA-discuss] Programming Language Hacks - UA103

On Wed, Aug 09, 2017 at 04:13:35PM +0000, Mark Svancarek via UA-discuss wrote:
> Actually, we recently discovered an Edge bug (via the browser review) where the order of labels in an RTL.RTL.ASCII domain name was transposed during rendering. So I like calling it out explicitly.

This has been a regularly-recurring bug in various rendering engines
since at least 2008, because I recall the demonstrations of it during
the idnabis WG, and then seeing it in a completely different context
during the VIP work for ICANN in 2011 or '12.  It's not always only
Arabic: at least one of the examples was reproducible in any bidi
context.  I seem to recall one example where the wire order

    [firstlabel]RTL[secondlabel]RTL[thirdlabel]LTR[fourthlabel]NULL

got rendered as

    RTL.LTR.RTL

Which I thought was a pretty cool bug.  I have no idea how it happened
that way, though I recall walking myself through the bidi algorithm at
the time and figuring out what the problem must have been.  Bidi is
hard.
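
For anyone who wants to reproduce this kind of report, a small Python sketch along the following lines can record the direction of each label in wire order, so that it can be compared against what a rendering engine actually displays. The helper names are hypothetical, and the classification is a simplification: a label counts as RTL if it contains any character of bidi class R or AL. It does not run the bidi algorithm itself.

```python
import unicodedata


def label_direction(label):
    """Classify one label as 'RTL', 'LTR' or 'NEUTRAL' (simplified rule)."""
    classes = {unicodedata.bidirectional(ch) for ch in label}
    if classes & {"R", "AL"}:
        return "RTL"
    if "L" in classes:
        return "LTR"
    # Empty labels (such as the root after a trailing dot) and
    # digit-only labels end up here.
    return "NEUTRAL"


def wire_order_directions(domain):
    """Return the direction of each label in wire (logical) order."""
    return [label_direction(label) for label in domain.split(".")]


if __name__ == "__main__":
    # Hebrew label, Arabic label, ASCII label, in that wire order.
    domain = "\u05d0\u05d1.\u0627\u0628.com"
    print(".".join(wire_order_directions(domain)))
```

If the displayed label sequence does not correspond to this list read in one consistent direction, the labels have been transposed during rendering.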

I therefore think it wise not to call out Arabic especially -- but
maybe point out that Arabic is perhaps the most prominent writing
system that uses RTL, so that programmers aren't tempted to dismiss
the problem as a "corner case".  Big corner, the Arabic-using
population!

Best regards,

A

--
Andrew Sullivan
ajs at anvilwalrusden.com
