[UA-discuss] Punycode Converters

Fri Nov 3 20:29:16 UTC 2017

On 11/3/2017 12:53 PM, Jim DeLaHunt wrote:
>
> Don:
>
> In talking about punycode converters that "produce bad results", we 
> probably have to distinguish between a narrow, technical view of "bad 
> results", and a more system-level, user view of "bad results". Which 
> did the UASG Workshop discussion refer to?

It's not just the converters in the browsers, it's converters in various 
platform libraries as well. Some of those are not even well documented, 
so you can't tell, without experimentation, what rules they follow.

In this context, will UASG adopt a position vis-a-vis UTS#46? That 
standard attempts to somehow handle both IDNA2003 and IDNA2008 labels. I 
haven't looked into how many valid IDNA2008 labels it rejects, but 
it certainly handles many IDNA2003 ones.
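
One way to find out by experiment which rule set a converter follows is 
to probe it with a label that the two IDNA generations treat differently. 
What follows is only a minimal sketch in Python, assuming the converter 
can be wrapped as a function from a Unicode label to an ASCII string. The 
names classify_converter and stdlib_to_ascii are made up for this 
illustration. The probe is "straße": IDNA2003 maps ß to "ss", while 
IDNA2008 keeps ß and yields xn--strae-oqa. Python's built-in "idna" codec 
is documented as implementing RFC 3490 (IDNA2003), so it is used as the 
subject here.

```python
def classify_converter(to_ascii):
    """Guess which IDNA rule set a converter follows from one probe label.

    to_ascii is any callable taking a Unicode label and returning ASCII text.
    """
    try:
        result = to_ascii("straße").lower()
    except Exception:
        return "rejects the probe"
    if result == "strasse":
        # Sharp s was mapped to "ss" before encoding.
        return "IDNA2003-style mapping"
    if result == "xn--strae-oqa":
        # Sharp s survived and was Punycode-encoded.
        return "IDNA2008-style (sharp s preserved)"
    return "unknown: " + result


def stdlib_to_ascii(label):
    # Python's built-in codec, documented as RFC 3490 (IDNA2003).
    return label.encode("idna").decode("ascii")


print(classify_converter(stdlib_to_ascii))
```

UTS#46 transitional processing also maps ß to "ss", so this single probe 
cannot tell it apart from plain IDNA2003; a real probe set would need 
more labels.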

Naively, "universal acceptance" would seem to mean you'd want this kind 
of permissive handling, but it lands you deep in the morass of emoji 
labels, among other things.

A./
>
> Specifically, to your questions,
>
> On 2017-11-03 09:42, Don Hollander wrote:
>> During the UASG Workshop in Abu Dhabi there was a brief discussion about Punycode converters.
>>
>> 1) Is anyone aware of any punycode converters (particularly in libraries) that produce bad results?
> As a software engineer, I'm confident that in the narrow technical 
> sense, many punycode converters produce bad results. In other words, 
> they probably have bugs.  Most software does.  They might be rare, 
> however.
>
> Also, I'm confident that many apps or systems which use 
> internationalised domain names do the conversion to and from A-Label 
> form (punycode conversion) wrong, even if the libraries they use 
> behave correctly. This would be due to bugs in how the app or system 
> uses the library.
>> 2) Is there a test suite that can be used to test Punycode converters?
> In the narrow, technical sense, our UASG018 /Programming Languages 
> Evaluation Criteria/ document is a test suite, or at least 
> instructions on how to construct a test suite. The obvious next step 
> for UASG018 is to implement actual test suites, that is, runnable 
> software test code, which exercise the library's Punycode conversion 
> functionality (among other things).
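
To make "runnable software test code" concrete, here is a small 
table-driven sketch in Python. It is not the UASG018 suite: the three 
vectors are hand-picked illustrations, and check_converter, py_encode and 
py_decode are names invented for this example. It exercises only the raw 
Punycode step (no "xn--" prefix, no IDNA mapping), using Python's built-in 
"punycode" codec as the converter under test.

```python
# (Unicode label, expected Punycode without the "xn--" prefix)
VECTORS = [
    ("bücher", "bcher-kva"),
    ("münchen", "mnchen-3ya"),
    ("straße", "strae-oqa"),
]


def check_converter(encode, decode):
    """Run every vector both ways; return a list of failure messages."""
    failures = []
    for label, expected in VECTORS:
        got = encode(label)
        if got != expected:
            failures.append(f"encode({label!r}) gave {got!r}, expected {expected!r}")
        back = decode(expected)
        if back != label:
            failures.append(f"decode({expected!r}) gave {back!r}, expected {label!r}")
    return failures


def py_encode(label):
    # Converter under test: Python's built-in raw Punycode codec.
    return label.encode("punycode").decode("ascii")


def py_decode(text):
    return text.encode("ascii").decode("punycode")


failures = check_converter(py_encode, py_decode)
print(failures or "all vectors pass")
```

Any other library can be tested by passing its own encode and decode 
functions to check_converter.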
>
> In the system-level, user view, our other evaluation activities would 
> be that "test suite". For instance, the /Evaluation of UA Readiness of 
> Popular Websites/, the /Universal Acceptance of Popular Browser 
> (UASG016)/, etc.
>> 3) Would the source of input (typed, cut/paste, input from a data file) make any difference? This probably has to do with RTL scripts.
>
> In the narrow, technical sense, the source of input should make no 
> difference at all. The Punycode conversion algorithm doesn't depend on 
> the source of input.  It starts with a sequence of data, and the 
> source of that data is not material.
>
> In the system-level, user view, the source of input might well make a 
> difference. I would expect that this takes the form of how the app 
> handles the data before it calls the library. When the user selects a 
> domain name, does the app select all the necessary characters?  Does 
> the app implement the Unicode bidi algorithm correctly, for text with 
> both right-to-left and left-to-right components? Does the app pass the 
> domain name correctly to the library? And so on.
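
The split between the two views can be shown with a short Python sketch. 
The converter sees only a sequence of code points, so the same label 
typed and pasted can encode differently if the pasted copy carries a 
decomposed character, a stray directional mark, or trailing whitespace. 
prepare_label is a hypothetical app-side cleanup step written for this 
example, and dropping directional marks is an assumption made for 
illustration, not a recommendation from any standard.

```python
import unicodedata

BIDI_MARKS = {"\u200e", "\u200f"}  # LEFT-TO-RIGHT MARK, RIGHT-TO-LEFT MARK


def prepare_label(raw):
    """Hypothetical cleanup an app might do before calling a converter."""
    stripped = "".join(ch for ch in raw.strip() if ch not in BIDI_MARKS)
    # NFC composes "u" + combining diaeresis into a single precomposed character.
    return unicodedata.normalize("NFC", stripped)


def to_punycode(label):
    # Raw Punycode step only, via Python's built-in codec.
    return label.encode("punycode").decode("ascii")


typed = "b\u00fccher"            # precomposed u-umlaut
pasted = "\u200ebu\u0308cher "   # LRM, decomposed u-umlaut, trailing space

# Different code point sequences give different encodings.
print(to_punycode(typed) == to_punycode(pasted))
# After the same cleanup, both inputs encode identically.
print(to_punycode(prepare_label(typed)) == to_punycode(prepare_label(pasted)))
```

The converter behaves the same in both calls; only the data handed to it 
differs.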
>
>> Thanks.
>>
>> Don
>>
>> Don Hollander
>> Universal Acceptance Steering Group
>> Skype: don_hollander
>>
>>
>>
>
> -- 
>      --Jim DeLaHunt,jdlh at jdlh.com      http://blog.jdlh.com/  (http://jdlh.com/)
>        multilingual websites consultant
>
>        355-1027 Davie St, Vancouver BC V6E 4L2, Canada
>           Canada mobile +1-604-376-8953
