[UA-discuss] Punycode Converters

Fri Nov 3 19:53:30 UTC 2017

Don:

In talking about punycode converters that "produce bad results", we 
probably have to distinguish between a narrow, technical view of "bad 
results", and a more system-level, user view of "bad results". Which did 
the UASG Workshop discussion refer to?

Specifically, to your questions,

On 2017-11-03 09:42, Don Hollander wrote:
> During the UASG Workshop in Abu Dhabi there was a brief discussion about Punycode converters.
>
> 1)	Is anyone aware of any punycode converters (particularly in libraries) that produce bad results?
As a software engineer, I'm confident that in the narrow technical 
sense, many punycode converters produce bad results. In other words, 
they probably have bugs.  Most software does.  Such bugs might be rare, however.

Also, I'm confident that many apps or systems which use 
internationalised domain names do the conversion to and from A-Label 
form (punycode conversion) wrong, even if the libraries they use behave 
correctly. This would be due to bugs in how the app or system uses the 
library.
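
A sketch of one such usage bug, in Python and using only its built-in 
"punycode" codec. The codec works on a single label, so an app that hands 
it a whole domain name gets a wrong answer from a correct library. The 
function names are hypothetical, and this skips the rest of IDNA 
processing (case mapping, validity checks).

```python
ACE_PREFIX = "xn--"

def to_a_label_per_label(domain):
    """Convert each non-ASCII label separately (the intended usage)."""
    out = []
    for label in domain.split("."):
        if label.isascii():
            out.append(label)  # ASCII labels pass through unchanged
        else:
            out.append(ACE_PREFIX + label.encode("punycode").decode("ascii"))
    return ".".join(out)

def to_a_label_whole_domain(domain):
    """Buggy usage: treats the dots as ordinary characters of one label."""
    return ACE_PREFIX + domain.encode("punycode").decode("ascii")

print(to_a_label_per_label("bücher.example"))     # xn--bcher-kva.example
print(to_a_label_whole_domain("bücher.example"))  # a different, unusable string
```
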
> 2)	Is there a test suite that can be used to test Punycode converters?
In the narrow, technical sense, our UASG018 /Programming Languages 
Evaluation Criteria/ document is a test suite, or at least instructions 
on how to construct a test suite. The obvious next step for UASG018 
is to implement actual test suites, that is, runnable test code which 
exercises the library's Punycode conversion functionality (among other 
things).
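
To give a flavour of what runnable test code could look like, here is a 
minimal sketch in Python that checks the built-in "punycode" codec 
against a few known-answer pairs in both directions. The pairs are 
well-known labels I picked for illustration, not vectors taken from 
UASG018, and check_vectors is a hypothetical name.

```python
# Known-answer pairs: (Unicode label, Punycode body without the "xn--" prefix).
VECTORS = [
    ("bücher", "bcher-kva"),
    ("münchen", "mnchen-3ya"),
    ("中国", "fiqs8s"),
    ("рф", "p1ai"),
]

def check_vectors(vectors=VECTORS):
    """Return a list of failure messages; an empty list means all passed."""
    failures = []
    for unicode_label, expected in vectors:
        # Encoding direction: Unicode label to Punycode.
        encoded = unicode_label.encode("punycode").decode("ascii")
        if encoded != expected:
            failures.append(f"encode {unicode_label!r}: got {encoded!r}")
        # Decoding direction: Punycode back to the Unicode label.
        decoded = expected.encode("ascii").decode("punycode")
        if decoded != unicode_label:
            failures.append(f"decode {expected!r}: got {decoded!r}")
    return failures

print(check_vectors() or "all vectors passed")
```

A real suite would swap in the library under test and a much larger set 
of vectors, including malformed input that the decoder should reject.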

In the system-level, user view, our other evaluation activities would be 
that "test suite": for instance, the /Evaluation of UA Readiness of 
Popular Websites/, the /Universal Acceptance of Popular Browsers 
(UASG016)/, etc.
> 3)	Would the source of input (typed, cut/paste, input from a data file) make any difference?   This probably has to do with RTL scripts

In the narrow, technical sense, the source of input should make no 
difference at all. The Punycode conversion algorithm doesn't depend on 
the source of input.  It starts with a sequence of data, and the source 
of that data is not material.
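
A small Python sketch of that point, assuming the built-in "punycode" 
codec stands in for the converter: the same sequence of code points gives 
the same output whether it comes from a literal, from escape sequences, or 
from UTF-8 bytes such as a data file would hold. The in-memory buffer 
simulates the file, and encode_label is a hypothetical helper.

```python
import io

def encode_label(label):
    """Punycode-encode one label; the result is plain ASCII."""
    return label.encode("punycode").decode("ascii")

typed = "bücher"                        # as a literal
escaped = "b\u00fccher"                 # same code points, written as escapes
data_file = io.BytesIO(b"b\xc3\xbccher\n")  # same label as UTF-8 bytes
from_file = data_file.read().decode("utf-8").strip()

# All three are the same sequence of code points, so the results match.
results = {encode_label(s) for s in (typed, escaped, from_file)}
print(results)  # {'bcher-kva'}
```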

In the system-level, user view, the source of input might well make a 
difference. I would expect any difference to come from how the app 
handles the data before it calls the library. When the user selects a 
domain name, does the app select all the necessary characters? Does the 
app implement the Unicode bidi algorithm correctly, for text with both 
right-to-left and left-to-right components? Does the app pass the domain 
name correctly to the library? And so on.
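
As an illustration of how pasted input can differ from typed input, here 
is a sketch in Python. It assumes the pasted label picked up a 
right-to-left mark and stray whitespace from the surrounding text, and 
that the "ü" arrived as "u" plus a combining diaeresis. Dropping the 
directional controls is just one possible policy (an app might prefer to 
reject such input), and the helper names are hypothetical.

```python
import unicodedata

# Directional formatting characters that can ride along with copied text.
BIDI_CONTROLS = frozenset(
    "\u200e\u200f\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069"
)

def clean_pasted_label(raw):
    """Drop directional controls, trim whitespace, and normalize to NFC."""
    without_controls = "".join(ch for ch in raw if ch not in BIDI_CONTROLS)
    return unicodedata.normalize("NFC", without_controls.strip())

def encode_label(label):
    """Punycode-encode one label using Python's built-in codec."""
    return label.encode("punycode").decode("ascii")

pasted = " \u200fbu\u0308cher\n"  # looks like "bücher" on screen

print(encode_label(clean_pasted_label(pasted)))  # bcher-kva
# Without the clean-up, the library is given different code points and
# correctly returns a different result.
print(encode_label(pasted.strip()) == "bcher-kva")  # False
```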

> Thanks.
>
> Don
>
> Don Hollander
> Universal Acceptance Steering Group
> Skype: don_hollander
>
>
>

-- 
     --Jim DeLaHunt, jdlh at jdlh.com     http://blog.jdlh.com/ (http://jdlh.com/)
       multilingual websites consultant

       355-1027 Davie St, Vancouver BC V6E 4L2, Canada
          Canada mobile +1-604-376-8953
