[UA-discuss] Progress on HTML and email...

Mark Svancarek marksv at microsoft.com
Tue Nov 14 15:49:04 UTC 2017


Shawn is on the DL, but adding him explicitly for clarification.

-----Original Message-----
From: UA-discuss [mailto:ua-discuss-bounces at icann.org] On Behalf Of Andrew Sullivan
Sent: Monday, November 13, 2017 4:29 PM
To: ua-discuss at icann.org
Subject: Re: [UA-discuss] Progress on HTML and email...

On Mon, Nov 13, 2017 at 04:23:29PM +0000, Mark Svancarek wrote:
> My interpretation is that the user is a human who must enter a string of text into a web form, where it is cast as type Email (which can subsequently be converted into ULABELs if the typed-in string includes ALABELs).
> 

No, I don't think that's it.  This is the specification for HTML, not for the UI.  The user agent can do transformations.  So I _think_ it means that an input element of type email, when it is _sent_ as input, has to be in this form; but the input method could be different, and the user agent could transform Unicode user input (which could be in any form, recall) into a valid U-label/A-label pair before it becomes HTML form input.  (One feels the need for another word for "stuff that comes from the user in the UI" vs "stuff that ends up in the form as 'input' formally so defined".  There may be a term of art already in the specifications for this, but I'm not going to dig it out just now.)

For the purposes of wire transmission and storage the server part is in A-labels, but for the purposes of display it is in U-labels.  Presumably, for the purposes of input it is whatever the user might input.  After all, Windows doesn't even use UTF-8 input natively, so it would literally be impossible for a Windows user to input correct UTF-8 at all.  We probably need someone who is working directly on browser code to say more about how this works in practice.  Maybe Shawn Steele knows?
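
To make the distinction between "what the user typed" and "what goes on the wire" concrete, here is a minimal sketch of the kind of transformation a user agent might apply.  It is an illustration only, not what any browser actually does.  The function names to_wire_form and to_display_form are hypothetical, and the sketch assumes Python's built-in "idna" codec, which implements IDNA2003 (RFC 3490) rather than IDNA2008.  The local part is passed through untouched.

```python
def to_wire_form(address: str) -> str:
    """Return the address with its domain converted to A-labels.

    Hypothetical helper: only the part after the last "@" is transformed.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # Assumption: the built-in "idna" codec (IDNA2003) stands in for
    # whatever IDNA processing a real user agent would apply.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


def to_display_form(address: str) -> str:
    """Return the address with its domain converted back to U-labels."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    unicode_domain = domain.encode("ascii").decode("idna")
    return f"{local}@{unicode_domain}"


wire = to_wire_form("user@bücher.example")
display = to_display_form(wire)
print(wire)     # user@xn--bcher-kva.example
print(display)  # user@bücher.example
```

The point of the sketch is only that the form the user sees and the form that is submitted need not be the same string, so long as something in between performs the conversion.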

I suspect this is slightly more obscure in the specification than I at least would like because of some of the WHATWG/W3C politics around HTML5.  (Some of the principals in WHATWG don't believe that IDNA2008 is a thing.  I will leave divining the consequences of using an IDNA specification that does not have a perfect 1:1 A-label/U-label mapping as an exercise for the reader, but I note that IDNA2008 doesn't solve the need for mappings: upper case characters aren't allowed in IDNA2008 U-labels.)
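
As a small illustration of what a mapping that is not 1:1 looks like in practice, the sketch below feeds three differently cased inputs through Python's built-in "idna" codec.  That codec implements IDNA2003, which case-folds during processing, so it is an assumption standing in for whichever mapping a given implementation applies.  The helper names to_a_label and to_u_label are hypothetical.

```python
def to_a_label(label: str) -> str:
    """Encode one label with the built-in IDNA2003 codec (an assumption)."""
    return label.encode("idna").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Decode an A-label back to its Unicode form."""
    return a_label.encode("ascii").decode("idna")


user_inputs = ["bücher", "Bücher", "BÜCHER"]

# All three inputs collapse to a single A-label, because the codec
# lower-cases the label before applying Punycode.
a_labels = {to_a_label(s) for s in user_inputs}

# Decoding cannot recover the original casing: only one form comes back.
round_tripped = {to_u_label(a) for a in a_labels}

print(a_labels)       # {'xn--bcher-kva'}
print(round_tripped)  # {'bücher'}
```

Three distinct user inputs, one A-label, one string on the way back: whatever the user typed in upper case has to be mapped somewhere before it can be a valid U-label.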

Best regards,

A

--
Andrew Sullivan
ajs at anvilwalrusden.com

