[UA-discuss] Progress on HTML and email...

Andrew Sullivan ajs at anvilwalrusden.com
Tue Nov 14 00:28:47 UTC 2017


On Mon, Nov 13, 2017 at 04:23:29PM +0000, Mark Svancarek wrote:
> My interpretation is that the user is a human who must enter a string of text into a web form, where it is cast as type Email (which can subsequently be converted into ULABELs if the typed-in string includes ALABELs).
> 

No, I don't think that's it.  This is the specification for HTML, not
for the UI.  The user agent can do transformations.  So I _think_ it
means that an email type of an input element, when it is _sent_ as
input, has to be in this form; but that the input method could be
different and the user agent could do a transformation on it so that
Unicode user input (which could be in any form, recall) is transformed
into a valid U-label/A-label pair before it becomes HTML form input.
(One feels the need for another word for "stuff that comes from the
user in the UI" vs "stuff that ends up in the form as 'input' formally
so defined".  There may be a term of art already in the specifications
for this, but I'm not going to dig it out just now.)  For the purposes
of wire transmission and storage the server-part is A-labels, but for
the purposes of display they're U-labels.  Presumably, for the
purposes of input they're whatever the user might input.  After all,
Windows doesn't even use UTF-8 input natively, so it would literally
be impossible for a Windows user to input correct UTF-8 at all.  We
probably need someone who is working directly on browser code to say
more about how this works in practice.  Maybe Shawn Steele knows?
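
As a rough illustration of the kind of transformation described above,
here is a minimal Python sketch.  It is not taken from the HTML
specification or from any browser.  It assumes the standard library's
"idna" codec, which implements IDNA2003 rather than IDNA2008, and the
function name split_and_convert is hypothetical.  The local part is
left untouched.

```python
import unicodedata


def split_and_convert(user_text):
    """Return (wire_form, display_form) for an address typed by a user.

    Hypothetical helper.  Uses Python's built-in "idna" codec, which is
    IDNA2003 (RFC 3490), not IDNA2008, so treat it as a stand-in only.
    """
    local, sep, domain = user_text.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected local-part@domain")
    # Normalize whatever the input method produced before converting.
    domain = unicodedata.normalize("NFC", domain)
    # A-labels: the form used for wire transmission and storage.
    a_labels = domain.encode("idna").decode("ascii")
    # U-labels: the form used for display.
    u_labels = a_labels.encode("ascii").decode("idna")
    return local + "@" + a_labels, local + "@" + u_labels
```

Calling split_and_convert("user@Bücher.example") returns the A-label
form for the wire and the lower-cased U-label form for display, whether
the "ü" was typed precomposed or as "u" plus a combining diaeresis.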

I suspect this is slightly more obscure in the specification than I at
least would like because of some of the WHATWG/W3C politics around
HTML5.  (Some of the principals in WHATWG don't believe that IDNA2008
is a thing.  I will leave divining the consequences of using an IDNA
specification that does not have a perfect 1:1 A-label/U-label mapping
as an exercise for the reader, but I note that IDNA2008 doesn't solve
the need for mappings: upper case characters aren't allowed in
IDNA2008 U-labels.)
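
To make the point about mappings concrete: a label survives the trip
through its A-label unchanged only if it is already in mapped form.  A
small sketch, again using Python's IDNA2003 "idna" codec as a stand-in
(an assumption on my part; IDNA2008 proper would reject the upper-case
label rather than map it), with a hypothetical helper name:

```python
def round_trips(label):
    """True if label -> A-label -> U-label gives back the same string.

    Hypothetical helper built on Python's IDNA2003 "idna" codec.
    """
    a_label = label.encode("idna")
    return a_label.decode("idna") == label


print(round_trips("bücher"))  # already lower case
print(round_trips("Bücher"))  # upper case is mapped away on encoding
```

The first label comes back as typed; the second does not, because the
upper-case "B" is folded before the A-label is produced.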

Best regards,

A

-- 
Andrew Sullivan
ajs at anvilwalrusden.com

