<div dir="ltr">It's also worth remembering that "works according to the universe at the moment the RegExp was written" is how we got into a lot of today's UA mess in the first place. Even if a prohibition on dotless domains or some other rule is in place today, I'd want to avoid encoding it into a regexp that we tell people to use: the rules may change again, and I don't want another group following along in our wake 10 years from now trying to undo the code that we told everyone to write.<div><br><div>Jordyn</div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Sep 14, 2017 at 1:27 PM, Rubens Kuhl <span dir="ltr"><<a href="mailto:rubensk@nic.br" target="_blank">rubensk@nic.br</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
The BiDi issue suggests to me that even enforcing the non-dotless rule is too much for a simple regex, as shabaka.example@don is a valid Arabic EAI address, while the same ASCII combination is not valid even if a .don TLD gets delegated.<br>
[non-empty]@[non-empty] looks better to me.<br>
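A minimal sketch of that [non-empty]@[non-empty] check in JavaScript, for illustration only. The function name looksLikeEmail is hypothetical and does not come from the thread:<br>

```javascript
// Hypothetical helper: accept any input of the form [non-empty]@[non-empty].
// It makes no assumption about which side of the "@" holds the domain.
function looksLikeEmail(input) {
  // First "@" that has at least one character before it.
  const at = input.indexOf("@", 1);
  // It must also have at least one character after it.
  return at !== -1 && at < input.length - 1;
}

console.log(looksLikeEmail("mailbox@domainname.tld")); // true
console.log(looksLikeEmail("shabaka.example@don"));    // true
console.log(looksLikeEmail("@domainname.tld"));        // false
console.log(looksLikeEmail("mailbox@"));               // false
```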
<span class="HOEnZb"><font color="#888888"><br>
<br>
Rubens<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
> On 14 Sep 2017, at 13:58:000, Don Hollander <<a href="mailto:don.hollander@icann.org">don.hollander@icann.org</a>> wrote:<br>
><br>
> Thanks Jim.<br>
><br>
> The BiDi issue, with raw data input, is which side the domain is on.<br>
><br>
> Usually you’ll encounter mailbox@domainname.tld<br>
><br>
> But in Arabic or Hebrew you’ll encounter tld.domainname@mailbox<br>
><br>
> Don<br>
><br>
><br>
>> On 15/09/2017, at 3:44 AM, Jim Hague <<a href="mailto:jim@sinodun.com">jim@sinodun.com</a>> wrote:<br>
>><br>
>> On 12/09/2017 19:44, Don Hollander wrote:<br>
>>> One RegEx has stood out as being simple and correct. I’d like the UASG<br>
>>> to consider recommending this in our documentation. Toward that end,<br>
>>> this thread is for discussion.<br>
>>><br>
>>> /^.+@(?:[^.]+\.)+(?:[^.]{2,})$/<br>
>>><br>
>>> Regular expression check in Javascript. This accepts any Unicode<br>
>>> characters, only insisting that the domain must have more than one label<br>
>>> and the TLD is 2 characters or longer.<br>
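For illustration, a short JavaScript sketch of how the expression above behaves on a few sample inputs. The sample addresses and the variable name proposed are made up for this example:<br>

```javascript
// The expression quoted above, written as a JavaScript regex literal.
const proposed = /^.+@(?:[^.]+\.)+(?:[^.]{2,})$/;

console.log(proposed.test("user@example.com"));    // true
console.log(proposed.test("用户@例子.中国"));        // true: non-ASCII is accepted
console.log(proposed.test("user@example.c"));      // false: last label is 1 character
console.log(proposed.test("user@localhost"));      // false: domain has only one label
console.log(proposed.test("shabaka.example@don")); // false: no "." after the "@"
```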
>><br>
>> Note that this is in the context of an in-browser check. I only examined a<br>
>> small random subset of the sites surveyed in the main evaluation, and<br>
>> obviously without access to server code could only examine client-side<br>
>> operations. In all the sites I examined, the only check performed was<br>
>> against one (or in one case two) regular expression(s). No decomposition<br>
>> of the email address was attempted, and certainly no translation of the<br>
>> domain to Punycode.<br>
>><br>
>> It was in that context that I highlighted the above regex, on the basis<br>
>> that it's probably the only sensible option to suggest to organisations<br>
>> as a low-impact UA improvement (I won't say fix) at the moment. If a<br>
>> future evaluation exercise verifies that an existing Javascript module<br>
>> does the right thing, that would be a better alternative, but that would<br>
>> involve more substantial modifications to site code.<br>
>><br>
>> I agree that modifying it to allow 1 character TLDs would be sensible.<br>
>><br>
>> I also agree with the page referenced at the start of the thread (which<br>
>> I read before working on the report) that just checking for '@' is about<br>
>> all one should attempt, certainly client-side.<br>
>><br>
>> Turning again to the above regex, of course, being a proposed regex for<br>
>> validating email addresses, it's got an obvious deficiency. It needs to<br>
>> add support for other label separators (e.g. open dot).<br>
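One possible way to extend the expression along these lines, sketched in JavaScript. It assumes the extra separators are the three that IDNA2003 (RFC 3490) treats as equivalent to the full stop (U+3002, U+FF0E and U+FF61), and it also relaxes the TLD to one character as suggested above. The names DOTS and relaxed are hypothetical:<br>

```javascript
// Assumption: label separators are "." plus the three alternatives that
// RFC 3490 lists (ideographic, fullwidth and halfwidth ideographic full stop).
const DOTS = ".\u3002\uFF0E\uFF61";

// Same shape as the proposed regex, but any separator in DOTS may divide
// labels, and the final label only needs to be non-empty.
const relaxed = new RegExp(`^.+@(?:[^${DOTS}]+[${DOTS}])+[^${DOTS}]+$`);

console.log(relaxed.test("user@example\u3002jp")); // true: U+3002 as separator
console.log(relaxed.test("user@example.c"));       // true: 1-character TLD
console.log(relaxed.test("user@localhost"));       // false: still needs two labels
console.log(relaxed.test("user@example."));        // false: empty final label
```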
>><br>
>> Mark Svancarek raised the excellent point of bidi in the domain.<br>
>> Personally I'm not confident I understand the bidi rules. But if the<br>
>> regex requires at least one label separator character in the domain and<br>
>> non-empty labels, will that work, given that if the regex allows 1<br>
>> character TLDs then a valid TLD is simply a non-empty label?<br>
>> --<br>
>> Jim Hague - <a href="mailto:jim@sinodun.com">jim@sinodun.com</a> Never trust a computer you can't lift.<br>
><br>
> Don Hollander<br>
> Universal Acceptance Steering Group<br>
> Skype: don_hollander<br>
><br>
><br>
><br>
<br>
</div></div></blockquote></div><br></div>