<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">I totally agree with Jordyn&nbsp;<div class="">and Mark: "Just capture the string and send a test message."</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Sep 14, 2017, at 10:38 AM, Jordyn Buchanan via UA-discuss &lt;<a href="mailto:ua-discuss@icann.org" class="">ua-discuss@icann.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Also worth remembering that "works according to the universe at the moment the RegExp was written" is how we got into a lot of today's UA mess in the first place.&nbsp; Even if a rule about dotless domains (or anything else) is in place today, I'd want to avoid encoding it into a regexp that we tell people to use, since the rules may change again, and I don't want another group following in our wake 10 years from now trying to undo the code that we told everyone to write.<div class=""><br class=""><div class="">Jordyn</div></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Thu, Sep 14, 2017 at 1:27 PM, Rubens Kuhl <span dir="ltr" class="">&lt;<a href="mailto:rubensk@nic.br" target="_blank" class="">rubensk@nic.br</a>&gt;</span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br class="">

The BiDi issue suggests to me that even enforcing the non-dotless rule is too much for a simple regex, as shabaka.example@don is a valid Arabic EAI, while the same ASCII combination is not valid even if a .don TLD gets delegated.<br class="">

[non-empty]@[non-empty] looks better to me.<br class="">
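<div class="">One way to read the [non-empty]@[non-empty] suggestion as a JavaScript check, purely as a sketch. The function name looksLikeAddress and the sample inputs are assumptions for illustration, not something proposed in this thread.</div>
<pre class="">
```javascript
// Sketch only: accept anything shaped like [non-empty]@[non-empty].
// looksLikeAddress is a hypothetical name, not taken from the thread.
function looksLikeAddress(input) {
  // One or more characters, an "@", then one or more characters.
  // No assumption is made about which side holds the domain.
  return /^.+@.+$/u.test(input);
}

console.log(looksLikeAddress("shabaka.example@don"));    // true
console.log(looksLikeAddress("mailbox@domainname.tld")); // true
console.log(looksLikeAddress("@domainname.tld"));        // false
console.log(looksLikeAddress("mailbox@"));               // false
```
</pre>
<div class="">A check this loose only catches obvious typos; whether the address really works would still have to be confirmed by sending a test message.</div>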

<span class="HOEnZb"><font color="#888888" class=""><br class="">

<br class="">

Rubens<br class="">

</font></span><div class="HOEnZb"><div class="h5"><br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

&gt; On Sep 14, 2017, at 13:58, Don Hollander &lt;<a href="mailto:don.hollander@icann.org" class="">don.hollander@icann.org</a>&gt; wrote:<br class="">

&gt;<br class="">

&gt; Thanks Jim.<br class="">

&gt;<br class="">

&gt; The BiDi issue, with raw data input, is which side of the "@" the domain is on.<br class="">

&gt;<br class="">

&gt; Usually you’ll encounter <a href="mailto:mailbox@domainname.tld" class="">mailbox@domainname.tld</a><br class="">

&gt;<br class="">

&gt; But in Arabic or Hebrew you’ll encounter tld.domainname@mailbox<br class="">

&gt;<br class="">

&gt; Don<br class="">

&gt;<br class="">

&gt;<br class="">

&gt;&gt; On 15/09/2017, at 3:44 AM, Jim Hague &lt;<a href="mailto:jim@sinodun.com" class="">jim@sinodun.com</a>&gt; wrote:<br class="">

&gt;&gt;<br class="">

&gt;&gt; On 12/09/2017 19:44, Don Hollander wrote:<br class="">

&gt;&gt;&gt; One RegEx has stood out as being simple and correct.&nbsp; &nbsp;I’d like the UASG<br class="">

&gt;&gt;&gt; to consider recommending this in our documentation.&nbsp; &nbsp;Toward that end,<br class="">

&gt;&gt;&gt; this thread is for discussion.<br class="">

&gt;&gt;&gt;<br class="">

&gt;&gt;&gt; /^.+@(?:[^.]+\.)+(?:[^.]{2,})$/<br class="">

&gt;&gt;&gt;<br class="">

&gt;&gt;&gt; Regular expression check in Javascript. This accepts any Unicode<br class="">

&gt;&gt;&gt; characters, only insisting that the domain must have more than one label<br class="">

&gt;&gt;&gt; and the TLD is 2 characters or longer.<br class="">
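<div class="">To make the behaviour described above concrete, here is a small sketch that runs the pattern against a few inputs. The helper name matchesQuotedPattern and the sample addresses are illustrative assumptions, not data from the evaluation.</div>
<pre class="">
```javascript
// The pattern from the message above, written as a complete regex literal.
const quotedPattern = /^.+@(?:[^.]+\.)+(?:[^.]{2,})$/;

// Hypothetical helper name, used only for this illustration.
function matchesQuotedPattern(address) {
  return quotedPattern.test(address);
}

console.log(matchesQuotedPattern("mailbox@domainname.tld")); // true
console.log(matchesQuotedPattern("用户@例子.中国"));          // true: non-ASCII is accepted
console.log(matchesQuotedPattern("mailbox@domainname.t"));   // false: TLD has 1 character
console.log(matchesQuotedPattern("shabaka.example@don"));    // false: no dot after the "@"
```
</pre>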

&gt;&gt;<br class="">

&gt;&gt; Note that this is in the context of an in-browser check. I only examined a<br class="">

&gt;&gt; small random subset of the sites surveyed in the main evaluation, and<br class="">

&gt;&gt; obviously without access to server code could only examine client-side<br class="">

&gt;&gt; operations. In all the sites I examined, the only check performed was<br class="">

&gt;&gt; against one (or in one case two) regular expression(s). No decomposition<br class="">

&gt;&gt; of the email address was attempted, and certainly no translation of the<br class="">

&gt;&gt; domain to Punycode.<br class="">

&gt;&gt;<br class="">

&gt;&gt; It was in that context that I highlighted the above regex, on the basis<br class="">

&gt;&gt; that it's probably the only sensible option to suggest to organisations<br class="">

&gt;&gt; as a low-impact UA improvement (I won't say fix) at the moment. If a<br class="">

&gt;&gt; future evaluation exercise verifies that an existing Javascript module<br class="">

&gt;&gt; does the right thing, that would be a better alternative, but that would<br class="">

&gt;&gt; involve more substantial modifications to site code.<br class="">

&gt;&gt;<br class="">

&gt;&gt; I agree that modifying it to allow 1 character TLDs would be sensible.<br class="">

&gt;&gt;<br class="">

&gt;&gt; I also agree with the page referenced at the start of the thread (which<br class="">

&gt;&gt; I read before working on the report) that just checking for '@' is about<br class="">

&gt;&gt; all one should attempt, certainly client-side.<br class="">

&gt;&gt;<br class="">

&gt;&gt; Turning again to the above regex, of course, being a proposed regex for<br class="">

&gt;&gt; validating email addresses, it's got an obvious deficiency. It needs to<br class="">

&gt;&gt; add support for other label separators (e.g. open dot).<br class="">
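<div class="">A possible shape for the two changes mentioned above (1-character TLDs and additional label separators), offered as a sketch rather than a recommendation. It assumes that "other label separators" means the dot variants listed in RFC 3490 section 3.1 (U+002E, U+3002, U+FF0E, U+FF61); the thread does not enumerate them, and the name matchesRelaxedPattern is hypothetical. It does not address the BiDi ordering question.</div>
<pre class="">
```javascript
// Assumed separator set: full stop, ideographic full stop,
// fullwidth full stop, halfwidth ideographic full stop.
// Requires one or more "label + separator" pairs, then a final non-empty label.
const relaxedPattern =
  /^.+@(?:[^.\u3002\uFF0E\uFF61]+[.\u3002\uFF0E\uFF61])+[^.\u3002\uFF0E\uFF61]+$/u;

// Hypothetical helper name, used only for this illustration.
function matchesRelaxedPattern(address) {
  return relaxedPattern.test(address);
}

console.log(matchesRelaxedPattern("mailbox@domainname.t"));  // true: 1-character TLD
console.log(matchesRelaxedPattern("用户@例子\u3002中国"));     // true: ideographic full stop
console.log(matchesRelaxedPattern("mailbox@domainname"));    // false: no separator
console.log(matchesRelaxedPattern("mailbox@domainname."));   // false: empty last label
```
</pre>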

&gt;&gt;<br class="">

&gt;&gt; Mark Svancarek raised the excellent point of bidi in the domain.<br class="">

&gt;&gt; Personally I'm not confident I understand the bidi rules. But if the<br class="">

&gt;&gt; regex requires at least one label separator character in the domain and<br class="">

&gt;&gt; non-empty labels, will that work, given that if the regex allows 1<br class="">

&gt;&gt; character TLDs then a valid TLD is simply a non-empty label?<br class="">

&gt;&gt; --<br class="">

&gt;&gt; Jim Hague - <a href="mailto:jim@sinodun.com" class="">jim@sinodun.com</a>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Never trust a computer you can't lift.<br class="">

&gt;<br class="">

&gt; Don Hollander<br class="">

&gt; Universal Acceptance Steering Group<br class="">

&gt; Skype: don_hollander<br class="">

&gt;<br class="">

&gt;<br class="">

&gt;<br class="">

<br class="">

</div></div></blockquote></div><br class=""></div>

</div></blockquote></div><br class=""></div></body></html>