[UA-discuss] Regular Expression

Paul Stahura paul at donuts.email
Thu Sep 14 17:46:27 UTC 2017


I totally agree with Jordyn and Mark: "Just capture the string and send a test message."

> On Sep 14, 2017, at 10:38 AM, Jordyn Buchanan via UA-discuss <ua-discuss at icann.org> wrote:
> 
> Also worth remembering that "works according to the universe at the moment the RegExp was written" is how we got into a lot of today's UA mess in the first place.  Just because dotless domains or some other rule is in place today, I'd want to avoid encoding them into a regexp that we tell people to use since the rules may change again and I don't want to have another group following along in our wake 10 years from now trying to undo the code that we told everyone to write.
> 
> Jordyn
> 
> On Thu, Sep 14, 2017 at 1:27 PM, Rubens Kuhl <rubensk at nic.br> wrote:
> 
> The BiDi issue suggests to me that even enforcing the non-dotless rule is too much for a simple regex, as shabaka.example at don is a valid Arabic EAI, while the same ASCII combination is not valid even if a .don TLD gets delegated.
> [non-empty]@[non-empty] looks better to me.
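
A minimal sketch of the [non-empty]@[non-empty] check suggested above, in JavaScript. The function name looksLikeEmail is hypothetical, and splitting on the last "@" is an assumption made here for illustration; the thread does not say which "@" to split on.

```javascript
// Hypothetical helper: accept any string that has at least one character
// before the last "@" and at least one character after it.
// No assumptions are made about dots, TLD length, or script.
function looksLikeEmail(input) {
  const at = input.lastIndexOf("@");
  return at > 0 && at < input.length - 1;
}

console.log(looksLikeEmail("mailbox@domainname.tld")); // true
console.log(looksLikeEmail("mailbox@tld"));            // true
console.log(looksLikeEmail("@domainname.tld"));        // false
console.log(looksLikeEmail("mailbox@"));               // false
console.log(looksLikeEmail("no-at-sign"));             // false
```

A check this loose only catches obvious typos; whether the address really works would still be settled by sending a test message.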
> 
> 
> Rubens
> 
> > On Sep 14, 2017, at 13:58:000, Don Hollander <don.hollander at icann.org> wrote:
> >
> > Thanks Jim.
> >
> > The BiDi issue, with raw data input, is which side the domain is on.
> >
> > Usually you’ll encounter mailbox at domainname.tld
> >
> > But in Arabic or Hebrew you’ll encounter tld.domainname at mailbox
> >
> > Don
> >
> >
> >> On 15/09/2017, at 3:44 AM, Jim Hague <jim at sinodun.com> wrote:
> >>
> >> On 12/09/2017 19:44, Don Hollander wrote:
> >>> One RegEx has stood out as being simple and correct.   I’d like the UASG
> >>> to consider recommending this in our documentation.   Toward that end,
> >>> this thread is for discussion.
> >>>
> >>> /^.+@(?:[^.]+\.)+(?:[^.]{2,})$/
> >>>
> >>> Regular expression check in Javascript. This accepts any Unicode
> >>> characters, only insisting that the domain must have more than one label
> >>> and the TLD is 2 characters or longer.
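
To make the behaviour described above concrete, here is a small JavaScript sketch that runs the quoted pattern against a few sample addresses. The pattern is written with its closing "/" so that it parses as a regex literal; the names proposedPattern and passesProposedCheck, and the sample addresses, are made up for illustration.

```javascript
// The pattern from the thread, as a complete JavaScript regex literal.
const proposedPattern = /^.+@(?:[^.]+\.)+(?:[^.]{2,})$/;

// Hypothetical helper name, for illustration only.
function passesProposedCheck(address) {
  return proposedPattern.test(address);
}

console.log(passesProposedCheck("user@example.com")); // true
console.log(passesProposedCheck("用户@例子.中国"));     // true: any Unicode characters are accepted
console.log(passesProposedCheck("user@localhost"));   // false: the domain needs more than one label
console.log(passesProposedCheck("user@example.c"));   // false: the last label must be 2+ characters
```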
> >>
> >> Note that this is in the context of an in-browser check. I only examined a
> >> small random subset of the sites surveyed in the main evaluation, and
> >> obviously without access to server code could only examine client-side
> >> operations. In all the sites I examined, the only check performed was
> >> against one (or in one case two) regular expression(s). No decomposition
> >> of the email address was attempted, and certainly no translation of the
> >> domain to Punycode.
> >>
> >> It was in that context that I highlighted the above regex, on the basis
> >> that it's probably the only sensible option to suggest to organisations
> >> as a low-impact UA improvement (I won't say fix) at the moment. If a
> >> future evaluation exercise verifies that an existing Javascript module
> >> does the right thing, that would be a better alternative, but that would
> >> involve more substantial modifications to site code.
> >>
> >> I agree that modifying it to allow 1 character TLDs would be sensible.
> >>
> >> I also agree with the page referenced at the start of the thread (which
> >> I read before working on the report) that just checking for '@' is about
> >> all one should attempt, certainly client-side.
> >>
> >> Turning again to the above regex, of course, being a proposed regex for
> >> validating email addresses, it's got an obvious deficiency. It needs to
> >> add support for other label separators (e.g. open dot).
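
One possible way to fold in both changes discussed above (1-character TLDs and additional label separators) is sketched below in JavaScript. It assumes that "other label separators" means the three extra dots that RFC 3490 treats as equivalent to "." (U+3002, U+FF0E, U+FF61); the thread itself only says "open dot", so that reading is a guess. The name relaxedPattern is hypothetical.

```javascript
// Assumption: the accepted label separators are ".", U+3002 (ideographic
// full stop), U+FF0E (fullwidth full stop) and U+FF61 (halfwidth
// ideographic full stop).
// The domain must contain at least one separator, and every label,
// including the last one (the TLD), must be non-empty.
const relaxedPattern =
  /^.+@(?:[^.\u3002\uFF0E\uFF61]+[.\u3002\uFF0E\uFF61])+[^.\u3002\uFF0E\uFF61]+$/;

console.log(relaxedPattern.test("user@example.c"));       // true: 1-character TLD
console.log(relaxedPattern.test("用户@例子\u3002中国"));    // true: U+3002 as separator
console.log(relaxedPattern.test("user@localhost"));       // false: no separator in the domain
console.log(relaxedPattern.test("user@example."));        // false: empty last label
```

This remains a client-side plausibility check only; it says nothing about whether the domain exists or the mailbox accepts mail.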
> >>
> >> Mark Svancarek raised the excellent point of bidi in the domain.
> >> Personally I'm not confident I understand the bidi rules. But if the
> >> regex requires at least one label separator character in the domain and
> >> non-empty labels, will that work, given that if the regex allows 1
> >> character TLDs then a valid TLD is simply a non-empty label?
> >> --
> >> Jim Hague - jim at sinodun.com          Never trust a computer you can't lift.
> >
> > Don Hollander
> > Universal Acceptance Steering Group
> > Skype: don_hollander
> >
> >
> >
> 
> 
