[UA-discuss] Regular Expression

Chaals McCathie Nevile chaals at yandex.ru
Tue Sep 12 22:01:07 UTC 2017


On Tue, 12 Sep 2017 22:43:09 +0200, Don Hollander
<don.hollander at icann.org> wrote:

> Thanks Rubens.  Which raises the question as to when the validation
> takes place.  Before or after a punycode transformation.

I would generally like validation to take place after punycode conversion,
first because there are strings that match the regex but do not meet
punycode constraints. Likewise, I agree with Rubens that assuming TLDs are
not domains, and that email must go to a subdomain, seems less than
prescient with hindsight.
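
As a rough illustration of the ordering described above, here is a minimal
sketch for Node.js. It applies the expression quoted later in this thread
and then runs the domain through url.domainToASCII. The helper name
checkAddress, the reason strings and the explicit length checks are
assumptions made for the example, not something proposed in the thread.

```javascript
const { domainToASCII } = require('node:url');

// The expression quoted later in this thread, with a closing "/" added.
const THREAD_REGEX = /^.+@(?:[^.]+\.)+(?:[^.]{2,})$/;

// Hypothetical helper (not from the thread): regex first, then punycode checks.
function checkAddress(address) {
  if (!THREAD_REGEX.test(address)) {
    return { ok: false, reason: 'regex' };
  }
  // Everything after the last "@" is treated as the domain.
  const domain = address.slice(address.lastIndexOf('@') + 1);
  // domainToASCII returns an empty string for a domain it cannot convert.
  const ascii = domainToASCII(domain);
  if (ascii === '') {
    return { ok: false, reason: 'conversion' };
  }
  // DNS limits: 63 octets per label, 253 for the whole name.
  const labels = ascii.split('.');
  if (ascii.length > 253 || labels.some((l) => l.length === 0 || l.length > 63)) {
    return { ok: false, reason: 'length' };
  }
  return { ok: true, domain: ascii };
}

console.log(checkAddress('user@español.com'));
console.log(checkAddress('user@xn--iñvalid.com'));
console.log(checkAddress('user@' + 'a'.repeat(64) + '.com'));
```

The second and third addresses match the regex but are rejected by the
later checks, which is the case for validating after conversion.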

> And David, thanks for the article.   The UASG has long advocated turning
> validation off - but very few active practitioners seem willing to think
> outside that box.

I'm not entirely convinced by that approach either.

I think there is value in validation - first, to determine whether an
email address is real - if it isn't, you are probably better off getting a
warning than trying to send to it.

Second, I find it very helpful, including as a protection against phishing
emails, to be told if an email is not recognised as a contact to whom I
have *sent* an email, which is a stricter validation check. Applications
that do that for me - especially for scripts I don't read fluently like
Chinese - are common, and I would be upset if they were to stop validating.

On the other hand, incorrect validation (e.g. of an address in a form, with
no punycode conversion run first and no reason not to accept an
internationalised email address) is clearly a bad idea, largely because it
fails to actually validate whether something is a valid email address.
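
To make the ordering problem concrete, here is a small Node.js sketch. The
pattern LEGACY_ASCII_REGEX and the helper validateAfterConversion are
hypothetical, chosen only to resemble the overly restrictive checks
discussed in this thread. Only url.domainToASCII is a real Node.js call.

```javascript
const { domainToASCII } = require('node:url');

// Hypothetical ASCII-only pattern of the overly restrictive kind; not from the thread.
const LEGACY_ASCII_REGEX = /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;

// Hypothetical helper: convert the domain to its ASCII form, then apply the pattern.
function validateAfterConversion(address) {
  const at = address.lastIndexOf('@');
  if (at < 1) return false; // no "@", or nothing before it
  const ascii = domainToASCII(address.slice(at + 1));
  if (ascii === '') return false; // the domain could not be converted
  return LEGACY_ASCII_REGEX.test(address.slice(0, at + 1) + ascii);
}

// The same address, checked without and with conversion run first.
console.log(LEGACY_ASCII_REGEX.test('user@español.com'));
console.log(validateAfterConversion('user@español.com'));
```

Conversion only helps with the domain. A pattern like this one would still
reject a non-ASCII local part, and an IDN TLD whose ASCII form contains
digits or hyphens.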

A given application or toolchain may be incapable of handling some valid
email addresses, but I think a campaign to convince developers to produce
a statement like "this application is second-rate and obsolete" would face
significant challenges. It may be worth considering whether to push for
such applications to state that they do not yet support the appropriate
standards...

cheers

Chaals

>
>
>
> D
>
>
>> On 13/09/2017, at 8:31 AM, Rubens Kuhl <rubensk at nic.br> wrote:
>>
>>
>>> On Sep 12, 2017, at 3:44 PM, Don Hollander <don.hollander at icann.org>  
>>> wrote:
>>>
>>> Please note that this is a Geeky post - so carry on if that’s not you.
>>>
>>>
>>> Email validation is an area where many websites fall short as we found
>>> in our study on Website UA Readiness (nearing publication)
>>>
>>> The technologies behind these websites generally use a Regular
>>> Expression as their first line of defence against rubbish data.
>>> The issue is that most of these RegExs are overly restrictive.
>>>
>>> As an appendix to the Website review, we looked at some of the
>>> technologies behind the websites to see if there were common
>>> denominators for good and bad experiences.
>>>
>>> One RegEx has stood out as being simple and correct.   I’d like the
>>> UASG to consider recommending this in our documentation.   Toward
>>> that end, this thread is for discussion.
>>>
>>> /^.+@(?:[^.]+\.)+(?:[^.]{2,})$/
>>>
>>>
>>> Regular expression check in Javascript. This accepts any Unicode
>>> characters, only insisting that the domain must have more than one
>>> label and the TLD is 2 characters or longer.
>>> Your thoughts?
>>
>> Single-character IDN TLDs for some scripts are being considered for
>> subsequent procedures, so I would think of 1 or more, to prevent the
>> same UA challenges that previous rounds' TLDs are suffering.
>>
>> Rubens
>>
>>
>
>
> Don Hollander
> Universal Acceptance Steering Group
> Skype: don_hollander
>
>
>
>
>
>



-- 
Chaals is Charles McCathie Nevile
find more at http://yandex.com

