[UA-discuss] interesting to note about emoji in mailbox name.

Andre Schappo A.Schappo at lboro.ac.uk
Mon Apr 15 12:24:25 UTC 2019

I have frequently thought that one of the reasons for the complexity of many standards/guidelines is that they encompass the whole of Unicode, so there are few constraints, and those constraints can be difficult to understand and agree upon.

I posit that mailbox names can be categorised such that each category is more constrained and its constraints are more easily understood.

A mail service provider could impose further constraints.

Categories could be based on writing system/orthography. So one could define Japanese, Korean, Thai ...etc... categories for mailbox names.

Letʼs take category Japanese: A generalised standard could, for example, include some "Common" characters as well as Han, Hiragana and Katakana (see http://unicode.org/Public/UCD/latest/ucd/Scripts.txt). A mail service provider could, for example, impose a further restriction by not allowing "Common" characters.
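A minimal Python sketch of such a Japanese category follows. The code point ranges are a simplified approximation of the Han, Hiragana and Katakana entries in Scripts.txt, not a full copy of them. The "Common" subset, the function name and the allow_common switch are all hypothetical choices made for illustration; they are not part of any standard.

```python
# Simplified ranges; a real implementation would be generated from Scripts.txt.
HAN = [(0x3400, 0x4DBF), (0x4E00, 0x9FFF)]
HIRAGANA = [(0x3041, 0x3096)]
KATAKANA = [(0x30A1, 0x30FA)]
# Hypothetical choice of "Common" characters: ASCII digits, full stop,
# and the prolonged sound mark (U+30FC), which Scripts.txt lists as Common.
COMMON = [(0x0030, 0x0039), (0x002E, 0x002E), (0x30FC, 0x30FC)]


def in_ranges(ch, ranges):
    """Return True if the character's code point falls in any (lo, hi) range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_japanese_mailbox(name, allow_common=True):
    """Check a mailbox name against the (assumed) Japanese category.

    allow_common=False models a provider that imposes the further
    restriction of disallowing "Common" characters.
    """
    allowed = HAN + HIRAGANA + KATAKANA
    if allow_common:
        allowed = allowed + COMMON
    return bool(name) and all(in_ranges(ch, allowed) for ch in name)
```

With these assumptions, is_japanese_mailbox("タロー") passes under the generalised rule but fails with allow_common=False, because the prolonged sound mark is a "Common" character.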

I give an example of Korean mailbox names at http://jsfiddle.net/coas/2uLhcfef. There I only allow Korean Hangul mailbox names with the provided Korean Hangul domain names.

...and... much more controversially one could define a Symbols category for mailbox names. Determining which symbols could/should be included in such a category would require a lot of research and consideration.

If I were a mail service provider, I would most likely not allow mixing of categories in mailbox names.
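As a rough sketch of a "no mixing" rule in Python: a name is accepted only if one single category covers every character in it. The category table, its deliberately narrow code point ranges and the function name are hypothetical and only illustrate the idea; real categories would need to be derived from Scripts.txt and agreed upon.

```python
# Hypothetical categories with simplified code point ranges.
CATEGORIES = {
    "Japanese": [(0x3041, 0x3096), (0x30A1, 0x30FA), (0x4E00, 0x9FFF)],
    "Korean": [(0xAC00, 0xD7A3)],  # precomposed Hangul syllables only
    "Thai": [(0x0E01, 0x0E3A), (0x0E40, 0x0E4E)],
}


def category_of(name):
    """Return the one category that covers every character, else None."""
    if not name:
        return None
    for label, ranges in CATEGORIES.items():
        if all(any(lo <= ord(ch) <= hi for lo, hi in ranges) for ch in name):
            return label
    return None  # mixed categories, or characters outside every category
```

Under these assumptions category_of("한국") gives "Korean", while a name that combines Hangul and Hiragana gives None and would be refused.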

André Schappo

On 13 Apr 2019, at 11:28, John Levine <john.levine at standcore.com> wrote:

In article <BYAPR21MB13171918C3D2AC0E8D177983D12F0 at BYAPR21MB1317.namprd21.prod.outlook.com> you write:
UASG has not endorsed emojis as part of mailbox names and I doubt that we ever would.  But as mentioned below, some mail systems will take a more liberal approach.

First, I have to say that I am dismayed to see that many in the UASG
do not know that mailboxes and domain names are different and always
have been.  This is an important difference, and it's discussed at
some length in UASG 012.  This would probably be a good time for
everyone who hasn't read that document to read it now, so at least we
agree on the underlying facts.

As several people have pointed out, there are practically no rules for
what characters are technically legal in mailbox names, but that doesn't
mean that in practice you can put any junk in an address and expect it
to work.  For example, this is a valid address:

 "); @,?~]"@m.jl.ly

but that doesn't mean I would hand it out as an address to anyone from
whom I wanted mail.

Similarly, you can technically put random combinations of Hindi,
Arabic, Japanese, and emojis in a mailbox, but I wouldn't expect many
mail systems to deliver it and if they do deliver it I would expect
all sorts of warnings.

One of the glaring holes in the EAI documents is that there is no
practical advice on choosing mailbox names.  We have developed
conventions for ASCII names that LDH are fine, dots and plus signs and
maybe apostrophes are OK, upper and lower case ASCII are generally
interchangeable, and beyond that you take your chances.  We need
appropriate guidance for mailbox names.
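
The ASCII conventions just described could be sketched in Python roughly as follows. This is only an illustration of the informal convention, not a rule from any RFC; the function names are hypothetical, and treating upper and lower case as equal is a choice made by the receiving mail system rather than something senders can rely on.

```python
import re

# Letters, digits and hyphen (LDH), plus dot, plus sign and apostrophe.
_CONVENTIONAL = re.compile(r"[A-Za-z0-9.+'-]+")


def is_conventional_ascii(local_part):
    """True if the local part uses only the conventionally safe ASCII set."""
    return _CONVENTIONAL.fullmatch(local_part) is not None


def same_mailbox(a, b):
    """Compare two ASCII local parts, treating case as interchangeable."""
    return a.lower() == b.lower()
```

For example, is_conventional_ascii("john.levine+list") is true, while the quoted-string address shown earlier falls outside the convention even though it is technically valid.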

Before anyone suggests it, the rule for mailboxes can NOT be the same
as for IDNs, since a dot is not a separator, mailboxes have always
allowed characters not allowed in hostnames, and mail systems have
always done fuzzy matching to allow misspellings that wouldn't be
possible in domain names.

The IETF's PRECIS working group has advice on identifiers that would
be a good place to continue from.  I don't know if the IETF has the
energy to do that, or if people here could usefully contribute.


🌏 🌍 🌎
André Schappo

