[UA-discuss] Latest Draft of UA Best Practices document
A.Schappo at lboro.ac.uk
Sun Oct 11 16:31:28 UTC 2015
Consider http://tools.ietf.org/html/rfc5895 section 2
You have already covered mapping of U+3002 IDEOGRAPHIC FULL STOP to U+002E FULL STOP
There is also the general mapping of fullwidth and halfwidth characters to their decomposition characters
I am thinking specifically of the decomposition mapping of ／ U+FF0F FULLWIDTH SOLIDUS to / U+002F SOLIDUS. With the Chinese input methods I have used, hitting the slash key outputs U+FF0F, which I believe is common behaviour.
So now let's see what happens with an IDN when U+FF0F is used.
http://南昌大学.中国/ (using U+002F as a terminator or as a separator with null pathname)
http://南昌大学.中国／ (using U+FF0F) produces http://www.南昌大学.xn--/-kq6ay5z/ . The decomposition mapping has been performed correctly, but the resulting / has then been folded into the TLD label, producing a nonexistent TLD
Taking an address with a pathname part gives the same behaviour
http://www.เครื่องรัดกล่อง.ไทย/ระบบแบตเตอรี่+รุ่น+ZP+22+248.html (using U+002F as the separator)
http://www.เครื่องรัดกล่อง.xn--/++zp+22+248-r50ba1a0bnaa0nwafb1vpc9godtl0a6f5gd.html/ (using U+FF0F)
I conducted these tests using Firefox on OS X.
So, after performing the above mapping, an agent should treat the resulting / as a separator and not as a character within the TLD label. I have briefly tried the above with Safari and Chrome, and it appears to be a common problem.
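To make the order of operations concrete, here is a rough sketch in Python. It is only an illustration and not how any browser implements it: the function names map_width and split_after_mapping are my own, it covers only the fullwidth/halfwidth and U+3002 mappings from RFC 5895 section 2 (no lowercasing or NFC), and the URL splitting is deliberately naive. The point is that the mapping is applied first, and the host is delimited only afterwards.

```python
import unicodedata


def map_width(text: str) -> str:
    """Map fullwidth/halfwidth characters to their decomposition mappings.

    Also maps U+3002 IDEOGRAPHIC FULL STOP to U+002E FULL STOP.
    (Hypothetical helper; other RFC 5895 steps are omitted.)
    """
    out = []
    for ch in text:
        decomp = unicodedata.decomposition(ch)  # e.g. "<wide> 002F" for U+FF0F
        if decomp.startswith(("<wide>", "<narrow>")):
            code_points = decomp.split()[1:]
            out.append("".join(chr(int(cp, 16)) for cp in code_points))
        elif ch == "\u3002":
            out.append(".")
        else:
            out.append(ch)
    return "".join(out)


def split_after_mapping(url: str):
    """Apply the mapping first, then split scheme, host and path.

    Naive splitting for illustration only: no port, userinfo, query
    or fragment handling.
    """
    mapped = map_width(url)
    scheme, sep, rest = mapped.partition("://")
    if not sep:
        raise ValueError("expected a URL of the form scheme://host/path")
    host, slash, path = rest.partition("/")
    return scheme, host, slash + path


print(split_after_mapping("http://南昌大学.中国／"))
# ('http', '南昌大学.中国', '/')
```

With the split done after the mapping, the / produced from U+FF0F terminates the host, so only 南昌大学.中国 would be handed on for label-by-label IDNA processing.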
On 8 Oct 2015, at 22:53, Mark Svancarek wrote:
> I’m still not finished for Dublin, but there are sufficient edits to this draft to make it worth asking you review and provide feedback. Feel free to edit and comment directly into the doc.