[UA-discuss] UASG010 - Quick Guide to Linkification

AJAY DATA ajay at data.in
Sat Sep 30 05:43:41 UTC 2017


I think we also have not considered RFC 3986 for URL normalization.


https://tools.ietf.org/html/rfc3986

Let's also consider it and make it part of the recommendation.

Now, on mixed scripts:

Consider a URL that contains an absolute path to a page and a parameter:

http://www.example.com/page.html?parameter=1

Or

http://www.example.com/पेज.html?parameter=1

Or

http://www.example.com/पेज/page.html?parameter=1

Notice that the path or the page name is in Hindi, and that is allowed.

So, Tex, maybe we need to apply the script rules only up to the TLD; whatever follows it can be mixed-script and may have to be allowed.
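
A minimal sketch of that boundary, assuming Python's standard urllib.parse module; the helper name host_labels is hypothetical and not from the guide. It returns only the host labels, which are the parts a script-mixing rule would inspect, and leaves the path and query untouched.

```python
from urllib.parse import urlsplit

def host_labels(url):
    """Return the dot-separated labels of the URL's host, or [] if there is none."""
    # hostname excludes the scheme, port, path, query and fragment
    host = urlsplit(url).hostname
    return host.split(".") if host else []

# The Hindi path segment never reaches the label check.
print(host_labels("http://www.example.com/पेज/page.html?parameter=1"))
```

With this split, any per-label rule sees only "www", "example" and "com", whatever scripts the rest of the URL uses.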

With best wishes

Ajay

On 30 September 2017 08:52:47 GMT+05:30, Tex Texin <textexin at xencraft.com> wrote:
>Asmus, I can understand why forum operators would apply those
>restrictions, but as you indicate, they don’t apply to new links in
>other contexts.
>
> 
>
>However, coming back to linkification, the guidelines don’t address the
>query and fragment portions of a URL. As with the distinction to apply
>the script set rules to labels, it is worth pointing out in the
>guidelines that those rules do not apply to the portion after the “?”
>unless perhaps the query portion in turn is a URL.
>
> 
>
>Example 1  http://domain.com/?refer= http://newdomain.com 
>
> 
>
>Example 2  http://domain.com/?title=script1&author=script2&quote=script1+script2+script3
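
One way to spot the "query is itself a URL" case from Example 1 is sketched below. This is only an illustration using Python's urllib.parse; the function name nested_urls and the choice to recognise only http and https are my assumptions.

```python
from urllib.parse import urlsplit, parse_qsl

def nested_urls(url):
    """Return query values that themselves look like absolute http(s) URLs."""
    found = []
    query = urlsplit(url).query
    for _name, value in parse_qsl(query, keep_blank_values=True):
        candidate = value.strip()
        inner = urlsplit(candidate)
        # Assumption: only http and https count as a nested link here.
        if inner.scheme in ("http", "https") and inner.hostname:
            found.append(candidate)
    return found

print(nested_urls("http://domain.com/?refer=http://newdomain.com"))
```

Only values found this way would be candidates for a second pass of the label rules; ordinary values such as those in Example 2 would be left alone.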
>
> 
>
>The guidelines don’t address escapes. Should linkification attempt to
>unescape escaped characters? That might make the process much more
>complex. However, ignoring escapes might also lead to very inconsistent
>results.
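
To make the escape question concrete, here is a small sketch of what unescaping involves, assuming Python's urllib.parse.unquote and UTF-8 percent-encoding. The escaped string is the percent-encoded form of the test address used in this thread.

```python
from urllib.parse import unquote

escaped = ("tex@%E6%99%AE%E9%81%8D%E6%8E%A5%E5%8F%97"
           "-%E6%B5%8B%E8%AF%95.%E4%B8%96%E7%95%8C")

# unquote() decodes percent-escapes as UTF-8 by default; byte sequences
# that are not valid UTF-8 are replaced rather than raising an error.
decoded = unquote(escaped)
print(decoded)
```

A linkifier that compares the two forms without decoding would treat them as different strings, which is the inconsistency mentioned above.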
>
> 
>
>tex
>
> 
>
> 
>
>From: Asmus Freytag (c) [mailto:asmusf at ix.netcom.com] 
>Sent: Friday, September 29, 2017 7:41 PM
>To: Tex Texin; 'Mark Svancarek'
>Cc: 'Universal Acceptance'
>Subject: Re: [UA-discuss] UASG010 - Quick Guide to Linkification
>
> 
>
>On 9/29/2017 5:28 PM, Tex Texin wrote:
>
>Thanks Mark and Asmus.
>
> 
>
>I agree about the distinction of script mixing within a label. The
>guide should clarify this.
>
> 
>
>I also support Asmus's clarification regarding ASCII vs. all of Latin.
>
> 
>
>For the attention to mixing digits within a label, I agree although I
>would need to review if I can easily know which digits are widely used
>vs of historical interest. I don’t believe that being a bit broad in
>linkification acceptance is a problem. The domain registry (and perhaps
>servers) should be more restrictive to not allow domains that could
>represent spoofing. (I know there are problems with reliance on
>registries). Being too restrictive in linkification could hurt users
>that need to enter a legitimate URL and can’t.
>
>
>Digits come in sets that are specified as such in the Unicode Standard
>(although implicitly: the members of such sets have a property "decimal
>digit" and Unicode follows the convention of encoding these in complete
>sets from 0-9). Therefore, not linkifying something that contains a
>mixture of these sets can be implemented deterministically (although
>regex syntax leads to particularly grim expressions for specifying this
>constraint, it can be done).
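
The digit-set check described above can be sketched without a regex. This is a minimal illustration using Python's unicodedata module; the function names are hypothetical. It relies on the convention stated above that each set of decimal digits is encoded as a contiguous run from 0 to 9, so subtracting a digit's value from its code point gives the code point of that set's zero.

```python
import unicodedata

def digit_sets(label):
    """Return the code points of DIGIT ZERO for every decimal-digit set used."""
    zeros = set()
    for ch in label:
        if unicodedata.category(ch) == "Nd":  # general category "decimal digit"
            zeros.add(ord(ch) - unicodedata.decimal(ch))
    return zeros

def mixes_digit_sets(label):
    """True when the label draws digits from more than one set."""
    return len(digit_sets(label)) > 1

print(mixes_digit_sets("abc123"))        # ASCII digits only
print(mixes_digit_sets("12\u0663"))      # ASCII digits plus ARABIC-INDIC DIGIT THREE
```

Because the check keys on the general category rather than on a list of scripts, it covers whatever digit sets the Unicode data in the running Python version knows about.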
>
>Realistically, only the modern set of about 30 scripts is of practical
>importance, so a scheme that does not track the addition of future
>historic alphabets in Unicode would be adequate....
>
>Where native digits are (largely) historic holdovers, we wouldn't need
>them at all, but linkification isn't a good place to filter those. 
>
>Some reluctance on automatic conversions of "risky" URLs would be a
>benefit; it's along the same line as not linkifying something not under
>the author's control: the risk for mischief is just too great.
>
>Forum software that I have been a user of tended to implement three
>restrictions that are not related to new TLDs or IDN TLDs:
>
>1) limit file names by extension (e.g. if the link entered was supposed
>to be for an image, do not allow it to link to something that doesn't
>have a common image file extension).
>
>2) disallow any link with a "?" in it - rationale: it's not a static
>link and who knows what will be served later (including risky stuff)
>
>3) require http://, etc., even in text spans that are marked as being
>URLs or in link attributes.
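
For illustration only, the three forum restrictions above could look like the sketch below. The function name forum_accepts, the list of image extensions, and the reading of "http://, etc." as http or https are all my assumptions, not anything a particular forum package is known to do.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumed list; a real forum would configure its own.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}

def forum_accepts(link, expect_image=False):
    """Apply the three restrictions in the order 3, 2, 1."""
    parts = urlsplit(link)
    if parts.scheme not in ("http", "https"):   # 3) explicit scheme required
        return False
    if "?" in link:                             # 2) no query strings
        return False
    if expect_image:                            # 1) image links need an image extension
        return PurePosixPath(parts.path).suffix.lower() in IMAGE_EXTENSIONS
    return True

print(forum_accepts("http://example.com/a.png", expect_image=True))
print(forum_accepts("http://domain.com/?title=script1"))
```

None of these checks looks at the TLD or at the scripts used in the host name.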
>
>In some cases these restrictions were deliberate decisions by forum
>operators - part of reining in certain kinds of forum spam. 
>
>I feel we need to be cognizant of the needs for limiting the risk
>profile of certain operations - in particular where the result then
>winds up online to an open audience (as opposed to just sharing
>something in a private message). 
>
>The alternative for an operator is to simply blacklist specific TLDs
>and domains (and most of those will be any IDNs that are not local to
>the operator...). 
>
>
>
>
>
> 
>
>And it seems we need to clarify our implied intent for the guidance
>about the “implied intent of user’s entry”. :) (I couldn’t resist any
>longer.)
>
>:)
>
>
>
> 
>
>Tex
>
> 
>
> 
>
> 
>
>From: ua-discuss-bounces at icann.org
>[mailto:ua-discuss-bounces at icann.org] 
>On Behalf Of Mark Svancarek via UA-discuss
>Sent: Friday, September 29, 2017 5:04 PM
>To: Asmus Freytag; ua-discuss at icann.org
>Subject: Re: [UA-discuss] UASG010 - Quick Guide to Linkification
>
> 
>
>Hmm, I don’t recall approving that principle (hopefully it was added
>while I was out on leave, and not just because I carelessly failed to
>notice it was being added).
>
> 
>
>I mention that because it seems the opposite of what we could recommend
>i.e. we SHOULD allow use of Highly Restrictive and continue to
>discourage Moderately Restrictive.  Do we need to revisit this?  Sorry
>if I am just confused.
>
> 
>
>Note that, as Asmus points out, our concern is about script-mixing
>within a label, not use of different scripts in different labels. 
>Tex’s examples are all the latter, and should linkify cleanly by
>UA-ready SW.
>
> 
>
>/marksv
>
> 
>
>From: ua-discuss-bounces at icann.org
>[mailto:ua-discuss-bounces at icann.org] 
>On Behalf Of Asmus Freytag
>Sent: Friday, September 29, 2017 2:40 PM
>To: ua-discuss at icann.org
>Subject: Re: [UA-discuss] UASG010 - Quick Guide to Linkification
>
> 
>
>On 9/29/2017 2:26 PM, Tex Texin wrote:
>
>Hi,
>
>Some questions:
>
> 
>
>1.	Do I understand correctly, that the recommendation to not linkify
>highly restrictive strings means that tex@普遍接受-测试.世界 would not
>become a link? Or http:// 普遍接受-测试.世界..com?
>
> 
>
>Highly restrictive means that Latin cannot be mixed with Chinese or
>Japanese characters.
>
>
>Some script mixing *within* a label should be restricted as it is a
>security risk. Script mixing across a FQDN or between local part and
>host seem to be rather likely scenarios instead.
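
The difference between mixing within a label and mixing across labels can be shown with a rough sketch. Python's standard library does not expose the Unicode Script property, so this example guesses a script from the first word of each letter's character name. That is a crude stand-in I am assuming for illustration, and the function names are hypothetical; it would wrongly flag legitimate combinations such as kanji with kana in one label.

```python
import unicodedata

def rough_scripts(label):
    """Crude script guess: first word of each letter's Unicode character name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()  # digits, hyphens and combining marks are ignored
    }

def mixed_labels(host):
    """Return the labels that appear to mix scripts internally."""
    return [label for label in host.split(".") if len(rough_scripts(label)) > 1]

print(mixed_labels("example.世界"))         # different scripts, different labels
print(mixed_labels("exampleпример.com"))   # Latin and Cyrillic in one label
```

A host like the first one is never flagged, because each label is judged on its own.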
>
>For certain scripts, ASCII admixture (just ASCII, not all of Latin)
>would be common practice in the writing system and it may be common
>enough/benign enough to allow it.
>
>However, you might also want to address European digits for those
>scripts where native digits exist and are widely / predominantly used,
>vs. scripts where the native digits are more of historic/cultural
>interest. (In Arabic you have both, depending on the region).
>
>Mixing digit sets in the same label should be a no-no and indicates
>something is not well-formed. 
>
> 
>
>2.	I do not understand “Linkification should be determined by the
>implied intent of the user's entry.” Is this intended to mean that the
>scheme (http, mailto, etc.) should be added to form the link? Or some
>other determination of intent? If the former, it should be stated more
>clearly.
>
>
>My naive interpretation had to do with things like tables or data
>records where the purpose of a particular field would be a URL. 
>
> 
>
> 
>
>tex
>
> 
>
>From: ua-discuss-bounces at icann.org
>[mailto:ua-discuss-bounces at icann.org] 
>On Behalf Of Don Hollander
>Sent: Thursday, September 28, 2017 11:19 AM
>To: Universal Acceptance
>Subject: [UA-discuss] UASG010 - Quick Guide to Linkification
>
> 
>
>A quick update on Linkification
>
> 
>
>We have published an updated Quick Guide to Linkification
>https://uasg.tech/wp-content/uploads/2017/06/UASG010-Quick-Guide-to-Linkification.pdf
>This builds on discussions we had post the UASG meeting in Seattle in
>April.
>
> 
>
>We are also working on an evaluation of Linkification in major Social
>Media Communication applications. (Here’s the link to the Help Wanted
>advertisement, "Help Wanted: Linkification Evaluation":
>https://uasg.tech/wp-content/uploads/2016/11/Help-Wanted%E2%80%A6-Linkification-Evaluation-1.0.pdf)
>
> 
>
>This evaluation is being built on a replicable testing platform so that
>we can readily repeat the process in the future.   While early days, we
>expect to provide a preliminary report during the ICANN60 meeting.   As
>we go through the testing it is raising some additional questions about
>our Good Practice guide and expectations.  We fully expect that once
>the evaluation is completed we’ll again review UASG010 based on real
>world experiences.
>
> 
>
>Don
>
>Don Hollander
>Universal Acceptance Steering Group
>Skype: don_hollander
