[Latingp] Work Product Quality

Tan Tanaka, Dennis dtantanaka at verisign.com
Tue Feb 26 19:43:16 UTC 2019


Thanks, I appreciate your suggestion. It’s a good plan forward.

If everyone agrees, these are the items to consider:

  *   Latin Script LGR Proposal, version 5: we need to complete the following items:

1.    Complete URL underlining analysis (Dennis) – Started
2.    Review “base character” analysis per assignments (Everyone); finish reviewing previous work – In progress
3.    Review input on German Eszett from the German registries (Michael) – Awaiting written report from Michael
4.    Write up a proposal to include U+00B7 (Middle Dot) to support “l·l” (letter L + Middle Dot + letter L) for the Catalan language (Michael) – Complete? Next step: document interaction with the IP
5.    Develop a test label strategy (Mats)
6.    Create test cases for dotless i vis-à-vis IDNA2003 compatibility issues (Mats)
7.    Complete the section on IDNA2003 compatibility issues (Dennis)
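As a character-level illustration of items 4 and 6 above, here is a minimal Python sketch (not part of the panel's work product; the sample strings are hypothetical and only demonstrate the code point properties involved):

```python
import unicodedata

# Item 4: Catalan "l·l" is letter l + U+00B7 MIDDLE DOT + letter l.
label = "l\u00b7l"
assert [unicodedata.name(c) for c in label] == [
    "LATIN SMALL LETTER L",
    "MIDDLE DOT",
    "LATIN SMALL LETTER L",
]

# Item 6: U+0131 LATIN SMALL LETTER DOTLESS I does not survive a case
# round-trip under Unicode's default (locale-independent) mappings:
# uppercasing yields plain "I", which lowercases back to dotted "i".
# This is the kind of mapping IDNA2003's nameprep case folding applies,
# so the dotless form cannot be reached from its uppercase counterpart.
dotless = "\u0131"
assert dotless.upper() == "I"
assert dotless.upper().lower() == "i"
```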

  *   As far as maintaining a list of deferred cases, who volunteers to do this?


From: Latingp <latingp-bounces at icann.org> on behalf of Sarmad Hussain <sarmad.hussain at icann.org>
Date: Tuesday, February 26, 2019 at 5:54 AM
To: Latin GP <latingp at icann.org>
Subject: [EXTERNAL] Re: [Latingp] Work Product Quality

Dear Latin GP members,

May I suggest an option for the GP members to consider in order to keep moving forward: is it possible to split this discussion into two parts?

1.       The first would be to focus on the agreed-upon methods, complete the remaining analysis, and share the version of the Latin LGR proposal for review by the IP.

2.       In parallel, place any items on which members do not agree in a separately maintained list, to be revisited for further discussion after completing step 1.  This may actually be a very small list.

This can help us make progress while allowing us to address any items the members consider to require more discussion. The IP may also be consulted on such cases.


From: Latingp <latingp-bounces at icann.org> On Behalf Of Tan Tanaka, Dennis via Latingp
Sent: Tuesday, February 26, 2019 4:38 AM
To: meikal at mumin.de; latingp at icann.org; bill.jouris at insidethestack.com
Subject: Re: [Latingp] Work Product Quality

The data was that we had votes from two users of languages which make use of diaeresis, both of whom judged them confusable, namely me and Michael Bauland

Thus far, it is only your opinion of such confusability. We have not seen evidence that supports your claims (again, other than your verbal statements). At the outset, this is fine for starting to look at this case more closely (which we have), but if we want to make a case for a variant relationship on non-visual grounds, then let’s back it up with such evidence. All we heard on our last call was a comparison of two characters on visual grounds using rare fonts. So, if we are making a visual argument, let’s use our visual methodology. If we want to support a variant relationship on semantic, interchangeability, or other grounds, we are going to be asked for this hard evidence. We haven’t seen it yet.

When it was suggested that we do the analysis using wordmark.it, the thesis was to find out, among several font types, whether handwriting conventions were transferred to font design. In the case of “a” with diaeresis and with double acute, we saw there is no confusion; a diaeresis is always a diaeresis and a double acute is always a double acute.
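At the code point level, the two marks under discussion are unambiguously distinct regardless of how any font renders them. A short Python sketch (an illustration only, not part of the panel's methodology) shows that Unicode normalization also treats them differently, since only the diaeresis combination has a precomposed form:

```python
import unicodedata

a_diaeresis = "a\u0308"     # "a" + COMBINING DIAERESIS
a_double_acute = "a\u030b"  # "a" + COMBINING DOUBLE ACUTE ACCENT

# NFC composes a + diaeresis into the precomposed U+00E4 (ä).
assert unicodedata.normalize("NFC", a_diaeresis) == "\u00e4"

# Unicode defines no precomposed a-with-double-acute, so NFC
# leaves the two-code-point sequence unchanged.
assert unicodedata.normalize("NFC", a_double_acute) == "a\u030b"
```

Any confusability between the two is therefore a matter of rendering and user perception, not of the underlying code points.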

What I suggest the panel do is remain open to the possibility (I would say the certainty) that we have not yet created methodologies which will identify all variants.

                But we define what variants are, not the other way around.

We designed a method for visual analysis. Is it limited? Yes. But we chose widely-used, stable font types reflecting different styles to capture alternative renderings. We accepted this design knowing its shortcomings. I don’t think it is productive to challenge this decision at this point in time.

Let us remember that we need to focus on clear-cut cases and leave subjectivity to the visual similarity process.


From: Latingp <latingp-bounces at icann.org> on behalf of Meikal Mumin <meikal at mumin.de>
Date: Monday, February 25, 2019 at 1:57 PM
To: Latin GP <latingp at icann.org>, Bill Jouris <bill.jouris at insidethestack.com>
Subject: [EXTERNAL] Re: [Latingp] Work Product Quality

Dear colleagues,

To chime into this discussion: I tend to agree with Bill in that the data is always more important than the theory. Our methodologies are based on certain hypotheses. If these turn out to be insufficient, the solution is not to ignore the data but to amend the methodology and change the hypothesis; those are very basic principles of scientific analysis and method. Nonetheless, I find this last exchange of emails a bit unspecific and would suggest focusing on the actual data rather than having abstract methodological debates:

I assume that Bill’s email refers to the discussion we had regarding the case of diaeresis vs. double acute during our last teleconference. If I remember correctly, the gist of Dennis’s line of argumentation was that we could not consider them variants, because our reasons were based on visual similarity while we were discussing non-visual similarity at the time.

The data was that we had votes from two users of languages which make use of diaeresis, both of whom judged them confusable, namely me and Michael Bauland. We had several examples from fonts where they were near-homoglyphs if not homoglyphs, to use established terminology (without implying a categorical notion of visual vs. non-visual similarity). Now, Dennis argued that if we made them variants because they were near-homoglyphs or homoglyphs, we would have to restrict ourselves to their rendering in the same three fonts we had opted for earlier in our analysis of cross-script visual variants.

In this context, I would like to remind the group of two facts:

1) We have not yet conducted an analysis of in-script variants on strictly visual grounds. With cross-script variants we started with a visual analysis, and then during that work discovered that we had to consider further non-visual criteria. This led us to develop the current theories and framework for identifying non-visual criteria. We went on to amend our visual analysis in the case of cross-script variants and then proceeded with an analysis of in-script variants.

Accordingly, we have not yet had an analysis of in-script variants on strictly visual grounds. So either that task remains, or we do a combined pass analyzing both aspects together; the latter seems the obvious choice to me.

2) As I tried to explain in the proposal itself, there is no systematic, non-arbitrary boundary between visual and non-visual similarity; one conditions the other and vice versa. Therefore, in my opinion, we should not try to enforce such a boundary, neither from the point of view of analysis nor from the point of view of methodology.

We started with a visual analysis of cross-script variant candidates because that is what was expected of us, as clarified by the Integration Panel on various occasions. However, we cannot abuse these statements to deduce that, by consequence, other things are out of scope, and I’m confident that an incomplete analysis will not be accepted by the Integration Panel.

Therefore, the relevant question in our discussion will be whether there is a distinct risk to stability, rather than which type of variant relationship a particular case falls into and whether we applied the correct methodology.

In the case under discussion, I interpret the data as substantiating such a risk to stability, simply because users would confuse the characters, for whatever reasons. We should, however, stick to discussing the data rather than arguing over methodology, because it is such arguments that keep us from making progress, not the quality or quantity of the data.

I hope this summary will speed up our discussion during our next teleconference, but even if it does not settle the matter, it is important to note that we *cannot* rule out any analysis on the basis of a methodology developed before analyzing the data.


On 23 Feb 2019, 18:44 +0100, Bill Jouris <bill.jouris at insidethestack.com> wrote:

I'm not saying that the methodologies we have developed are flawed.  I am saying that they are limited.

There is some set of code points which are variants.  Each methodology we have can identify some subset of those variants -- but not all of them.  That is, after all, why we have developed multiple methodologies.  Our task, I submit, is not to process our methodologies; our task is to identify variants.  The methodologies are a means to that end; they are not the end itself.

I believe we need to bear in mind that the methodologies which we have at this point may still not be sufficient to identify all of the variants which exist.  We need to accept that, if we find code points which we generally agree are variants, but which are not identified by our existing methodologies, that is not a bad thing.  It would be good to then identify why they are variants, in the interests of then finding others of the same type; to develop an additional methodology.  Good, but not critical.

The quality of our work product is determined by how successfully we identify the variants which exist.  If we find variants which are not generated by our methodologies, that improves the quality of our work.  It does not, as you appeared to suggest, diminish it.

What I suggest the panel do is remain open to the possibility (I would say the certainty) that we have not yet created methodologies which will identify all variants.  At some point, of course, we do have to stop.  But that doesn't justify just saying "These code points do not fit the existing methodologies, so we cannot even consider them."   Again, the methodologies are a tool, not an end.

Bill Jouris
Inside Products
bill.jouris at insidethestack.com
925-855-9512 (direct)

On Friday, February 22, 2019, 10:23:19 AM PST, Tan Tanaka, Dennis <dtantanaka at verisign.com> wrote:


I don’t see how the methods this panel has developed and signed off on (some of them over a year ago) misalign with our goal.

What exactly are you suggesting this panel do?


From: Latingp <latingp-bounces at icann.org> on behalf of Bill Jouris <bill.jouris at insidethestack.com>
Reply-To: Bill Jouris <bill.jouris at insidethestack.com>
Date: Thursday, February 21, 2019 at 5:36 PM
To: Latin GP <latingp at icann.org>
Subject: [EXTERNAL] [Latingp] Work Product Quality

Dear colleagues,

I want to strongly disagree with the thesis put forward at this morning's meeting that the quality of our work product depends on our following some very small number of discrete methodologies for identifying variants.  It does not.

A methodology, any methodology, is merely a tool.  Its purpose is to help us get to our goal -- identifying variants.  If we use multiple methodologies, each of which allows us to identify some (quite possibly overlapping) subset of the universe of variants, that's OK.  If we identify a few additional cases that we believe to be variants, while going outside those methodologies, that's OK too.   It is desirable to have methodologies which identify large subsets, but only because that expedites our work.  The methodologies are a means to an end, not the end in themselves.

The quality of our work product is based on how successfully we identify the members of the underlying universe of variants.  How we get there is really irrelevant to the quality of what we produce.

Bill Jouris

Sent from Yahoo Mail on Android
Latingp mailing list
Latingp at icann.org