[Latingp] Stacking

Asmus Freytag (c) asmusf at ix.netcom.com
Wed Nov 6 03:17:28 UTC 2019


On 11/5/2019 5:54 PM, Bill Jouris wrote:
> Hi Asmus,
>
> Thanks for the quick response.  Comments inline below.
>
> Bill Jouris
> Inside Products
> bill.jouris at insidethestack.com
> 831-659-8360
> 925-855-9512 (direct)
>
>
> On Tuesday, November 5, 2019, 08:06:08 PM GMT-5, Asmus Freytag 
> <asmusf at ix.netcom.com> wrote:
>
>
> Bill,
>
> in continuation of our discussion, here's how I would have replied if 
> you had asked this in the session:
>
> First, you write: "The ideal solution, of course, would be for the 
> Unicode folks to create new pre-composed code points for these problem 
> cases. But I suspect there is little chance of them doing so before 
> our report is due. So, we will have to figure out an alternate 
> approach to recommend."
>
> Unicode has an explicit policy of not adding any more precomposed code 
> points for the kinds of combinations considered. So there's a definite 
> answer that such will not happen. Ever.
>
> >> Good to know what Unicode's policy is on this.  I wonder why, given 
> that they have a bunch of pre-composed code points which are not used 
> in any of what we fondly believe are the major languages using the 
> Latin alphabet.  Presumably they had their reasons for choosing the 
> ones that they did.
> >> If those reasons include indications that we missed a major 
> language or three that we should have included, that would be useful 
> to know ASAP.

==> The policy is simply not to encode anything that would get a 
canonical decomposition. Details in either the Unicode Standard Core 
Specification or UAX#15. There may also be an FAQ out there on 
Normalization. This forces vendors to support combinations if they want 
to support certain languages and you see that effect. The technology 
continues to get better.
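
As a minimal sketch of what that policy means in practice (using only Python's standard unicodedata module; the example characters are chosen here for illustration and are an assumption, not data from the panel's analysis): an older precomposed letter such as U+00E9 has a canonical decomposition, so normalization treats it and its combining sequence as equivalent. A combination with no precomposed code point, such as open e with macron below and diaeresis, remains a sequence under every normalization form.

```python
import unicodedata

# U+00E9 (e with acute) is an older precomposed code point with a
# canonical decomposition to U+0065 U+0301.
precomposed = "\u00e9"
decomposed = unicodedata.normalize("NFD", precomposed)
recomposed = unicodedata.normalize("NFC", decomposed)

# Open e (U+025B) + macron below (U+0331) + diaeresis (U+0308) has no
# precomposed form, so NFC cannot shorten it to a single code point.
sequence = "\u025b\u0331\u0308"
sequence_nfc = unicodedata.normalize("NFC", sequence)

print([f"U+{ord(c):04X}" for c in decomposed])
print([f"U+{ord(c):04X}" for c in sequence_nfc])
```

Any font or renderer that wants to display the second case has to position the combining marks itself, which is the pressure on vendors described above.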


>
> I would say that Courier New is perhaps an unfortunate choice of 
> reference font. Some other people may have more details, or actual 
> knowledge of what MSFT's plan is for that font, but it is my 
> impression that the Courier New font was state of the art in the past, 
> and it certainly looks like it has not been maintained actively to cover 
> more languages (and frankly, I can't recall seeing it much recently).
>
> >> We'll need to discuss whether to shift to a different mono-width 
> font for our analysis.  On one hand, there would be a lot of work to 
> consider redoing -- which would take time that we probably don't have.


==> Correct, you do *not* have the time.

> On the other hand, there's something to be said for using the same set 
> of fonts throughout the analysis.  Perhaps we can decide that one 
> exception, for the non-pre-composed cases, is the least bad solution.  
> As I say, we'll have to thrash it out.
==> Correct, you can substitute a font in ongoing analysis (but you 
don't have much time for that, either). The IP, being somewhat 
knowledgeable in the Latin script, do not anticipate a large set of 
in-script variants. Most cases would surely be "confusables" and can be 
documented separately (such informative documentation can also be 
prepared during the public comment period).
>
> There are other more recent monowidth fonts such as Lucida Console. 
> See screen shot at the end. I've also appended the results for Segoe 
> UI which is the font used in my browser (Firefox on Windows7).
>
> >> Clearly we have been handicapped by none of us being expert in 
> which fonts are growing obsolete and which are more current.  As you 
> say, the universe of Latin fonts is enormous.  Clearly we couldn't 
> look at anything like the whole.  For example, we totally ignored all 
> the cursive-based fonts -- which would have, among other things, 
> generated a bunch more variants.  But we are where we are at the moment.


==> If you had told us that you were looking at cursive fonts, we would 
probably have had something to say about it - in our view that is taking 
the issue too far, by far.

>
> There's a near infinite universe of Latin-script fonts, and many do 
> not attempt to cover the entire script. If we include hyperlinks in 
> text (those showing the URL) there is no way we can predict which 
> fonts a user will see a domain name in.
>
> We have three choices here:
>
> (1) remove from the Latin LGR all code points/sequences not rendered 
> reliably in _any_ font
> (2) remove from the Latin LGR all code points/sequences not rendered 
> reliably in any "well-known" font
> (3) remove from the Latin LGR all code points/sequences not rendered 
> reliably in common user interface fonts: Windows, iOS, Android and all 
> browsers if they don't use platform fonts (latest version)
>
> Because of the way Latin-script fonts tend to subset, there's no way 
> that (1) is a reasonable choice in my view.
>
> >> Absolutely agree.  Or even possible.


==> Good.

>
> The problem with (2) is that some "well-known" fonts are tied to early 
> versions of a given platform and they *may* not be maintained any 
> longer - while some of them are still widely used, they have been 
> replaced for UI purposes by more modern / more capable fonts. 
> Effectively, they may be retained as legacy - so that you can still 
> view and edit documents that were created in them. Less well-known 
> fonts (such as Arial Unicode MS) may not have made the cut and aren't 
> routinely available any more. So the fact that a font is well-known 
> increases the likelihood that it is a legacy font. Taken together, 
> these considerations would argue against (2).
>
> >> As noted, we would need outside advice on which fonts are both 
> "well-known" and modern (i.e. not legacy) in order to attempt 2.


==> In that case, I think it's good we are not doing (2).

>
> That leaves (3) as a "reasonable" choice for making a cut. I know 
> you'd appreciate that choice of term :). It is also effectively 
> forward-looking, because more support tends to be added to newer 
> fonts/systems and that process looks like it would only continue.
>
> By all indication, more modern text fonts like Calibri, and modern UI 
> fonts like Segoe UI do not have issues with these code points, and I 
> simply can't imagine Google's Noto fonts would either.
>
> >> I'm not quite clear what you are recommending that we do here.  Are 
> you suggesting that we go back and redo using these three fonts?  Or 
> something else?

==> I think your absolute and unquestioning first priority is finishing 
a public draft. You have done enough work and have the feedback from the 
IP (and will get some more next week) to complete that task.

==> If, during public comment, somebody can demonstrate an issue using a 
recent phone, browser or OS, you can take corrective action in the final 
draft and remove some code points before publication. I don't expect you 
will find any cases, because the modern OSs and their user interface 
fonts are very good in handling combining sequences.

==> The recommendation follows from the way (3) is worded. It's not 
worded as "start a research project to find possible issues in an 
unspecified list of fonts", but conversely: "act, if and when you have 
intelligence (from whatever source) of a clear defect."

>
> Looking at the screen shot in context with the reasoning above, it 
> seems to me that we are good, but if the Latin GP wants to document 
> the issue (that many Latin fonts do subset the range of code 
> points/glyphs/combinations that they support), that would be OK (if it 
> doesn't otherwise delay the project).
>
> >> OK, notwithstanding the above, I'm reading this as saying that
> 1) we can stick with the fonts we are using, and
> 2) we can continue including the combination glyphs that I was 
> concerned about, regardless of their issues in Courier New
>
> >> If that is not a correct understanding, please let me know.

==> Correct; at the moment you have no indication that the combining 
sequences should be rejected (item (3)) and for a quick check for 
variants your 3 fonts should be fine. If you could add Segoe UI as a 4th 
column in your tests, and can do that cheaply and quickly, go for it. 
But please come back telling us that _there are very few_ in-script 
variants beyond schwa and underlining. :)

> >> In Word (Windows 10), I get *ɛ̱̈*e rendered as problematic even in Lucida 
> Console -- although it renders fine in Firefox for email.  Just FYI. 
> Inconsistency among word processing software is a real pain, but one 
> we will probably never get away from.
>
>
>
==> Looks fine in Word in Windows 7. Go figure.

==> While Lucida Console is newer than Courier New, the use of monowidth 
fonts is problematic for so many reasons that basing the LGR design on 
their shortcomings is not something I would contemplate.

==> All we can do is hope that the technology is going forward.

==> Let's focus on getting this wrapped up as expeditiously as possible.

A./

>
> A./
> PS: I have blind copied the other IP members
>
> Screenshot:
> Instead of Arial, the screenshot shows Calibri in the left column, d
