[Latingp] How Authoritative Is Omniglot Really?

Bill Jouris bill.jouris at insidethestack.com
Thu Jul 4 14:05:25 UTC 2019


I'm OK with dealing with the issue via public comments and subsequent iterations of the Latin GP.  Just so we are doing so with our eyes open.  (Then again, perhaps the issue was obvious all along.  To everybody but me.)
Bill Jouris
Inside Products
bill.jouris at insidethestack.com
831-659-8360
925-855-9512 (direct) 

    On Thursday, July 4, 2019, 04:03:45 AM PDT, Mirjana Tasić <Mirjana.Tasic at rnids.rs> wrote:  
 
Dear colleagues,
 
  
 
We cannot finish this task in one pass. Some form of Latin GP should continue to exist for a while after we finish the first LGR for the Latin script. I completely agree with Michael's and Meikal's discussion.
 
  
 
Regards Mirjana
 
From: Latin GP <latingp-bounces at icann.org> on behalf of Meikal Mumin <meikal at mumin.de>
Date: Thursday, July 4, 2019 at 12:23
To: Latin GP <latingp at icann.org>, Michael Bauland <Michael.Bauland at knipp.de>
Subject: Re: [Latingp] How Authoritative Is Omniglot Really?
 
  
 
Dear colleagues, 
 
  
 
I agree with Michael that a Public Comment Phase would be the most effective way to sort these things out, which is why I had suggested Public Comment Phases in different iterations, as ArabGP had done, e.g. one for the repertoire, one for the variants, and one for the entire proposal.
 
  
 
All non-academic sources on the internet will be incomplete and lack standard editing. There is no authoritative standard resource for orthographies, as there is for languages in the form of Ethnologue. Scriptsource is the best attempt yet, but remains very uneven to date. Unless we task a linguist to come up with authoritative descriptions of all these orthographies (which would be a substantial job), we have to take things at face value, unless we happen to have a user of that orthography among us.
 
  
 
Best,

Meikal
 
On 4 July 2019, 12:11 +0200, Michael Bauland <Michael.Bauland at knipp.de> wrote:


 

Hi Bill,

On 03.07.2019 21:59, Bill Jouris wrote:


 

Dear colleagues, 

When we were developing our repertoire, our go-to reference for what
glyphs are used in any given language was Omniglot.  

At the ICANN meeting in Marrakech last week, I was talking to a group of
people about diacritics and such.  And I mentioned in passing that (as
shown in Omniglot https://www.omniglot.com/writing/spanish.htm ) the
only diacritic used in Spanish is the tilde over an N.  A couple of
native speakers of Spanish immediately corrected me, saying that the
acute and diaeresis are also used.  (A quick search with Google confirms
this.) 

The good news is, all of those glyphs are already in our repertoire.  So
no immediate problem there. 

The bad news, it seems to me, is this: in how many /other/ languages
does Omniglot fail to capture all of the diacritics or diacritic/letter
combinations actually used?  And how many of those result in glyphs
which are not in our repertoire currently?  (Which might resolve the
mystery of why Unicode has so many pre-composed combinations which we
didn't find.) 

I realize that answering that question necessarily involves going back
through the repertoire research process again.  Presumably using other
sources.  But I wonder if we can, in good conscience, fail to do so.
 


I agree with you that it is not unlikely that there may be further
errors in other Omniglot languages. I wouldn't be surprised if more
could be found.

The question is, what is the alternative? I can only speak for languages
that I know (English, German, Finnish). For these I can - with a high
degree of confidence - decide whether all glyphs have been included, but
not for the rest. So, even IF (and that's not a given) we find a better
source for our list of languages, who is to say that those lists of
glyphs are complete and correct? Those lists could also contain too
many/wrong glyphs.

Unless we find a native speaker for each of our languages who can list
all glyphs for us (and even then, he/she can be mistaken, so we would
probably need at least three independent native speakers for each
language to get a reasonable degree of confidence), we will always have
the problem that whatever source we use, it may be incorrect.

My suggestion therefore is to go with the list we created and wait for
the public (or IP) comments. If someone complains and tells us we missed
a certain glyph, we of course have to and will add it.

I fear we have to get to a conclusion in the near future. It's like
writing a book: whenever you re-read it, you will most likely find
another problem or something to improve. It's almost impossible to get
it perfect. At some point you will have to decide whether you want to
publish the book (even if not 100% perfect) or continue improving it
until the end of days/ICANN. ;-)

Considering the fact that we're all volunteering our time here, I'd
rather come to a conclusion sooner rather than later. This does not mean that
if we find an actual error we shouldn't fix it. I want to submit
something that as far as we know is correct. However, we shouldn't spend
too much time searching for more potential errors at this point.

But that's of course only my personal opinion.

We can talk about this later today and get other opinions.

Michael

--
Dr. Michael Bauland, Dipl.-Informatiker
Software Development

Knipp Medien und Kommunikation GmbH
Technologiepark
Martin-Schmeisser-Weg 9
44227 Dortmund
Germany

Fon: +49 231 9703-0
Fax: +49 231 9703-200
SIP: Michael.Bauland at knipp.de
E-mail: Michael.Bauland at knipp.de

Register Court:
Amtsgericht Dortmund, HRB 13728

Chief Executive Officers:
Dietmar Knipp, Elmar Knipp
_______________________________________________
Latingp mailing list
Latingp at icann.org
https://mm.icann.org/mailman/listinfo/latingp

_______________________________________________
By submitting your personal data, you consent to the processing of your personal data for purposes of subscribing to this mailing list accordance with the ICANN Privacy Policy (https://www.icann.org/privacy/policy) and the website Terms of Service (https://www.icann.org/privacy/tos). You can visit the Mailman link above to change your membership status or configuration, including unsubscribing, setting digest-style delivery or disabling delivery altogether (e.g., for a vacation), and so on.
 