[Latingp] How Authoritative Is Omniglot Really?

Meikal Mumin meikal at mumin.de
Thu Jul 4 10:22:47 UTC 2019


Dear colleagues,

I agree with Michael that a Public Comment Phase would be the most effective way to sort these things out, which is why I had suggested Public Comment Phases in different iterations, as ArabGP had done, e.g. one for the repertoire, one for the variants, and one for the entire proposal.

All non-academic sources on the internet will be incomplete and without standard editing. There is no authoritative standard resource for orthographies, as there is for languages in the form of Ethnologue. ScriptSource is the best attempt yet, but remains uneven to date. Unless we task a linguist to come up with authoritative descriptions of all these orthographies (which would be a substantial job), we have to take things at face value, unless we happen to have a user of that orthography among us.

Best,

Meikal
On 4 July 2019, 12:11 +0200, Michael Bauland <Michael.Bauland at knipp.de> wrote:
> Hi Bill,
>
> On 03.07.2019 21:59, Bill Jouris wrote:
> > Dear colleagues,
> >
> > When we were developing our repertoire, our go-to reference for what
> > glyphs are used in any given language was Omniglot.
> >
> > At the ICANN meeting in Marrakech last week, I was talking to a group of
> > people about diacritics and such.  And I mentioned in passing that (as
> > shown in Omniglot https://www.omniglot.com/writing/spanish.htm ) the
> > only diacritic used in Spanish is the tilde over an N.  A couple of
> > native speakers of Spanish immediately corrected me, saying that the
> > acute and diaeresis are also used.  (A quick search with Google confirms
> > this.)
> >
> > The good news is, all of those glyphs are already in our repertoire.  So
> > no immediate problem there.
> >
> > The bad news, it seems to me, is this: in how many /other/ languages
> > does Omniglot fail to capture all of the diacritics or diacritic/letter
> > combinations actually used?  And how many of those result in glyphs
> > which are not in our repertoire currently?  (Which might resolve the
> > mystery of why Unicode has so many pre-composed combinations which we
> > didn't find.)
> >
> > I realize that answering that question necessarily involves going back
> > through the repertoire research process again.  Presumably using other
> > sources.  But I wonder if we can, in good conscience, fail to do so.
>
> I agree with you that it is not unlikely that there may be further
> errors in other Omniglot languages. I wouldn't be surprised if more
> could be found.
>
> The question is, what is the alternative? I can only speak for languages
> that I know (English, German, Finnish). For these I can - with a high
> degree of confidence - decide whether all glyphs have been included, but
> not for the rest. So, even IF (and that's not a given) we find a better
> source for our list of languages, who is to say that those lists of
> glyphs are complete and correct. Those lists could also contain too
> many/wrong glyphs.
>
> Unless we find a native speaker for each of our languages who can list
> us all glyphs (and even then, he/she can be mistaken, so we would
> probably need at least three independent native speakers for each
> language to get a reasonable degree of confidence), we will always have
> the problem that whatever source we use, it may be incorrect.
>
> My suggestion therefore is to go with the list we created and wait for
> the public (or IP) comments. If someone complains and tells us we missed
> a certain glyph, we of course have to and will add it.
>
> I fear we have to get to a conclusion in the near future. It's like
> writing a book: whenever you re-read it, you will most likely find
> another problem or something to improve. It's almost impossible to get
> it perfect. At some point you will have to decide whether you want to
> publish the book (even if not 100% perfect) or continue improving it
> until the end of days/ICANN. ;-)
>
> Considering the fact that we're all volunteering our time here, I'd
> rather come to a conclusion sooner rather than later. This does not mean that
> if we find an actual error we shouldn't fix it. I want to submit
> something that as far as we know is correct. However, we shouldn't spend
> too much time searching for more potential errors at this point.
>
> But that's of course only my personal opinion.
>
> We can talk about this later today and get other opinions.
>
> Michael
>