[CPWG] ME community statement about the ICANN Open Data Platform

Chokri Ben Romdhane chokribr at gmail.com
Fri Feb 4 13:36:06 UTC 2022


Thank you, Hadia, for your constant support.

Thank you, John, for the great point. I fully agree that a consensual
standard structure (format) for data reports may be adopted by contracted
parties in order to facilitate data exchange and/or integration between
systems.
Note that the current trend is to exchange data in JSON or XML format rather
than CSV.
Note also that, with the REST API, datasets can be downloaded locally for use
with any software development kit, or used remotely.
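
To illustrate the CSV versus JSON point, here is a minimal sketch using only the Python standard library. The sample report and its column names (`registrar_name`, `iana_id`, `total_domains`) are assumptions made up for illustration; they are not taken from the ODP.

```python
import csv
import io
import json

# Hypothetical sample; the column names are assumptions for illustration only.
SAMPLE_CSV = """registrar_name,iana_id,total_domains
Example Registrar A,9991,1200
Example Registrar B,9992,350
"""


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text into a JSON array of objects keyed by column header."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


json_text = csv_to_json(SAMPLE_CSV)
print(json_text)
```

Note that every value stays a string after this conversion; turning counts into numbers would need a schema agreed between the parties, which is the point of a common standard.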

Kind regards,
Chokri



On Thu, 3 Feb 2022 at 19:55, John McCormac via CPWG <cpwg at icann.org>
wrote:

> On 03/02/2022 15:26, Chokri Ben Romdhane via CPWG wrote:
> > Dear Friends,
> > During the ICANN72 ME space session
> > <https://72.schedule.icann.org/meetings/ir8CyynKdp3GwtbsY> , we
> > submitted a statement
> > <https://drive.google.com/file/d/1ZRqAXPrjcU1B9v_6ZwSdHoF-SgLjj_65/view>
> > to the board about the ICANN Open Data Platform, and we received the
> > following answers
> > <https://drive.google.com/file/d/1OoiqWDS7pkT_EplN5J_izfzSQj7aBJY-/view?usp=sharing>
> > from the Board.
>
> In the presentation given, I think that Ashwin Rangan may have been
> unaware of the issues with the ODP when it came to the per-registrar
> data. The problems with the per-registrar transactions were mainly that
> the importation of the CSV files into the ODP was not a simple process
> due to missing data, corrupted data and differing formats in the CSV files.
>
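
The kinds of problems described above (missing data, corrupted data, differing formats) can be caught with a few basic checks before import. This is only a rough sketch: the expected header and the sample rows are hypothetical, and real reports would need more checks than these.

```python
import csv
import io

# Assumed column layout, for illustration only.
EXPECTED_HEADER = ["registrar_name", "iana_id", "total_domains"]


def find_bad_rows(csv_text: str) -> list[tuple[int, str]]:
    """Return (row_position, reason) for each row that fails a basic check."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        return [(1, f"unexpected header: {header}")]
    problems = []
    # Data rows start at position 2 because the header is row 1.
    for position, row in enumerate(reader, start=2):
        if len(row) != len(EXPECTED_HEADER):
            problems.append((position, "wrong number of fields"))
        elif any(field.strip() == "" for field in row):
            problems.append((position, "missing value"))
        elif not row[2].strip().isdigit():
            problems.append((position, "non-numeric total_domains"))
    return problems


sample = "registrar_name,iana_id,total_domains\nA,1,10\nB,,20\nC,3,n/a\nD,4\n"
for position, reason in find_bad_rows(sample):
    print(position, reason)
```
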
> The limitation of the ODP in handling what are effectively trivial
> datasets is disturbing. With the expansion of the numbers of gTLDs and
> subsequent rounds, the ODP, with a limited dataset licence, would
> quickly be of limited value. That should have been immediately obvious
> to ICANN.
>
> The retention of CSVs in parallel with the ODP is the best strategy.
> This is because the CSV is a more robust format and errors are much
> easier to identify. This is how it was possible to identify the problems
> with the per-registrar data.
>
> There is a serious normalisation problem with the per-registrar data in
> that some registries have their own names for the registrars. The
> language of the column headers is a relatively simple issue to solve with
> a properly designed database schema, but I am not sure how the ODP could
> handle multiple languages. I tried subscribing to the ICANN ME mailing
> list after the presentation.
>
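
One possible way to approach the registrar-name normalisation problem is to key on a stable identifier rather than on the name. The sketch below assumes each row carries an IANA registrar ID alongside the name; the rows, field names and the "first name seen wins" rule are all hypothetical choices for illustration, not how the ODP works.

```python
# Hypothetical rows from two registries that spell the same registrar differently.
rows = [
    {"tld": "example1", "registrar_name": "Example Registrar, Inc.", "iana_id": "9991"},
    {"tld": "example2", "registrar_name": "EXAMPLE REGISTRAR INC", "iana_id": "9991"},
    {"tld": "example2", "registrar_name": "Other Registrar Ltd", "iana_id": "9992"},
]


def canonical_names(rows):
    """Pick the first name seen for each IANA ID as the canonical one."""
    names = {}
    for row in rows:
        names.setdefault(row["iana_id"], row["registrar_name"])
    return names


def normalise(rows):
    """Return copies of the rows with registrar_name replaced by the canonical name."""
    names = canonical_names(rows)
    return [{**row, "registrar_name": names[row["iana_id"]]} for row in rows]


for row in normalise(rows):
    print(row["tld"], row["iana_id"], row["registrar_name"])
```

In a database this would correspond to a separate registrar table keyed by ID, with the report rows referencing it.
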
> Though the ODP is a useful tool, it is lacking historical depth. Some of
> this is due to data formats and data being in PDF format (which varied
> from registry to registry) rather than CSV. I successfully
> reverse-engineered and extracted the data from most of these PDFs back
> to 2006 for some gTLDs to build a database of historical per-registrar
> transactions. It was an interesting exercise.
>
> The formatting in the PDFs varied. Some of the data (deletion figures)
> for .COM and .NET was missing from the per-registrar reports until
> Verisign adopted the new reporting format. There were some other data
> quality issues that have persisted. The .AFRICA per-registrar reports
> have been missing the new-adds and renews data and have been so since
> the gTLD launched. The latest (October 2021) report for the gTLD is
> still missing this data.
>
> The ODP offers a useful interface for dealing with the data but the best
> application would be one in Python, Ruby or another programming language
> to download datasets to be processed locally. The database schema for
> the per-registrar reports is standardised so it is easy enough to load
> this data into a database with a single statement. The schema for the
> other datasets is also available on the ODP, I think.
>
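
As a rough sketch of local processing in Python, the following loads a report into a database with one bulk insert. It uses SQLite from the standard library and an in-memory sample instead of a real download; the table name, column names and sample figures are assumptions for illustration, not the actual per-registrar report schema.

```python
import csv
import io
import sqlite3

# Hypothetical report contents; a real file would be downloaded first.
SAMPLE_CSV = """registrar_name,iana_id,total_domains
Example Registrar A,9991,1200
Example Registrar B,9992,350
"""


def load_report(conn: sqlite3.Connection, csv_text: str) -> int:
    """Create the table if needed, bulk-insert every CSV row, return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS registrar_report ("
        "registrar_name TEXT, iana_id INTEGER, total_domains INTEGER)"
    )
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Named placeholders are filled from each row's dictionary keys.
    conn.executemany(
        "INSERT INTO registrar_report "
        "VALUES (:registrar_name, :iana_id, :total_domains)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
print(load_report(conn, SAMPLE_CSV))
print(conn.execute("SELECT SUM(total_domains) FROM registrar_report").fetchone()[0])
```

Other database servers offer their own bulk-load statements for CSV files; the idea is the same once the column layout is standardised.
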
> Regards...jmcc
> --
> **********************************************************
> John McCormac  *  e-mail: jmcc at hosterstats.com
> MC2            *  web: http://www.hosterstats.com/
> 22 Viewmount   *  Domain Registrations Statistics
> Waterford      *  Domnomics - the business of domain names
> Ireland        *  https://amzn.to/2OPtEIO
> IE             *  Skype: hosterstats.com
> **********************************************************