<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Segoe UI Emoji";
panose-1:2 11 5 2 4 2 4 2 2 3;}
@font-face
{font-family:Candara;
panose-1:2 14 5 2 3 3 3 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black;}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle20
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body bgcolor="white" lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:windowtext">Dear Dr. Lehal, All,</span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">Thank you for sharing the updated LGR proposal for the Gurmukhi script. The Integration Panel (IP) is currently reviewing it and preparing its feedback document.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">In the meantime, the IP has run the proposed LGR against a corpus of Punjabi in Gurmukhi script; the test results are attached and summarized below. In the summary, the IP has flagged some cases in which the percentage of invalid labels is somewhat
high (in </span><span style="color:red">red </span><span style="color:windowtext">below). You can review the actual labels in the attached data file, which is marked up accordingly.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">The IP would like to share this data and the summary below with the NBGP so that the GP can reconfirm that the failing labels should indeed fail, and that the indicated rules are not too restrictive.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">We aim to share the IP feedback document next week. Please let us know if you have any questions.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">Regards,<br>
Sarmad<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">=============<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p><span style="font-family:"Candara",sans-serif">Corpus: <a href="https://github.com/unicode-org/unilex/tree/master/data/frequency">
https://github.com/unicode-org/unilex/tree/master/data/frequency</a></span><o:p></o:p></p>
<p><span style="font-family:"Candara",sans-serif">Full test results are attached.</span><o:p></o:p></p>
<p><span style="font-family:"Candara",sans-serif">A./</span><o:p></o:p></p>
<p><span style="font-family:"Candara",sans-serif">SUMMARY</span><o:p></o:p></p>
<p style="margin-bottom:12.0pt"><span style="font-family:"Candara",sans-serif"> Total Labels processed: 171388 of which<br>
valid labels: 163289<br>
invalid labels: 7391<br>
skipped labels: 708 of which<br>
duplicate labels: 21<br>
   broken labels: 11 </span><span style="font-family:"Candara",sans-serif;color:#6666CC"><-- rejected by the IDN library as not NFC or otherwise malformed</span><span style="font-family:"Candara",sans-serif"><br>
contain join controls: 287 </span><span style="font-family:"Candara",sans-serif;color:#6666CC"><-- are these stylistic or orthographic?</span><span style="font-family:"Candara",sans-serif"><br>
start w/ wrong script: 389 </span><span style="font-family:"Candara",sans-serif;color:#6666CC">(contamination)</span><o:p></o:p></p>
<p><span style="font-family:"Candara",sans-serif">Number of invalid labels by reason:<br>
4742 instances of not in repertoire<br>
173 instances of out-of-repertoire variant<br>
167 instances of invalid context (Follows-only-specific-V-or-M) 0.1%<br>
238 instances of invalid context (Follows-only-C-or-N-and-precedes-only-C2) 0.15%<br>
285 instances of invalid context (Follows-only-C-N-or-specific-V-or-M) 0.17%<br>
61 instances of invalid context (Follows-only-C1)<br>
833 instances of invalid context (Follows-only-C-or-N) </span><span style="font-family:"Candara",sans-serif;color:red"> 0.5%</span><span style="font-family:"Candara",sans-serif"><br>
892 instances of invalid context (Follows-only-C-N-or-specific-V-or-M-and-precedes-only-C3-or-specific-CN)
</span><span style="font-family:"Candara",sans-serif;color:red">0.6%</span><o:p></o:p></p>
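<p>For anyone who wants to reproduce the rough percentages, here is a minimal Python sketch. It assumes each percentage is the count for that reason divided by the total number of labels processed (171388); that denominator is an assumption, and since the quoted figures are only rough indications, small differences are to be expected. The names <code>TOTAL_LABELS</code>, <code>INVALID_BY_REASON</code> and <code>share_of_total</code> are illustrative and not part of any test tool.</p>

```python
# Counts copied from the summary in this message.
TOTAL_LABELS = 171388

INVALID_BY_REASON = {
    "not in repertoire": 4742,
    "out-of-repertoire variant": 173,
    "Follows-only-specific-V-or-M": 167,
    "Follows-only-C-or-N-and-precedes-only-C2": 238,
    "Follows-only-C-N-or-specific-V-or-M": 285,
    "Follows-only-C1": 61,
    "Follows-only-C-or-N": 833,
    "Follows-only-C-N-or-specific-V-or-M-and-precedes-only-C3-or-specific-CN": 892,
}


def share_of_total(count, total=TOTAL_LABELS):
    """Return count as a percentage of total (assumed denominator)."""
    return 100.0 * count / total


if __name__ == "__main__":
    # Print the reasons from most to least frequent.
    for reason, count in sorted(INVALID_BY_REASON.items(), key=lambda kv: -kv[1]):
        print(f"{count:5d}  {share_of_total(count):5.2f}%  {reason}")
    print("sum of reasons:", sum(INVALID_BY_REASON.values()))
```

<p>The per-reason counts add up to the 7391 invalid labels reported in the summary, which is a quick consistency check on the table.</p>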
<p><span style="font-family:"Candara",sans-serif;color:red">** rough indication of percentage; higher percentage failures may indicate either that certain typos are common or that</span><span style="font-family:"Candara",sans-serif"><br>
</span><span style="font-family:"Candara",sans-serif;color:red">** a rule is too restrictive. The following example shows some of the contexts detected for one of the rules - for more detail<br>
** and actual labels see attached.</span><o:p></o:p></p>
<p> Contexts not matching rule "Follows-only-C-or-N":<br>
[:Bindi:] <span style="font-family:"Segoe UI Emoji",sans-serif">⚓</span>=[:Matra:]<br>
[:Matra:] <span style="font-family:"Segoe UI Emoji",sans-serif">⚓</span>=[:Matra:]<br>
[:Tippi:] <span style="font-family:"Segoe UI Emoji",sans-serif">⚓</span>=[:Matra:]<br>
[:Vowel:] <span style="font-family:"Segoe UI Emoji",sans-serif">⚓</span>=[:Matra:]<br>
<span style="font-family:"Candara",sans-serif"><br>
<br>
<b>Test Label Coverage:</b><br>
Repertoire (code points): 56 of 56. {0A02 0A05-0A0A 0A0F-0A10 0A13-0A28 0A2A-0A30 0A32 0A35 0A38-0A39 0A3C 0A3E-0A42 0A47-0A48 0A4B-...}<br>
Repertoire not covered: 0 of 56. {}<br>
Out of Repertoire: 80. [{0027 002E 0030-003A 0061-0062 0064-0065 0067 0069-006A 006C 0070 0073 0075 0078 00E0 00E2 00ED-00EE 0901-0902 0906-0909 090F 0913 0915-0918 091A-091D 091F-0924 0926-0928 092A 092C-0930 0932 0935-0939 093C 093E-0942 0947-0948
094B 094D </span><span style="font-family:"Candara",sans-serif;color:red">0A6B</span><span style="font-family:"Candara",sans-serif">
</span><span style="font-family:"Candara",sans-serif;color:red">0A72-0A74</span><span style="font-family:"Candara",sans-serif">}]
</span><span style="font-family:"Candara",sans-serif;color:#6633FF"><-- excluded code points highlighted</span><o:p></o:p></p>
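<p>To make the "Follows-only-C-or-N" contexts listed above easier to check by hand, here is a rough Python sketch of how such a check might look. It is not the LGR's rule or the test tool's code. It assumes the rule applies to a Matra and requires the immediately preceding code point to be a consonant or the nukta. The class sets are hypothetical, built from the visible part of the repertoire listing, and are likely incomplete; the function name <code>matra_context_failures</code> is likewise illustrative.</p>

```python
# Hypothetical class sets, NOT the LGR's actual definitions.
# They cover only code points visible in the (truncated) repertoire listing.
CONSONANT = (
    set(range(0x0A15, 0x0A29))          # 0A15-0A28
    | set(range(0x0A2A, 0x0A31))        # 0A2A-0A30
    | {0x0A32, 0x0A35, 0x0A38, 0x0A39}
)
NUKTA = {0x0A3C}
MATRA = set(range(0x0A3E, 0x0A43)) | {0x0A47, 0x0A48, 0x0A4B}


def matra_context_failures(label):
    """Return the indexes of Matra code points that do not directly
    follow a consonant or nukta (assumed reading of the rule)."""
    allowed_before = CONSONANT | NUKTA
    failures = []
    for i, ch in enumerate(label):
        if ord(ch) not in MATRA:
            continue
        # A Matra at the start of the label has no preceding code point.
        prev = ord(label[i - 1]) if i else None
        if prev not in allowed_before:
            failures.append(i)
    return failures


if __name__ == "__main__":
    samples = {
        "consonant + matra": "\u0A15\u0A3E",
        "consonant + nukta + matra": "\u0A15\u0A3C\u0A3E",
        "matra + matra": "\u0A15\u0A3E\u0A3F",
        "vowel + matra": "\u0A05\u0A3E",
    }
    for name, label in samples.items():
        print(name, matra_context_failures(label))
```

<p>Under these assumptions, the "matra + matra" and "vowel + matra" samples correspond to the [:Matra:] and [:Vowel:] contexts in the list, which is the kind of sequence the GP is asked to reconfirm as invalid.</p>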
<p><span style="font-family:"Candara",sans-serif">Tag Values: 12 of 12.<br>
Addak<br>
Bindi<br>
C1<br>
C2<br>
Consonant<br>
M1<br>
Matra<br>
Nukta<br>
Tippi<br>
V1<br>
Virama<br>
Vowel<br>
Named Classes: 13 of 13.<br>
A<br>
B<br>
C<br>
C1<br>
C2<br>
C3<br>
M<br>
M1<br>
M2<br>
N<br>
V<br>
V1<br>
V2</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-family:"Candara",sans-serif">Context Rules matched: 6 of 6.</span><o:p></o:p></p>
<p><span style="font-family:"Candara",sans-serif"> Follows-only-C-or-N-and-precedes-only-C2<br>
Follows-only-C-or-N<br>
Follows-only-specific-V-or-M<br>
Follows-only-C-N-or-specific-V-or-M-and-precedes-only-C3-or-specific-CN<br>
Follows-only-C1<br>
Follows-only-C-N-or-specific-V-or-M</span><o:p></o:p></p>
<p><span style="font-family:"Candara",sans-serif">Context Rules failed: 6 of 6.<br>
Follows-only-C-N-or-specific-V-or-M-and-precedes-only-C3-or-specific-CN<br>
Follows-only-specific-V-or-M<br>
Follows-only-C-or-N<br>
Follows-only-C-or-N-and-precedes-only-C2<br>
Follows-only-C-N-or-specific-V-or-M<br>
Follows-only-C1</span><o:p></o:p></p>
<p><span style="font-family:"Candara",sans-serif">When Rules defined: (required context)<br>
Follows-only-specific-V-or-M<br>
Follows-only-C1<br>
Follows-only-C-or-N<br>
Follows-only-C-or-N-and-precedes-only-C2<br>
Follows-only-C-N-or-specific-V-or-M<br>
Follows-only-C-N-or-specific-V-or-M-and-precedes-only-C3-or-specific-CN</span><o:p></o:p></p>
<p><span style="font-family:"Candara",sans-serif">Not-When Rules defined: (prohibited context)<br>
(none)</span><o:p></o:p></p>
<p><o:p> </o:p></p>
</div>
</body>
</html>