[vip] Descriptive terminology

JFC Morfin jefsey at jefsey.com
Sat Sep 3 13:31:37 UTC 2011


At 13:08 03/09/2011, Cary Karp wrote:
>I urgently suggest that we expand our descriptive terminology with
>the term "homoglyph" to designate situations such as the one used in the
>Cyrillic/Latin illustration above.

No! :-)

Cary,

I fully understand your point and I support it. But not at the price 
of added confusion. Glyphs are definitely out of our scope.
Characters are signs that are graphed. The way they are graphed is 
not an issue for computer protocols and registries.

The problem we meet here is that we use Unicode/ISO 10646, which 
distinguishes between graphed signs on premises that are related 
neither to the sign nor to its graph. ISO 10646/Unicode has pros and 
cons. One of the cons is that it introduces confusion in the use of 
some signs. To address this "unisoconfusable" characters issue we need 
an anti-homographic canonicalization algorithm. This algorithm may be 
based on unigraph (graphic signs) or unisign (general semiotic) tables 
or correspondences, or on any other idea you might have.
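
As a rough illustration of what such a table-driven canonicalization 
could look like (a sketch only, not the IUse algorithm): every code 
point is mapped to one representative of its confusability class, and 
two labels are treated as confusable when their canonical forms are 
equal. The CANONICAL table, its five entries and the function names 
are assumptions made up for this example.

```python
# Hypothetical, hand-picked table: each code point maps to the chosen
# representative of its confusability class. A real table would be
# derived from data, not typed in by hand.
CANONICAL = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
}


def canonicalize(label, table=CANONICAL):
    """Replace every character by its class representative, if it has one."""
    return "".join(table.get(ch, ch) for ch in label)


def confusable(a, b, table=CANONICAL):
    """Two labels are confusable when their canonical forms are identical."""
    return canonicalize(a, table) == canonicalize(b, table)


if __name__ == "__main__":
    # Latin "paypal" versus the same word with two Cyrillic letters.
    print(confusable("paypal", "p\u0430yp\u0430l"))
```

The choice of the Latin letter as representative is arbitrary here; any 
member of a class would do, as long as the choice is stable.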

In the current IUse work, we start from 63,000+ 16x16 or 16x8 bitmaps 
in an Excel table. An immediate sort shows around 10,000 strictly 
unisoconfusable graphs (same bitmap). Our problem is to find a 
complete code point description table, fill it with bitmap 
representations, work on their positioning (for example, all of them 
locked in one of the four corners and centered), on comparabilities 
from human indications, etc., then come up with different tables 
corresponding to degrees of confusability and check the results 
against experimentation in real operations.
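
The "same bitmap" sort can be sketched as follows, assuming the bitmaps 
are available as a mapping from code point to raw bytes (this layout, 
the function names and the TOY data are assumptions for illustration, 
not the actual Excel table). The bit-difference count is only one 
possible basis for a "degree of confusability"; it is not something 
this work has settled on.

```python
from collections import defaultdict


def group_by_bitmap(bitmaps):
    """Group code points whose bitmaps are byte-for-byte identical.

    bitmaps: dict mapping a code point (int) to its bitmap (bytes).
    Returns the groups with more than one member, each sorted.
    """
    groups = defaultdict(list)
    for code_point, bitmap in bitmaps.items():
        groups[bitmap].append(code_point)
    return [sorted(cps) for cps in groups.values() if len(cps) > 1]


def hamming(a, b):
    """Count the pixels (bits) that differ between two equal-size bitmaps."""
    if len(a) != len(b):
        raise ValueError("bitmaps must have the same size")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Made-up two-byte placeholders, not real glyph data.
TOY = {
    0x0061: b"\x3c\x42",  # LATIN SMALL LETTER A
    0x0430: b"\x3c\x42",  # CYRILLIC SMALL LETTER A, same placeholder bitmap
    0x0062: b"\x40\x7c",  # LATIN SMALL LETTER B
}

if __name__ == "__main__":
    for group in group_by_bitmap(TOY):
        print([f"U+{cp:04X}" for cp in group])
```

Tables for looser degrees of confusability could then be built by 
grouping bitmaps whose difference stays under a chosen threshold, with 
the threshold tuned from the human indications mentioned above.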

Then the confusability algorithm should be amended from the 
experimentation inputs. Once we have obtained this, string 
confusability should be added through human inputs to IANA. This is 
why the happiana mailing list is concerned. The resulting registry 
may be quite large (and therefore significant in terms of traffic), 
and the registration/validation process will be an industry issue and 
probably a perpetual battle if confusables are not also displayed in 
a culturally appropriate manner.

This method should then also be applicable to check logo confusability, etc.

jfc  


