[vip] Suggested meta-questions to think about
Iliya Bazlyankov
iliya.bazlyankov at uninet.bg
Tue Jun 21 07:36:25 UTC 2011
Just a few comments:
>> A.2. Two different spellings of the same word in the same script and same language, like color/colour.
> This brings up the main issue I have with 'variants'. Again, my take is on Cyrillic. In my opinion, in Cyrillic the variants are label (word) based and not character based. In the Bulgarian language, there is a direct 1:1 relationship between how a word is pronounced and how it is spelled. The Cyrillic script was originally designed to have this attribute. Therefore, such cases are rare. Where they exist, they are well known.
>
> I understand it has been a perception for years that we look at character variants, but for many scripts, they rarely have any meaning. Perhaps we should review this in depth for each individual script and possibly treat different scripts differently.
>
In Bulgarian we have so-called doublet forms of words, just
like color and colour. They fall into many subgroups, from words
where only the accent changes (with no visual change, since in
Bulgarian the accent is written only in academic texts) to
etymological doublets, which share the same root but were accepted
into Bulgarian from different languages.
The most common doublet forms are the phonetic ones, but they are all
recorded in lists and, as Daniel says, they are well known. I seriously
doubt that any of them could be used as a TLD. One example is the word
for "lunch", which can be written as обед or обяд.
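To make the label-based versus character-based distinction concrete,
here is a minimal Python sketch. Everything in it is an assumption for
illustration: the list contents (only the обед/обяд pair comes from this
message), the function names, and the е -> я substitution rule are
hypothetical and do not describe any registry's actual policy.

```python
# Hypothetical doublet list: each group holds spellings of the same word.
# Only the обед/обяд pair is taken from the message; a real list would
# come from published Bulgarian doublet lists.
DOUBLET_GROUPS = [
    {"обед", "обяд"},
]


def label_variants(label):
    """Label-based lookup: return the other listed spellings of a whole word."""
    for group in DOUBLET_GROUPS:
        if label in group:
            return group - {label}
    return set()


def char_rule(label, old="е", new="я"):
    """Character-based rule (assumed for contrast): substitute in every label."""
    return label.replace(old, new)


if __name__ == "__main__":
    # The listed doublet is found by the label-based lookup.
    print(label_variants("обед"))
    # A label outside the list has no label-based variants...
    print(label_variants("тест"))
    # ...but the character rule still rewrites it into an unlisted string.
    print(char_rule("тест"))
```

The label-based lookup only ever returns spellings that appear in the
list, while the character rule rewrites any label containing the
character, whether or not the result is a known doublet.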
I think that if we start reviewing every possible doublet case, even
just in Bulgarian, we will miss the deadline. We therefore have to draw
the line somewhere and decide what we should actually take on and what
we should leave outside of the project.
>> A.3. Same word in the same language in two different scripts (bulgarian)
> There is only one script for Bulgarian: Cyrillic. I would love to learn your source for this information.
>
> There are however issues as you describe with Serbian, where both Latin and Cyrillic are official scripts and there are examples of distorted words written in both scripts (we are not looking into this), or the same word written in both scripts, which is very common. Probably with other languages/scripts as well.
>
Since 2006, Cyrillic has been the only official script in Serbia, but
Latin is widely used (even more than Cyrillic).
Iliya Bazlyankov
UNINET