[vip] Suggested meta-questions to think about

Iliya Bazlyankov iliya.bazlyankov at uninet.bg
Tue Jun 21 07:36:25 UTC 2011


Just a few comments:

>> A.2. Two different spellings of the same word in the same script and same language, like color/colour.
> This brings up the main issue I have with 'variants'. Again, my take is on Cyrillic. In my opinion, in Cyrillic the variants are label (word) based and not character based. In the Bulgarian language, there is a direct 1:1 relationship between how a word is pronounced and how it is spelled. The Cyrillic script was originally designed to have this attribute. Therefore, such cases are rare. Where they exist, they are well known.
>
> I understand it has been a perception for years that we look at character variants, but for many scripts, they rarely have any meaning. Perhaps we should review this in depth for each individual script and possibly treat different scripts differently.
>
In the Bulgarian language we have so-called doublet forms of words, just 
like color and colour. They fall into many subgroups, from words 
where only the accent changes (with no visual change, since in 
Bulgarian the accent is written only in academic texts) to the 
etymological doublets, which have the same root but were adopted into 
Bulgarian from different languages.
The most common doublet forms are the phonetic ones, but they are all 
recorded in lists and, as Daniel says, they are well known, and I seriously 
doubt that any of them could be used as a TLD. One example is the word 
for "lunch", which can be written as обед or обяд.

I think that if we start reviewing all the possible doublet cases just in 
Bulgarian, we will miss the deadline. We therefore have to draw the 
line somewhere and decide what we should actually take on and 
what we should leave outside the project.
>> A.3. Same word in the same language in two different scripts (bulgarian)
> There is only one script for Bulgarian: Cyrillic. I would love to learn your source for this information.
>
> There are however issues as you describe with Serbian, where both Latin and Cyrillic are official scripts and there are examples of distorted words written in both scripts (we are not looking into this), or the same word written in both scripts, which is very common. Probably with other languages/scripts as well.
>
Since 2006, Cyrillic has been the only official script in Serbia, but 
Latin is widely used (even more than Cyrillic).

Iliya Bazlyankov
UNINET

