[locale] The koi8 ranks keep growing...
Guten, as they say, Tag!
Recently I saw in cvs-commit@xfree86.org a mention that support had been
added for a certain charset "koi8-c", authored by our dear Pablo. At first I
thought it had something to do with old-cyrillic by Serge Winitzki, but no.
Anyway, I wrote to you-know-who, and here is his reply (.gif attached):
------- Forwarded Message Follows ------------------------------------
Subject: Re: Question regarding your koi8-c
Kaixo!
On Fri, Nov 03, 2000 at 01:00:01PM +0600, Dmitry Yu. Bolkhovityanov wrote:
> I've noticed inclusion of this encoding into XFree, and have a question:
> what is it? I.e., what is the target language/territory?
It includes all the chars of iso-8859-5 but in koi8 positions; it also
includes the Ukrainian ghe, and in the 0x80-0x9f range it includes various
letters needed by languages such as Tatar, Azeri, Tajik,...
When I named it I chose "c" for "Caucasus".
The goal is to have a charset that will cover the needs of those languages
as demands for support appear, while still being compatible with koi8-r, as
the Russian language is likely to also be used by those users (that allows
defining a preference like LANGUAGE=tg:ru to ask for messages in Tajik and,
if those are not available, in Russian).
> There are plenty of koi8-* encodings, and two of them are already called
> "koi8-c" -- Serge Winitzki's old-cyrillic (with yat, fita and izhitsa)
I was unaware of that.
Is "koi8-c" widely used to name that encoding?
> and koi8-f-compatible file in console tools (koi8c-8x16.psf).
No. That is "koi8-f" (the file name doesn't matter).
> There's been a big discussion on the "cyrfonts" mailing list concerning the
> growing tree (or zoo ;-) of koi8-* charsets, and now another one appears :-).
It is true that the mess with cp1251, iso-8859-5, koi8-r and koi8-u is
a pity, as it would have been possible to put all the chars they define
into a single charset.
My koi8-c, however, is another thing: it includes chars not included in any
other encoding (other than Unicode), and I needed a charset encoding in order
to start the support for those languages in Linux. I added various chars in
the hope that the encoding will cover all or most of the languages of the
area that are written using the Cyrillic alphabet.
I attach a gif showing the 0x80->0xff range of koi8-c (the positions
0x8d, 0x8f, 0x9d, 0x9f are respectively capital i with macron,
capital u with macron, small i with macron, small u with macron; the TTF font
used for the image doesn't have those Cyrillic chars...)
----------------------------------------------------------------------
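The LANGUAGE=tg:ru preference mentioned in the reply is a colon-separated
priority list of the kind GNU gettext reads. A minimal Python sketch of that
fallback order follows. The function names and the set of available catalogs
are hypothetical; this only illustrates the idea and is not how gettext itself
is implemented.

```python
import os


def language_preferences(environ=None):
    """Split a LANGUAGE-style value such as "tg:ru" into an ordered list."""
    env = os.environ if environ is None else environ
    value = env.get("LANGUAGE", "")
    # Empty fields (e.g. from "tg::ru") are skipped.
    return [code for code in value.split(":") if code]


def pick_catalog(preferences, available):
    """Return the first preferred language that has a catalog, else None."""
    for code in preferences:
        if code in available:
            return code
    return None


# Hypothetical situation: a program ships Russian and English messages only.
prefs = language_preferences({"LANGUAGE": "tg:ru"})
print(prefs, "->", pick_catalog(prefs, {"ru", "en"}))
```

With Tajik first and Russian second, a program that has no Tajik catalog falls
back to Russian, which is the behaviour the reply relies on.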
There is some resemblance to Cyrillic-asian, but nothing more: two letters
are missing, and there is only one free position (he forgot to remove U+2580
and U+2321), so this fellow cannot be talked into anything sensible.
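For reference, U+2580 and U+2321 are pseudographic characters that KOI8-R
places at 0x8B and 0x9B. The Python sketch below looks them up with the
standard koi8_r codec. That koi8-c keeps them at the same byte positions is an
assumption based on its stated koi8-r compatibility; the message does not
confirm it. The helper name is hypothetical.

```python
import unicodedata

# The two code points named in the message.
LEFTOVER_PSEUDOGRAPHICS = ["\u2580", "\u2321"]


def koi8r_position(char: str) -> int:
    """Return the byte value that KOI8-R assigns to a single character."""
    return char.encode("koi8_r")[0]


for char in LEFTOVER_PSEUDOGRAPHICS:
    print(
        f"U+{ord(char):04X} {unicodedata.name(char)} "
        f"-> 0x{koi8r_position(char):02X} in KOI8-R"
    )
```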
I did not immediately understand what he meant by "u with macron" etc., but
then I figured it out: he meant "[cyrillic] u with macron" etc., and these
characters are indeed absent from Lucida Sans Unicode (still the most popular
Unicode font, by the way).
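The four positions listed in the reply correspond to the Unicode code points
U+04E2, U+04EE, U+04E3 and U+04EF. The Python sketch below is a partial,
hypothetical decoder, written on the assumption that no ready-made koi8-c
codec is available: it knows only those four positions from the message,
passes ASCII through, and assumes the 0xC0-0xFF letter range matches koi8-r
(the compatibility the reply claims). Every other byte becomes U+FFFD rather
than being guessed.

```python
import unicodedata

# Byte positions taken from the forwarded reply; nothing else is assumed
# about the 0x80-0xBF range.
KOI8C_MACRON_POSITIONS = {
    0x8D: "\u04E2",  # CYRILLIC CAPITAL LETTER I WITH MACRON
    0x8F: "\u04EE",  # CYRILLIC CAPITAL LETTER U WITH MACRON
    0x9D: "\u04E3",  # CYRILLIC SMALL LETTER I WITH MACRON
    0x9F: "\u04EF",  # CYRILLIC SMALL LETTER U WITH MACRON
}


def decode_partial_koi8c(data: bytes) -> str:
    """Decode only the parts of koi8-c that the message lets us pin down."""
    out = []
    for byte in data:
        if byte in KOI8C_MACRON_POSITIONS:
            out.append(KOI8C_MACRON_POSITIONS[byte])
        elif byte < 0x80:
            out.append(chr(byte))  # ASCII half is unchanged
        elif byte >= 0xC0:
            # Assumption: Russian letters sit where koi8-r puts them.
            out.append(bytes([byte]).decode("koi8_r"))
        else:
            out.append("\ufffd")  # unknown position, do not guess
    return "".join(out)


for byte, char in sorted(KOI8C_MACRON_POSITIONS.items()):
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```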
I wonder: when certain people have ants in their pants to excess, is that
curable?
P.S. By the way, does Pablo happen to have a religious education? He (a
Spaniard :) should have been a missionary of the Holy Church in the Middle
Ages ;-)
___________________________________________________________________
Dmitry Yu. Bolkhovityanov | Novosibirsk, RUSSIA
phone (383-2)-39-49-56 | The Budker Institute of Nuclear Physics
| Lab. 5-13
---- File information -----------
File: KOI8C.GIF
Date: 4 Nov 2000, 10:00
Size: 17539 bytes.
Type: Binary