[locale] Fw: unscientific charset survey
Hi!
JFYI:
It seems to me, though, that the data are not entirely clean: in newsgroups
(NNTP) Charset=Windows-1251 almost never appears, and the share of KOI8-R
is at its highest. As is well known, the www.fido7.ru gateway passes
only KOI8-R.
--
-=AV=-
-----Original Message-----
From: Erland Sommarskog <sommar@algonet.se>
To: usefor@rkive.landfield.com <usefor@rkive.landfield.com>
Date: 11 March 2001, 1:25
Subject: unscientific charset survey
This might be of some interest for this group:
From: "Eric A. Hall" <ehall@ehsco.com>
Newsgroups: comp.std.internat,comp.mail.mime,comp.mail.headers
Subject: unscientific charset survey
Date: 06 Mar 2001 17:46:21 GMT
Organization: EHS Company
Lines: 26
Message-ID: <3AA52268.F6A76652@ehsco.com>
NNTP-Posting-Host: 209.31.7.42
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.75 [en] (WinNT; U)
X-Accept-Language: en
I needed some charset distribution numbers and couldn't find any, so I
pointed a perl script at my ISP's news server.
4,024,487 messages were processed.
3,389,401 (84%) had no charset defined.
632,680 (16%) had legal charsets or aliases defined.
2,406 (.06%) had illegal charsets defined.
The following had more than 1,000 matches:
ASCII          400,291
ISO-8859-1     177,786
ISO-8859-2      25,704
KOI8-R          10,228
ISO-2022-JP      7,677
Windows-1252     4,718
BIG5             2,502
UTF-8            1,616
ISO-8859-15      1,064
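
For anyone who wants to repeat this kind of count, here is a minimal sketch of the tallying step. It is not the Perl script mentioned above, which is not shown in this post. It is written in Python, assumes the raw article texts have already been fetched from a news server by some other means, and the names charset_of, tally and the "(none)" label are hypothetical, chosen only for illustration. It does not check whether a charset name is a registered one or an alias.

```python
from collections import Counter
from email import message_from_string


def charset_of(raw_message):
    """Return the charset parameter of the Content-Type header, or None.

    get_content_charset() lowercases the value and returns None when the
    header or the charset parameter is missing.
    """
    msg = message_from_string(raw_message)
    return msg.get_content_charset()


def tally(raw_messages):
    """Count messages per declared charset (hypothetical helper).

    Messages without a declared charset are counted under "(none)".
    Charset names are uppercased so that "koi8-r" and "KOI8-R" match.
    """
    counts = Counter()
    for raw in raw_messages:
        charset = charset_of(raw)
        counts[charset.upper() if charset else "(none)"] += 1
    return counts
```

Calling tally() on a list of raw article strings returns a Counter, and its most_common() method lists the charsets in descending order of frequency, much like the table above.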
Raw data and charts at http://www.ehsco.com/opinion/20010305.html
--
Eric A. Hall http://www.ehsco.com/
Internet Core Protocols http://www.oreilly.com/catalog/coreprot/
--
Erland Sommarskog, Stockholm, sommar@algonet.se