Fw: native encoding Font support in CJK PS printing
Hi!
Sometimes it's useful to know the Japanese view of things... ;)
And yes, the newsgroup is pretty good too.
-----Original Message-----
From: Masaki Katakai <katakai@sun.co.jp>
Newsgroups: netscape.public.mozilla.i18n
To: mozilla-i18n@mozilla.org <mozilla-i18n@mozilla.org>
Date: 25 ?? 2000 ? 19:34
Subject: native encoding Font support in CJK PS printing
Hi,
It seems that my original mail could not be sent to this
mailing list due to the large attachment(?). I'm very sorry.
I have prepared URLs for the examples and screen shots.
I'm working on native PostScript font and native encoding
support for Mozilla CJK printing. The code change is not
finished yet, but I'd like to explain my approach here and
hear your comments and suggestions.
Mozilla's CJK printing on Unix platforms currently doesn't work by just
clicking the Print button. The default behavior requires
post-processing with external filters and external TrueType fonts,
because the output uses Unicode code points and doesn't use native
encoding PostScript fonts. For users who have a native
PostScript printer, or who have installed native PostScript fonts
into ghostscript, it is very inconvenient to configure
(download the filter and large TrueType fonts, install the files,
etc.) and run the post-processing.
In Japan, Japanese PostScript printers are widely used. Also, the
ghostscript shipped with major Linux distributions already contains
Japanese native PostScript fonts. Most Japanese users would not need the
post-processing; they want to print just by clicking the Print
button. Native PostScript fonts are also
supported in Netscape 4.x, so Mozilla should support this feature.
Here is my approach, which is the same as Netscape 4.x, except that I am
trying to add Unicode CID font support. This shouldn't break the existing
external filters and, of course, shouldn't break ASCII printing. It
applies when the charset encoding of the document is
not "utf-8", such as "euc-jp" or "euc-kr".
* Native PS fonts and their encodings are defined in prefs.js
This takes the same approach as Netscape 4.x.
user_pref("print.psnativeencoding.<charset>","<target_charset>");
user_pref("print.psnativefont.<charset>","<fontname>");
For example, the prefs for Japanese euc-jp can be defined as
follows:
user_pref("print.psnativeencoding.euc-jp","euc-jp");
user_pref("print.psnativefont.euc-jp","Ryumin-Light-EUC-H");
* Continue to use the unicodeshow interface with Unicode code
points
This means the approach will not break existing filters
such as ttf2ps and wprint. Unicode code points can also be used
with Unicode based CID fonts, so I keep the code points as
Unicode.
* Unicode based CID fonts can be supported
Unicode based CID fonts are likely to be popular in CJK, so Mozilla
should support them. UniJIS CMaps are available for
Japanese, UniKS for Korean, and UniCNS and UniGB for
Chinese. When a printer or ghostscript doesn't have them, we
can easily install a Unicode based CMap on the target. The
Unicode code point can then be used with the CMap.
user_pref("print.psunicodefont.euc-jp","Ryumin-Light-UniJIS-UCS2-H");
* Code conversion for native encoding in Mozilla
Code conversion from Unicode to the native encoding (e.g. euc-jp)
is performed in Mozilla, which writes the conversion table into
the output as a dictionary.
/Unicode2NativeDict 0 dict def
Unicode2NativeDict 12363 42155 put % 0x304b -> 0xa4ab
The native code point can then be retrieved from the dictionary
by its Unicode code point at runtime:
Unicode2NativeDict 12363 get % get 42155
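The table entries above can be produced mechanically. The following is only a sketch in Python (the actual patch is C++ in nsPostScriptObj.cpp), assuming that Python's built-in euc_jp codec gives the same mapping as Mozilla's converter; the function name is hypothetical, and only two-byte native codes are handled.

```python
def unicode_to_native_entries(text, target_charset="euc_jp"):
    """Return PostScript 'put' lines mapping Unicode code points to native codes.

    Hypothetical helper for illustration; not taken from the Mozilla patch.
    """
    lines = []
    seen = set()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80 or cp in seen:
            continue  # ASCII needs no native CJK font; skip repeated characters
        seen.add(cp)
        try:
            native = ch.encode(target_charset)
        except UnicodeEncodeError:
            continue  # no native code; such characters fall through to other steps
        if len(native) != 2:
            continue  # this sketch only covers two-byte native codes
        code = int.from_bytes(native, "big")
        lines.append(
            f"Unicode2NativeDict {cp} {code} put % 0x{cp:04x} -> 0x{code:04x}"
        )
    return lines


if __name__ == "__main__":
    for line in unicode_to_native_entries("\u304b"):
        print(line)
```

For U+304B this prints the same entry as in the example above.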
I'd like to explain how unicodeshow draws the Japanese glyph
0x304b (EUC 0xa4ab). Mozilla will generate the following
lines.
...
/NativeFont /Ryumin-Light-EUC-H def
...
Unicode2NativeDict 12363 42155 put % 0x304b -> 0xa4ab
(\113\060) unicodeshow
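The operand of unicodeshow in this example appears to hold each code point as two octal-escaped bytes, low byte first (0x4b = \113, 0x30 = \060). A small Python sketch of that encoding follows; the byte order is inferred from the example only, and the function name is hypothetical.

```python
def unicodeshow_operand(text):
    """Build a '(...) unicodeshow' line for BMP characters.

    Byte order (low byte, then high byte) is inferred from the example
    '(\\113\\060) unicodeshow' for U+304B; treat it as an assumption.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            raise ValueError("this sketch only handles 16-bit code points")
        low, high = cp & 0xFF, cp >> 8
        # Each byte becomes a three-digit PostScript octal escape.
        parts.append(f"\\{low:03o}\\{high:03o}")
    return "(" + "".join(parts) + ") unicodeshow"


if __name__ == "__main__":
    print(unicodeshow_operand("\u304b"))
```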
The output is processed by the PostScript engine (printer or
ghostscript) as follows:
1. Check Unicodedict. If a rendering routine for the
character is defined there, use that routine for
rendering.
The Unicode code point 0x304b is used. In this case, Unicodedict
is not defined, so go to the next step.
2. When a Unicode based CID font (e.g. Ryumin-Light-UniJIS-UCS2-H)
is defined and can be used in the PostScript VM (printer or
ghostscript), use the CID font and the Unicode code point for
rendering.
Font availability can be checked at runtime.
The Unicode code point 0x304b is passed to show with the UCS2 based
CID font.
In this case, it is not defined, so go to the next step.
3. When a native PostScript font (e.g. Ryumin-Light-EUC-H) is
defined and can be used in the PostScript VM, use the native
PostScript font and the native code point for rendering.
The EUC code 0xa4ab for 0x304b is defined in
Unicode2NativeDict. The code conversion has already been performed
in Mozilla at print time, so it's easy to get the code from the
dictionary and use it.
Unicode2NativeDict 12363 get % get EUC 0xa4ab
In this example, both the native PostScript font and the code
point are defined, so the character is rendered by
<a4ab> show
with Ryumin-Light-EUC-H.
4. Draw a rectangle for an undefined character.
If the character isn't rendered in steps 1) - 3), we need to
draw the undefined glyph. I've fixed the "han" problem: all
undefined characters will be drawn as rectangles.
Please try the examples (see the URLs below) on an English PostScript
printer or English ghostscript; you will see rectangles.
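The four steps can be summarized as a small decision function. This Python sketch only models the order of the checks; in the proposal the checks run as PostScript inside the printer or ghostscript. The function name, parameter names and return values are all hypothetical.

```python
def pick_rendering(cp, unicodedict, unicode2native,
                   cid_font_available, native_font_available):
    """Model the fallback order for one Unicode code point.

    unicodedict:    code points that have a rendering routine (step 1)
    unicode2native: mapping of Unicode code point -> native code (step 3)
    Returns a (method, code) pair; names are illustrative only.
    """
    if cp in unicodedict:
        return ("unicodedict", cp)        # step 1: routine defined for this glyph
    if cid_font_available:
        return ("cid_font", cp)           # step 2: Unicode based CID font
    if native_font_available and cp in unicode2native:
        return ("native_font", unicode2native[cp])  # step 3: native code point
    return ("rectangle", None)            # step 4: undefined glyph


if __name__ == "__main__":
    # The case walked through above: no Unicodedict entry, no CID font,
    # native font available, 0x304b mapped to EUC 0xa4ab.
    method, code = pick_rendering(0x304B, set(), {0x304B: 0xA4AB}, False, True)
    print(method, hex(code))
```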
We need an extra preference to control the order of steps 1)
and 3). For example, in the Japanese locale, Mozilla will check
whether the native PostScript font is usable before rendering with
Unicodedict.
user_pref("print.psfontorder.euc-jp", 1);
// 0: check Unicodedict first
// 1: check native PostScript font first
With the changes above, Mozilla will be able to print Japanese
characters as well as Netscape 4.x does.
For the utf-8 charset, what do you think about the case below? We need
to consider carefully the case when the target encoding is "utf-8".
1. A Japanese user starts Mozilla in the Japanese locale.
2. A Japanese document (written in euc-jp, iso-2022-jp or shift-jis) can
be printed properly.
Yes, the native PostScript font is now supported.
3. A Japanese document written in utf-8 cannot be printed.
Users need to set up post-processing... It's inconvenient.
Most users don't want to care about the encoding of the target document,
because they started Mozilla in the Japanese locale.
Is that acceptable? I don't think so. I believe we should support
native PostScript fonts and native encoding even for
"utf-8". Even if the charset is "utf-8", Mozilla is running in the
Japanese locale, so can we assume the document is Japanese? I
suppose it is possible to define the following lines in the ja-JP
preferences. Note that these are only for ja-JP, and will be
in effect only when Mozilla is started in the Japanese locale.
user_pref("print.psnativeencoding.utf-8","euc-jp");
user_pref("print.psnativefont.utf-8","Ryumin-Light-EUC-H");
user_pref("print.psunicodefont.utf-8","Ryumin-Light-UniJIS-UCS2-H");
Also, ko-KR preferences would be the following,
user_pref("print.psnativeencoding.utf-8","euc-kr");
user_pref("print.psnativefont.utf-8","Haeseo-KSC-EUC-H");
user_pref("print.psunicodefont.utf-8","Haeseo-UniKS-UCS2-H");
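To illustrate how the per-charset prefs would be consulted, here is a Python sketch that looks up the three proposed pref names for a document charset. The pref names and values come from the examples above; the dictionary standing in for prefs.js and the function name are hypothetical.

```python
# Stand-in for a ja-JP prefs.js; values copied from the examples above.
JA_JP_PREFS = {
    "print.psnativeencoding.euc-jp": "euc-jp",
    "print.psnativefont.euc-jp": "Ryumin-Light-EUC-H",
    "print.psunicodefont.euc-jp": "Ryumin-Light-UniJIS-UCS2-H",
    "print.psnativeencoding.utf-8": "euc-jp",
    "print.psnativefont.utf-8": "Ryumin-Light-EUC-H",
    "print.psunicodefont.utf-8": "Ryumin-Light-UniJIS-UCS2-H",
}


def ps_font_settings(prefs, charset):
    """Return the PS font settings for a document charset (None if unset)."""
    key = charset.lower()
    return {
        "native_encoding": prefs.get(f"print.psnativeencoding.{key}"),
        "native_font": prefs.get(f"print.psnativefont.{key}"),
        "unicode_font": prefs.get(f"print.psunicodefont.{key}"),
    }


if __name__ == "__main__":
    # A utf-8 document printed in the ja-JP locale picks up the euc-jp font.
    print(ps_font_settings(JA_JP_PREFS, "utf-8"))
    # A charset with no prefs yields None values, i.e. no native font path.
    print(ps_font_settings(JA_JP_PREFS, "euc-kr"))
```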
It would be better if we could define the range of code points
that is valid for the specified font. Characters within the
range would be drawn with the specified font, and other characters
would be rendered by Unicodedict if it's defined.
user_pref("print.psunicoderange.utf-8","<from_code><to_code>,
<from_code><to_code>,....");
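The exact syntax of the range value is not fixed yet. Purely as an illustration, the Python sketch below assumes each item is two four-digit hexadecimal code points written back to back (for example "3040309f"), separated by commas. That format and both function names are assumptions, not part of the proposal.

```python
def parse_unicode_ranges(value):
    """Parse 'FFFFTTTT,FFFFTTTT,...' into a list of (from, to) pairs.

    The 4+4 hex digit item format is an assumption made for this sketch.
    """
    ranges = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty items such as a trailing comma
        if len(item) != 8:
            raise ValueError(f"expected 8 hex digits, got {item!r}")
        ranges.append((int(item[:4], 16), int(item[4:], 16)))
    return ranges


def in_ranges(cp, ranges):
    """True if the code point falls inside any (from, to) pair, inclusive."""
    return any(lo <= cp <= hi for lo, hi in ranges)


if __name__ == "__main__":
    # Hiragana and katakana blocks as an example value.
    ranges = parse_unicode_ranges("3040309f,30a030ff")
    print(in_ranges(0x304B, ranges))  # inside the first range
    print(in_ranges(0x4E00, ranges))  # outside both ranges
```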
Do you think this will work for your locale? Which common PS fonts
does your locale have in PS printers and ghostscript? Any comments
and suggestions would be appreciated.
Here are six files for your reference.
* http://village.infoweb.ne.jp/~katakai/mozilla/nsPostScriptObj.cpp.txt
Temporary patch for gfx/src/ps/nsPostScriptObj.cpp. Note that
the code change isn't finished: the psunicoderange feature
isn't implemented, and the charset name and font name are hardcoded.
It's just for reference.
Btw, does anyone know how to get the charset name (e.g.
euc-jp or utf-8) of the target document in gfx/src/ps/
or gfx/src/gtk/nsDeviceContextGTK.cpp?
* http://village.infoweb.ne.jp/~katakai/mozilla/yahoo_ja.ps
outputs of www.yahoo.co.jp
* http://village.infoweb.ne.jp/~katakai/mozilla/yahoo_ja.jpg
snapshot of ghostscript with japanese font
* http://village.infoweb.ne.jp/~katakai/mozilla/yahoo_ja_undef.jpg
snapshot of ghostscript without a Japanese font. Rectangles
are now drawn instead of "han".
* http://village.infoweb.ne.jp/~katakai/mozilla/yahoo_ko.ps
output for kr.yahoo.com. I assume Haeseo-KSC-EUC-H as the Korean
font
* http://village.infoweb.ne.jp/~katakai/mozilla/yahoo_ko.jpg
snapshot of ghostscript with korean font
Thanks,
Masaki