Japanese Character Encoding
Note
If you're trying to get the Japanese to work with your browser on a
Windows system,
you'll want to check here.
To encode English characters on a computer, ASCII is commonly used.
For Japanese characters, there are three common encodings. But ``common'' only
in a certain sense, for most browsers don't support any form of Japanese.
Support isn't that difficult, technically, but in this English-centric
software world.... sigh.
If your browser does not support Japanese, this dictionary server can
still be useful. It will send Japanese text encoded as individual images...
a slow process, but one that effectively allows any viewer that supports GIF
images to ``support'' Japanese.
But it is slow, so you are encouraged to use a
WWW browser that supports Japanese, if possible.
Encoding Types
The three common encoding types are:
- JIS.
Japanese Industrial Standard. The JIS group is more or less the
Japanese version of ANSI (American National Standards Institute), with
a touch of UL (Underwriters Laboratories, a private corporation)
thrown in. The term ``JIS'' is used in the computing community to refer
to their encoding standard. It is probably the best for communication
purposes, as it's a 7-bit code more or less using escape sequences and
ASCII characters to encode the Japanese.
You are STRONGLY encouraged to use JIS if you can. In
particular, most browsers will recognize all three encoding types, but
only for JIS will they automatically switch to a Japanese font
when they need to.
- EUC.
Extended Unix Code. Pronounced ``Eee you see''. Actually, it is a subset
of a more widely-scoped (but under-implemented) method of encoding many
of the world's various languages. This server works with EUC internally.
EUC is more or less JIS without the escape sequences, and with the 8th bit
turned on in encoded bytes. The Japanese variant is sometimes written as EUC-JP.
- Shift-JIS.
Commonly known as Shit-JIS, this piece of Microsoft brain damage is
unfortunately the encoding commonly used on PCs.
- UNICODE.
Huh, I thought you said three? Well, Unicode is another standard,
developed by the Unicode Consortium, which is gaining popularity. Most
of the world's written languages can be represented with Unicode.
On the web, Unicode is usually encoded as UTF-8, which this server can
display.
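The claim that EUC is ``more or less JIS without the escape sequences, and the 8th bit turned on'' can be checked with a few lines of Python. What follows is only a rough sketch, not how this server does it: the function name jis_to_euc is made up for illustration, it assumes the input contains nothing but ASCII (or JIS-Roman, treated here as ASCII) and JIS X 0208 characters, and it relies on Python's standard iso2022_jp and euc_jp codecs for comparison.

```python
def jis_to_euc(data: bytes) -> bytes:
    """Convert 7-bit JIS (ISO-2022-JP) bytes to EUC-JP bytes.

    Hypothetical helper for illustration. Assumes only ASCII/JIS-Roman
    and JIS X 0208 text; half-width katakana and JIS X 0212 are not handled.
    """
    out = bytearray()
    two_byte = False  # True while inside a JIS X 0208 (kanji) run
    i = 0
    while i < len(data):
        if data[i] == 0x1B:  # ESC starts a three-byte escape sequence
            seq = data[i:i + 3]
            if seq in (b"\x1b$B", b"\x1b$@"):    # switch to JIS X 0208
                two_byte = True
            elif seq in (b"\x1b(B", b"\x1b(J"):  # switch to ASCII / JIS-Roman
                two_byte = False
            else:
                raise ValueError(f"unsupported escape sequence: {seq!r}")
            i += 3  # drop the escape sequence itself
        else:
            # Inside a kanji run, turn on the 8th bit; otherwise copy as-is.
            out.append(data[i] | 0x80 if two_byte else data[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    text = "JIS は 7-bit"
    jis = text.encode("iso2022_jp")
    print(jis)                     # escape sequences plus printable ASCII
    print(jis_to_euc(jis).hex())   # same characters with the high bit set
    print(jis_to_euc(jis) == text.encode("euc_jp"))
```

Shift-JIS cannot be derived this simply: its byte values come from an arithmetic rearrangement of the JIS codes rather than from setting one bit.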
Here are samples of each encoding you might use to see
if anything is supported on your system.
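If you would rather generate your own test data, the sketch below prints the bytes of one short string in each of the four encodings. It is an illustration only: the helper name sample_bytes and the test word are arbitrary choices, and the codec names are those of Python's standard library, not anything this server uses.

```python
# Python codec names for the encodings discussed above.
ENCODINGS = {
    "JIS": "iso2022_jp",
    "EUC": "euc_jp",
    "Shift-JIS": "shift_jis",
    "UTF-8": "utf-8",
}


def sample_bytes(text: str) -> dict:
    """Return the hex form of `text` in each encoding (hypothetical helper)."""
    return {name: text.encode(codec).hex() for name, codec in ENCODINGS.items()}


if __name__ == "__main__":
    for name, hexed in sample_bytes("日本語").items():
        print(f"{name:10} {hexed}")
    # JIS        1b2442467c4b5c386c1b2842
    # EUC        c6fccbdcb8ec
    # Shift-JIS  93fa967b8cea
    # UTF-8      e697a5e69cace8aa9e
```

Writing each byte string to a file and opening it in your browser is one way to see which encodings it handles.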
The last word on Japanese Encodings
To learn more than you ever wanted to know about all this stuff,
you want Ken Lunde's
Understanding Japanese Information Processing, its successor,
CJKV Information Processing, and
CJKV Information Processing, Second Edition, all published by O'Reilly and Associates.