The company where I used to work, Omron Corporation, was kind enough to allow me time and facilities to play with this stuff. Omron is a world leader in switches, relays, PLCs, etc., but is not well known by the non-Japanese consumer world. I have run into a few Omron cash registers in The States, and medical personnel tend to know the name.
For example, if you look at the HTML files, you'll find that they don't exist... all the ``files'' and directories you think you see exist only in the virtual world of a CGI I've written. Mmmm..., a virtual world within the virtual world of the Web... I guess it's a quasi-pseudo-neo-demi-reality! :-)
Besides doing the dictionary searches for you, the CGI massages the prose files (in a source format I call quasi-HTML) on the fly, adjusting things for your browser type, choice of language, Japanese support, favorite image size, etc.
SASE (Self-Addressed Stamped Envelope) that might be attached to the URL. In fact, if you add
?SASE=http://Your_Favorite_Place to the end of the URL that you use to contact the server, you'll have it placed in with the buttons at the bottom of the initial page. Neat!
All the images of Japanese text are produced on the fly by a perl routine that generates gifs (monochrome only, although they can be ``transparent'' as an option) from random bitmap data (where here ``random'' means ``from the font file'' :-) To reduce the load on the Web server machine, the CGI first tries to access a Japanese GIF Server running on a different system. This will hopefully keep things fast. If the gif server is down, the CGI will just go ahead and generate the text itself.
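The try-the-remote-server-first behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the server's actual perl code: the function names `render_japanese_text`, `fetch_remote` and `render_local` are hypothetical, and the two stub functions merely stand in for a real network call and a real bitmap-to-GIF routine.

```python
from typing import Callable


def render_japanese_text(
    text: str,
    fetch_remote: Callable[[str], bytes],
    render_local: Callable[[str], bytes],
) -> bytes:
    """Return image bytes for `text`, preferring the remote image server."""
    try:
        # First choice: let the other machine do the work.
        return fetch_remote(text)
    except OSError:
        # Remote server is down or unreachable: generate the image here.
        return render_local(text)


def remote_down(text: str) -> bytes:
    """Stub (assumption): simulates an unreachable GIF server."""
    raise ConnectionError("GIF server unreachable")


def remote_up(text: str) -> bytes:
    """Stub (assumption): simulates a working GIF server."""
    return b"REMOTE:" + text.encode("euc_jp")


def local_render(text: str) -> bytes:
    """Stub (assumption): stands in for the local font-to-GIF routine."""
    return b"LOCAL:" + text.encode("euc_jp")
```

With these stubs, `render_japanese_text("辞書", remote_down, local_render)` falls through to the local routine, while passing `remote_up` instead returns the remote result untouched.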
firstname.lastname@example.org) for the suggestion to add filename-like glob patterns, which he rightly points out most people know better than raw regular expressions. Many others have helped with bug reports and suggestions of all kinds.
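To show why glob patterns are a friendlier front end to regular expressions, here is a small sketch using Python's standard `fnmatch` module, which translates a filename-style pattern into a regex. It is an illustration of the general idea only; the server's own glob handling may well differ, and the helper name `glob_to_regex` and the word list are made up for the example.

```python
import fnmatch
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Compile a filename-style glob (``*``, ``?``, ``[abc]``) into a regex."""
    return re.compile(fnmatch.translate(pattern))


def glob_search(pattern: str, words: list) -> list:
    """Return the words that match the glob pattern in full."""
    regex = glob_to_regex(pattern)
    return [w for w in words if regex.match(w)]


# Hypothetical word list, for illustration only.
WORDS = ["dictionary", "diction", "edict", "kanjidic"]
```

Here `glob_search("*dic", WORDS)` picks out only the word ending in "dic", and `glob_search("?dic*", WORDS)` picks out only the word with exactly one character before "dic".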
email@example.com). This includes the main dictionary data, the name data (originally part of the main data), and the kanji data.
Jim's work was once copyrighted by him, but in 2000, he assigned the copyright to the Electronic Dictionary Research and Development Group. Consequently, you should be aware of the licence governing the data used at this site.
Many, many people have contributed to the quantity and quality of the main dictionary data, edict. This includes not only adding new entries, but correcting current ones. I personally have spent hundreds of hours over the last half dozen years working on scripts to check for errors, consistency, etc., besides adding numerous entries. And when it was at about the 100,000 entry level, Dr. Yo Tomita proofread the entire thing, fixing countless errors. These are just two examples -- others have expended even more energy. Please see edict's documentation file for a list of those that deserve your appreciation for making this data possible.
The kanji dictionary data, kanjidic, is also the collaborative effort of many. Coordinating and verifying the data has been done mostly by Jim and friends, but perhaps most of the work has been done behind the scenes. For example, in 1992 I spent three months of my life entering a whole host of data as I found it in Jack Halpern's ``New Japanese-English Character Dictionary''. Yet this mechanical work is nothing compared to the 18 years Jack spent creating his dictionary. The Korean readings, added by Jim in one quick moment in March 1996, were painstakingly developed over the course of 10 years of research by Charles Muller. These are just two examples from the credits found in kanjidic's documentation file.
In February '96, Jim split the data into
two files, the main dictionary which kept the name
name-only entries (which had swelled in recent months) to the new file
These files, as well as lots of other goodies, can be found at
Among the ``lots of other goodies''
you'll find jdic, Jim's own DOS
interface to edict and kanjidic. It allows file viewing and dictionary
searches on your local PC without the need for any other Japanese support
(and since it's local, is much faster than this server!).
There are also versions for DOS/Windows,
(2.1 k-byte blurb here), and the
Macintosh. The jdic family has quite a different look and feel
from this server, and if your platform supports it, you're encouraged to
try it. If you find you like the look and feel of this server, and are
on a Unix System, you might consider my lookup program.
If you'd like to see the good doctor, you can, as an 8.5 k-byte jpeg or a 76 k-byte gif of the same shot.
Note: please don't contact Jim with regard to this WWW server -
contact me (Comments appreciated) with questions or comments about this server.
Questions about the dictionary data, or the tools mentioned above, should
go to Jim (perhaps with a
CC to me, if you like).
Dictionary of Legal Terms
Also available at Monash are the
lawgloss.* files, produced by
the University of Washington School of Law, Asian Law Program (their
Copyright, 1995). I massaged the file, turned it into
format, corrected a number of errors, and added it to my server.
Here's Jim's note about
Dictionary of Life-Science Terms
Similarly, Monash has the
lifscdic.* files, produced by a
group of Japanese bioscientists from Bio-Net, led by Dr Shuji Kaneko of
Kyoto University. Here's Jim's note about
Dictionary of Four-Character Idiomatic Compounds
The first dictionary in a series of new additions to this server is a
compilation based upon a list created by Kanji Haitani. These are yojijukugo
or four-character compounds that have been singled out as commonly occurring.
Jim's commentary and Kanji Haitani's note about
Dictionary of Aviation Terms
This is another EDICT-formatted dictionary, prepared by Jim Breen from Ron Schei's
English/Japanese Aviation Dictionary. Jim's note about
aviation. This server is running a newer dictionary that was
recently proof-read by Teijo Kaakinen.
Dictionary of Computer Terms
This dictionary was compiled by Jim in 1997. It is a glossary of terms used
in the computing and telecommunications fields. Jim's detailed
note describing the sources used to create this file.
Dictionary of Compound Verbs
This is actually already included in the EDICT file, but is reproduced on this
server for convenient searching. Jim Breen created this file from the book
"Handbook of Japanese Compound Verbs" by Yoshiko Tagashira and
Jean Hoff (Hokuseido Press, 1986). Jim's note on
Dictionary of Concrete Terms
Gururaj Rao produced this file in the course of his translation work dealing
with technical reports mainly related to concrete and concrete structures.
Gururaj Rao's note about
Dictionary of Financial Terms
This file contains a listing of financial terms compiled by Kevin Seaver,
released to the Honyaku WWW page and converted to the EDICT format by Jim Breen.
Jim's note on
Dictionary of Geological Terms
This is a list of geology terms put together by Jim Breen, based on two sources.
Jim's note on
Dictionary of Japanese Place Names
This is Jim Breen's compilation of Japanese place names that were extracted
from the web pages of the Japanese Ministry of Posts and Telecommunications.
Jim's note about
Dictionary of Computational Linguistic Terms
This dictionary, compiled and maintained by Francis Bond, is a list of terms
used in theoretical and computational linguistics. Francis
Bond's note explains the history behind this compilation and how to get in
touch to contribute.
Dictionary of Marketing Terms
This is Jim Breen's compilation of Adam Rice's business and marketing terms.
Jim's note about
mktdic has more details
about the tags used to denote their origins in the Honyaku WWW pages.
Dictionary of Pulp and Paper Terms
This is Jim Minor's compilation of pulp and paper industry terms.
Jim's note about
Dictionary of Constellation Names
Raphael Garrouty's list of constellation names. There's no further
documentation on this file.
Dictionary of Engineering and Science Terms
This is a conversion of an original Macintosh text file that came as part of
a conference kit at an acoustics conference supplied to James Friend. He
passed this on to Jim for conversion. Jim's note about
Dictionary of Forestry Terms
Juan Manuel Cardona Granda provides a collection of forestry terms originating
from Japanese forestry journals. Mr Granda's note in
Dictionary of Environmental Terms
This is a glossary of environmental terms which frequently appear in Japanese
environmental reports, etc. Patrick Oblander's note about
Dictionary of River and Water Resources Terms
This is a compilation of the River and Water Resources Glossary produced
by the Infrastructure Development Institute of Japan. Jim Breen converted
the compilation into EDICT format. Jim's note about
Dictionary of Manufacturing Terms
This file was prepared by Jim Breen and is derived from several web pages
describing paper, molding and general manufacturing. Jim's
Dictionary of Buddhist Terms
Chuck Muller has assembled a large Digital Dictionary of Buddhism. Thanks to Jim Breen, the XML extract that Chuck
provided him was converted to EDICT-style format, so that it could be made
easily available to tools that could already deal with EDICT files.
Jim's note about
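The point of converting everything to EDICT-style format is that a single, simple reader can then handle every file. As a rough sketch, assuming the usual one-entry-per-line layout of `HEADWORD [READING] /gloss/gloss/` (with the bracketed reading optional), a minimal parser might look like this. The names `Entry` and `parse_edict_line` are hypothetical, and real files contain details (part-of-speech markers, header lines, and so on) that this sketch ignores.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    headword: str
    reading: Optional[str]
    glosses: list


# HEADWORD, optional [READING], then /gloss/gloss/.../
_LINE = re.compile(r"^(\S+)\s+(?:\[([^\]]+)\]\s+)?/(.*)/\s*$")


def parse_edict_line(line: str) -> Entry:
    """Split one EDICT-style line into headword, reading and glosses."""
    m = _LINE.match(line)
    if m is None:
        raise ValueError(f"not an EDICT-style line: {line!r}")
    headword, reading, body = m.groups()
    # Glosses are separated by slashes; drop any empty pieces.
    glosses = [g for g in body.split("/") if g]
    return Entry(headword, reading, glosses)
```

For example, `parse_edict_line("辞書 [じしょ] /dictionary/lexicon/")` gives the headword `辞書`, the reading `じしょ`, and the two glosses as a list.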
Dictionary of Chemical Terms
This dictionary is a compilation gathered from an online catalogue of chemicals
produced by Showa Chemicals Ltd. This catalogue is volume 28, consolidated
as of August 2002.
The original catalogue was produced in Adobe PDF format. Using Adobe Acrobat Distiller, the individual files were then saved as RTF documents, which were then read into Microsoft Word 2000. Using that word processor, they were then saved as encoded text, in Japanese EUC encoding. After that, a perl script was used to extract the contents and produce a rough draft of chemdict in files separated according to Showa Chemicals' index.
Final proofing was done manually.
William F. Maton, 20090321
William's note about
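The last step of the chemdict pipeline described above (EUC-encoded text in, rough EDICT-style draft out) can be sketched like this. The actual work was done with a perl script whose input layout isn't described here, so the tab-separated "Japanese name, English name" rows below are purely an assumption for illustration, as are the function names `to_edict_line` and `convert`.

```python
def to_edict_line(japanese: str, english: str) -> str:
    """Format one pair as a simple EDICT-style line (no reading field)."""
    return f"{japanese} /{english}/"


def convert(raw: bytes) -> list:
    """Decode EUC-JP bytes and turn tab-separated rows into EDICT-style lines.

    Assumed input layout (hypothetical): one row per line,
    Japanese name <TAB> English name. Rows without a tab are skipped.
    """
    text = raw.decode("euc_jp")
    lines = []
    for row in text.splitlines():
        if "\t" not in row:
            continue
        japanese, english = row.split("\t", 1)
        lines.append(to_edict_line(japanese.strip(), english.strip()))
    return lines


# A made-up sample row, encoded the way the saved text files would be.
SAMPLE = "酢酸\tacetic acid\nnot a data row\n".encode("euc_jp")
```

Calling `convert(SAMPLE)` yields a single line for the one well-formed row; anything that doesn't fit the assumed layout is left for the manual proofing pass.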
How about all those fonts?
I use the publicly-available ``kanji##.snf'' JISX0208-1983
fonts. Frankly, I don't know their origin, but I'm thankful to whoever
provided them. The COPYRIGHT says ``Public domain font. Share and enjoy''.
So I do.
The vertical fonts were generated by converting the SNF format files back to BDF on a Sun. Then, using Mark Leisher's gBDFEd editor, I created a vertically set version of each of these fonts, and converted them back to their SNF counterparts. I am grateful to Dr. Ken Lunde for his invaluable assistance in July 1997 in identifying which glyphs in JIS-X-0208 have vertical variants.
The font properties were then adjusted in gBDFEd to match those found in the original untouched SNF fonts used for JIS-X-0208.
The JIS-X-0212 fonts were created by obtaining the Sazanami open TrueType font produced by the EFont project, and using Mark Leisher's TTF2BDF program to extract BDF versions of the appropriate glyphs on a Linux machine. These were then converted into their SNF counterparts on a Sun using the bdftosnf program. The Sazanami font is copyrighted by Wada Lab, and here is a blurb about that.
The full command to convert the Sazanami TrueType fonts to their BDF counterparts, using a pixel size of 48, is:
ttf2bdf -c C -m JIS0212.TXT -rh 75 -rv 75 -p 48 -o test.bdf sazanami-20040629/sazanami-mincho.ttf
The JIS0212.TXT mapfile came from the Unicode Consortium FTP site, in the "obsolete" section. -wfms