Re: Iroha in Kanji?



This thread is amusing. It is either a classic
example of people talking (writing) past each other,
or one of the most subtle mind-game scripts playing
out that I have encountered in some time.

If I understand correctly, the OP wants to write
some computer software that will offer a user
examples of available fonts that may be suitable
for the user's apparently intended purpose.

He wants the examples to be really useful, not
just the first x-many "characters" in the
character set (which are usually ASCII, as
most character sets are a superset of that).

OP explained what he wanted, in terms of
visions of ultimate utility, and responders
rightly said, WTF? That can't be done. And
why would you even try to do that?

And OP replied that "that" (what he really wants
to do) certainly can be done, and the exchange
goes on and on....

One responder mentioned a book on Japanese
hiragana, katakana, and kanji, and that might
actually address the issue for the OP.

There are a few books on CJK language processing
that OP might find of use. I don't remember
the titles or authors off hand, so I will
simply suggest a trip to the library or a
bit of googling.

But prior to that, I would suggest (strongly) OP
go to a good local library and find a copy of
Kenkyusha's New Japanese-English Dictionary,
Fourth Edition (not fifth) and examine page
xiii of the preface.

The symbols in the body of the table (excluding
the right-hand column) are the katakana symbols,
including combinations that must be available
in the font for special purposes, together with
romanization. In some cases, two forms of
romanization are included, depending on whether
one needs lossy or lossless conversions. There
are at least three other forms of romanization
that could be included, but for practical purposes
these should be ignored in the initial effort.
Adding them later would suffice.

For every katakana symbol or combination appearing
on that page, there is a corresponding hiragana
symbol or combination to be dealt with.
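
A minimal sketch of that correspondence, assuming the
text is Unicode: the katakana at U+30A1..U+30F6 sit
exactly 0x60 above their hiragana counterparts, so the
mapping is a fixed offset. The function name is my own
invention for illustration.

```python
# Katakana U+30A1..U+30F6 map onto hiragana U+3041..U+3096
# by subtracting a fixed offset of 0x60.
KATAKANA_FIRST = 0x30A1
KATAKANA_LAST = 0x30F6
KANA_OFFSET = 0x60


def katakana_to_hiragana(text: str) -> str:
    """Return text with each katakana replaced by its hiragana twin.

    Characters outside the mapped range (Latin letters, kanji, the
    long-vowel mark U+30FC, and so on) are passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if KATAKANA_FIRST <= cp <= KATAKANA_LAST:
            out.append(chr(cp - KANA_OFFSET))
        else:
            out.append(ch)
    return "".join(out)
```

Small combining forms are covered too, since the small
kana occupy matching positions in both blocks.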

Examples of the kanji characters should consist of
some arbitrary set of "typical" kanji displaying
features of the font, plus a presentation of all
kanji supported by the font. Selection of the small
preview set of kanji should depend on the font, and
show its virtues/demerits. This properly is a task
for those who created or will use the font.
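
As a rough sketch of the two kanji listings, assume the
software already has the set of code points the font
covers (how it gets that set, e.g. from the font's
character map, is outside this sketch). Both function
names and the "preferred" list are hypothetical; the
range used is only the basic CJK Unified Ideographs
block, U+4E00..U+9FFF.

```python
# Basic CJK Unified Ideographs block; extension blocks are ignored here.
CJK_FIRST = 0x4E00
CJK_LAST = 0x9FFF


def supported_kanji(covered):
    """Return the sorted code points in `covered` that are kanji."""
    return sorted(cp for cp in covered if CJK_FIRST <= cp <= CJK_LAST)


def preview_sample(covered, preferred, limit=8):
    """Pick up to `limit` characters from `preferred` that the font has.

    `preferred` is a string of showcase kanji chosen by whoever knows
    the font; order is preserved so the chooser controls the preview.
    """
    picked = [ch for ch in preferred if ord(ch) in covered]
    return "".join(picked[:limit])
```

The full listing and the short preview are kept separate
on purpose: the first is mechanical, the second depends
on a human-chosen list.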

For the OP's edification, mixing kanji, hiragana,
and katakana may be difficult for his purpose(s),
but this is exactly the way the Japanese language
appears in written form. If he wants the user
to inspect his examples and see language as it
normally appears, he will have to deal with it.
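
To give an idea of what "dealing with it" means in code,
here is a small sketch that splits mixed text into runs
by script, using Unicode block ranges only. The names are
mine, and the ranges are a simplification (basic kanji
block, no half-width katakana).

```python
def script_of(ch):
    """Classify one character by Unicode block (simplified)."""
    cp = ord(ch)
    if 0x3041 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    return "other"


def script_runs(text):
    """Group consecutive characters of the same script into runs."""
    runs = []
    for ch in text:
        script = script_of(ch)
        if runs and runs[-1][0] == script:
            # Extend the current run with this character.
            runs[-1] = (script, runs[-1][1] + ch)
        else:
            runs.append((script, ch))
    return runs
```

Run on the opening of the Iroha in its mixed form,
script_runs("色は匂へど") alternates between kanji and
hiragana runs, which is what ordinary Japanese text
looks like.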

The Japanese Iroha Uta appearing at
http://en.wikipedia.org/?title=Pangram
provides a first line containing exclusively
hiragana, and a second line that is encoded in
a mixture of kanji and hiragana as it would
normally be written in contemporary Japanese.
It does not address marking for voiced and
semivoiced variations nor does it address
combinations required for special purposes
(e.g. te-(subscript)i to create ti, because
the standard representation of ti is pronounced
as chi and may be romanized in that manner).
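
One way to check a sample text for that gap, sketched
with Python's standard unicodedata module: under NFD
normalization the voiced and semi-voiced kana decompose
into a base kana plus a combining mark (U+3099 or
U+309A). The function names are hypothetical, and only
the hiragana range U+3041..U+3096 is scanned.

```python
import unicodedata

DAKUTEN = "\u3099"     # combining voiced sound mark
HANDAKUTEN = "\u309A"  # combining semi-voiced sound mark


def voicing(ch):
    """Return 'voiced', 'semi-voiced', or 'plain' for one character."""
    decomposed = unicodedata.normalize("NFD", ch)
    if DAKUTEN in decomposed:
        return "voiced"
    if HANDAKUTEN in decomposed:
        return "semi-voiced"
    return "plain"


def missing_voiced_forms(sample):
    """Return the marked hiragana that do not occur in `sample`."""
    marked = {
        chr(cp)
        for cp in range(0x3041, 0x3097)
        if voicing(chr(cp)) != "plain"
    }
    return marked - set(sample)
```

A sample that uses only unmarked kana will come back
with every marked form missing, which is the signal that
it cannot serve as the only preview text.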

OP knows he is in over his head, but does
not really realize how deep the water is nor
how treacherous the undercurrents. My advice
would be that he ignore Japanese (and Korean)
in his first version of the software, and let
the Japanese and Koreans add the sections for
their languages if his approach warrants the
effort. Or, he may wish to find a native
speaker of the language who is a competent
programmer and is willing and able to work
with him.

Cheers!

jim b.

--
UNIX is not user-unfriendly; it merely
expects users to be computer-friendly.