Re: Koohaku Uta Gassen



jwb@xxxxxxxxxxxxxxxxxx wrote:

The Wanderer <inverseparadox@xxxxxxxxxxx> dixit:

Personally, I'd advocate using a strict "one kana, one romaji
representation; one romaji representation, one kana" romanization
scheme for any place where exact matching is important,

I've been advocating that for decades. And using such a one too.

I count myself pleasantly surprised. ^_^

but unfortunately the only such scheme I know of offhand (which can
be written entirely in characters available on the standard English
keyboard) is the one I invented and use myself - which I would not
remotely recommend putting into general use, despite how well it
works for me.

The ワープロローマ字 I use to transliterate names in ENAMDICT is such a one, and it is completely round-trip-safe. It is basically a unique ローマ字 pattern for every kana: とう->tou, とお->too, etc. ぢ->dji, づ->dzu. I use _'_ for things like こんやく/kon'yaku.
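
As a rough illustration of what "round-trip-safe" means here, the following is a minimal sketch and not the actual ENAMDICT tooling. Only the pairs named above (とう->tou, とお->too, ぢ->dji, づ->dzu, こんやく->kon'yaku) come from the post; the rest of the table, the rule for when to emit the apostrophe, and the function names to_romaji and to_kana are assumptions made for the example.

```python
# Hypothetical subset of a one-kana-one-romaji table. Only dji, dzu, tou/too
# and kon'yaku are taken from the post; every other entry is an assumption.
KANA_TO_ROMAJI = {
    "え": "e", "お": "o", "う": "u",
    "き": "ki", "く": "ku", "こ": "ko",
    "と": "to", "ね": "ne", "や": "ya",
    "じ": "ji", "ず": "zu",
    "ぢ": "dji", "づ": "dzu",
    "ん": "n",
}

# Build the reverse table and check that no two kana share a spelling.
ROMAJI_TO_KANA = {r: k for k, r in KANA_TO_ROMAJI.items()}
assert len(ROMAJI_TO_KANA) == len(KANA_TO_ROMAJI)
ROMAJI_TO_KANA["n'"] = "ん"  # apostrophe form of ん before a vowel or y
MAX_LEN = max(len(r) for r in ROMAJI_TO_KANA)


def to_romaji(kana: str) -> str:
    """Romanize kana one character at a time."""
    out = []
    for i, ch in enumerate(kana):
        r = KANA_TO_ROMAJI[ch]
        # Assumed rule: write ん as n' when the next kana starts with a
        # vowel or y, so that きんえん and きねん stay distinct.
        if ch == "ん" and i + 1 < len(kana):
            if KANA_TO_ROMAJI[kana[i + 1]][0] in "aiueoy":
                r = "n'"
        out.append(r)
    return "".join(out)


def to_kana(romaji: str) -> str:
    """Convert romaji back to kana by longest match at each position."""
    out = []
    i = 0
    while i < len(romaji):
        for size in range(MAX_LEN, 0, -1):
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"cannot decode {romaji[i:]!r}")
    return "".join(out)


for word in ["とう", "とお", "ぢ", "づ", "こんやく", "きんえん", "きねん"]:
    print(word, "->", to_romaji(word), "->", to_kana(to_romaji(word)))
```

The last two words show why the apostrophe matters: without it, きんえん and きねん would both come out as "kinen" and the round trip would fail.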

And how does it account for things like "small" characters, i.e. the small-tsu "vowel extenders"? I've never come across *any* romanization scheme which had a consistent, easily comprehensible means of representing those, other than my own, and that feature of my own is probably its clunkiest aspect.

I strongly suspect that my own system will bear a significant
resemblance to yours, with a few exceptions. I use "ji" instead of "dji"
for ぢ, and "n^" instead of "n'" for ん (and I use it regardless of
whether or not the situation would otherwise be ambiguous - as I said,
strict one-to-one) because by the time I realized that need I was
already using the apostrophe as convenient shorthand for the small-tsu
extenders; I represent "small" characters by enclosing each one in
reverse angle brackets, i.e., ">tsu<" or ">ya<". I use a parallel system
in all caps to represent katakana. (For the little it's worth, I also
represent kanji by enclosing the context-appropriate reading for each
one in square brackets.)
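
For concreteness, here is a minimal sketch of the conventions just described, covering only a handful of kana. The "n^" for ん, the reverse angle brackets around small characters, and the all-caps katakana come from the description above; the table entries, the choice to upper-case the bracketed forms too, and the names encode and decode are assumptions for the example. Kanji brackets are left out.

```python
# Hypothetical subset table; only "n^", ">tsu<" and ">ya<" are from the post.
HIRAGANA_TO_ROMAJI = {
    "か": "ka", "き": "ki", "く": "ku",
    "て": "te", "つ": "tsu", "や": "ya",
    "ん": "n^",      # always written n^, ambiguous context or not
    "っ": ">tsu<",   # small characters go in reverse angle brackets
    "ゃ": ">ya<",
}

# In Unicode, each katakana sits 0x60 above the matching hiragana.
KATAKANA_OFFSET = 0x60


def is_katakana(ch: str) -> bool:
    return 0x30A1 <= ord(ch) <= 0x30F6


def encode(text: str) -> str:
    """Hiragana becomes lower case, katakana the same spelling in caps."""
    out = []
    for ch in text:
        if is_katakana(ch):
            hira = chr(ord(ch) - KATAKANA_OFFSET)
            out.append(HIRAGANA_TO_ROMAJI[hira].upper())
        else:
            out.append(HIRAGANA_TO_ROMAJI[ch])
    return "".join(out)


# Reverse table: lower-case tokens map to hiragana, upper-case to katakana.
DECODE = {}
for hira, rom in HIRAGANA_TO_ROMAJI.items():
    DECODE[rom] = hira
    DECODE[rom.upper()] = chr(ord(hira) + KATAKANA_OFFSET)
MAX_LEN = max(len(tok) for tok in DECODE)


def decode(romaji: str) -> str:
    """Longest-match decoding back to kana."""
    out = []
    i = 0
    while i < len(romaji):
        for size in range(MAX_LEN, 0, -1):
            chunk = romaji[i:i + size]
            if chunk in DECODE:
                out.append(DECODE[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"cannot decode {romaji[i:]!r}")
    return "".join(out)


for word in ["きって", "きゃく", "かんき", "キック"]:
    print(word, "->", encode(word), "->", decode(encode(word)))
```

Because the bracketed tokens start with ">" and ん always carries its "^", no token in this subset is a prefix-ambiguous spelling of another, which is what keeps the decoding one-to-one.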

I think that's an exhaustive description... the system is not quite
perfect (it doesn't have any way to represent accent marks on characters
that normally don't get them, such as the bare vowels or (random
example) や, which I've seen used in manga to denote certain forms of
vocal intensity), but it's worked quite well so far.

(Random question I've been meaning to ask, but haven't worked up the
guts to start a new thread over: how do the Japanese refer to what I
just described above as "accent marks", and how do they refer to the
relationship between, e.g., the characters か and が?)

--
      The Wanderer

Warning: Simply because I argue an issue does not mean I agree with any
side of it.

Secrecy is the beginning of tyranny.