Re: Inverted 2 and inverted 3 in Unicode?
From: Tak To (takto_at_alum.mit.edu.-)
Date: 09/21/04
- Next message: Jacques Guy: "Re: Shofar--Dravidian?"
- Previous message: Rex F. May: "Re: Deeper relations of Chinese"
- In reply to: Harlan Messinger: "Re: Inverted 2 and inverted 3 in Unicode?"
- Next in thread: Jacques Guy: "Re: Inverted 2 and inverted 3 in Unicode?"
- Reply: Jacques Guy: "Re: Inverted 2 and inverted 3 in Unicode?"
- Reply: Harlan Messinger: "Re: Inverted 2 and inverted 3 in Unicode?"
- Messages sorted by: [ date ] [ thread ]
Date: Mon, 20 Sep 2004 20:11:24 -0400
Harlan Messinger wrote:
HM.2> Then I'm not sure what you're looking for. I thought you were
HM.2> trying to justify Unicode just listing radicals instead of
HM.2> characters, on the grounds that the characters can then just
HM.2> be defined as combinations of radicals.
I did not suggest that Unicode do anything. What I have said is that
(1) a Chinese character (<zi4>) can be defined as a combination of
radicals just as an English word is defined as a series of
letters;
(2) modern computer technology may very well be able to
lay out a Chinese character on the fly based on the series
of strokes/radicals using a "font" of radicals.
HM.2> I've demonstrated that it's not as simple as you assert.
I don't know what "simplicity" you think I have implied by
the above; therefore I don't understand your demonstration at
all. To wit, you merely said, "it is not that X [X=simple]", then
"prove me wrong"; and when I showed a general scheme, you said
"See? It is not that X". Perhaps it would help if you would
describe in more detail what you consider to be my assertion
of simplicity.
HM.2> I also don't understand *why*
HM.2> you would want to do it that way.
My point was that the "abstract character" in Unicode is based on
practical considerations rather than consistent logic -- e.g., some
"abstract characters" are atomic while others are composites.
----- -----
TT.1> Laying out a Chinese character
TT.1> is similar to laying out a web page, so I imagine a reasonable
TT.1> setup would be that there will be a description of the character
TT.1> representable by some sort of markup language, a default
TT.1> "style***" defining a set of reasonably aesthetic proportions
TT.1> (e.g., that the "four_dots" radical should be half as tall as
TT.1> the component above it), and a collection ("font") of (scalable)
TT.1> radicals and strokes. These three components are more or less
TT.1> independent of each other. In addition, there should probably
TT.1> be a provision to override some of the parameters at the generic
TT.1> radical, generic character, or individual (occurrence of)
TT.1> character level (e.g., via additional "stylesheets").
TT.1>
TT.1> A layout of the <hei> character will be something like:
TT.1>
TT.1> <obj name=#upper>
TT.1>   <obj name=#box1>
TT.1>     <obj name=#box>
TT.1>       <name=#b1 stroke=down>
TT.1>       <name=#b2 stroke=ne_corner beg=top(#b1) ratio=.5 >
TT.1>       < stroke=cross beg=bot(#b1) end=bot(#b2) >
TT.1>     </obj>
TT.1>     <stroke=\_dot loc=left( in(#box)) >
TT.1>     <stroke=/_dot loc=right(in(#box)) >
TT.1>   </obj>
TT.1>   <name=#v0 stroke=down beg=mid(top(#box1)) len=2*ht(#box) >
TT.1>   <name=#h1 stroke=cross mid=3/4*ht(#v0) wd=narrower(#h2) >
TT.1>   <name=#h2 stroke=cross mid=bot(#v0) wd=wider(#h1) >
TT.1> </obj>
TT.1> <stroke=four_dots loc=below(#upper) >
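To make the separation of description, style and font a little more concrete, here is a minimal Python sketch of the first two. It only stacks the two top-level parts of <hei> and applies the "four_dots is half as tall as the component above it" proportion. The names Box, HEI_PARTS, DEFAULT_STYLE and stack are all made up for this illustration, and the font of radicals is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box; y is the top edge and grows downward."""
    x: float
    y: float
    w: float
    h: float


# Description: the parts of <hei>, top to bottom (structure only).
HEI_PARTS = ["upper", "four_dots"]

# Style: height of a part relative to the part directly above it.
# Parts not listed default to a ratio of 1.0.
DEFAULT_STYLE = {"four_dots": 0.5}


def stack(parts, style, cell=Box(0.0, 0.0, 1.0, 1.0)):
    """Stack parts vertically inside cell, honoring the style ratios."""
    weights = []
    for i, name in enumerate(parts):
        ratio = style.get(name, 1.0)
        # The topmost part has nothing above it, so its weight is fixed at 1.
        weights.append(1.0 if i == 0 else weights[-1] * ratio)
    total = sum(weights)
    boxes, y = {}, cell.y
    for name, weight in zip(parts, weights):
        h = cell.h * weight / total
        boxes[name] = Box(cell.x, y, cell.w, h)
        y += h
    return boxes


default = stack(HEI_PARTS, DEFAULT_STYLE)
# An additional "stylesheet" overrides the default proportion.
squat = stack(HEI_PARTS, {**DEFAULT_STYLE, "four_dots": 0.25})
```

The description never mentions a number and the style never mentions a particular character, which is the sense in which the components are "more or less independent". Overriding at the per-character or per-occurrence level would amount to merging one more dictionary on top.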
HM.2> As the font variety demonstrates, this is vastly insufficient:
HM.2> You can't assume some fixed way in which characters will be
HM.2> built from their parts.
I am not sure what you consider to be "fixed" in my description above
and hence why it is "insufficient". For example, as indicated above,
#v0 should always end right at #h2, never passing through the latter
or not touching it. Is that something you considered to be "fixed"?
Note, however, that this is indeed observed by all the varieties in
your sample page. Nonetheless, this can still be overridden by using
a "font" whose glyph of a vertical stroke has a "visible length"
longer or shorter than its "logical length".
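As a rough illustration of the "visible length" versus "logical length" idea, here is a small Python sketch. The function visible_span and its overshoot parameters are hypothetical; the assumption is that a font records, for each stroke glyph, how far the drawn ink extends past (or stops short of) the logical endpoints, as a fraction of the logical length.

```python
def visible_span(beg, end, overshoot_beg=0.0, overshoot_end=0.0):
    """Return the drawn extent of a stroke whose logical extent is beg..end.

    Overshoots are fractions of the logical length; a negative value
    makes the drawn stroke stop short of its logical endpoint.
    """
    length = end - beg
    return (beg - overshoot_beg * length, end + overshoot_end * length)


# Logical description: #v0 runs from y=0.0 down to #h2 at y=2.0.
touching = visible_span(0.0, 2.0)                     # ends exactly at #h2
crossing = visible_span(0.0, 2.0, overshoot_end=0.1)  # pokes through #h2
short = visible_span(0.0, 2.0, overshoot_end=-0.1)    # stops above #h2
```

In all three cases the description of the character is unchanged; only the font's idea of how the vertical stroke is drawn differs.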
TT.1> Btw, this is the stroke order that kids are taught to write in.
HM.2> The fact that it will suffice to duplicate the way children
HM.2> are taught to write the characters in school is irrelevant.
I just mentioned that there is a canonical ordering of strokes
and radicals, that's all.
TT.1> In practice, the shape formed by the #box1 is probably common
TT.1> enough to be a radical by itself (box_with_two_dots); so are
TT.1> the two horizontal strokes (forming the character for "two").
TT.1> These two will be abstracted out as radicals[*]. The layout
TT.1> will be something like:
TT.1>
TT.1> <obj name=#upper>
TT.1>   <name=#box radical=box_with_two_dots >
TT.1>   <name=#two radical=two loc=below(#box) >
TT.1>   <stroke=down beg=mid_top(#box) end=mid_bot(#two) >
TT.1> </obj>
TT.1> <stroke=four_dots loc=below(#upper) >
TT.1>
TT.1> This would describe the layout of a generic <hei>.
TT.1>
TT.1> [*] Actually, <hei> is traditionally its own radical.
TT.1>
TT.1> Just as HTML is not the same as free-format (e.g., it does not
TT.1> allow laying out text along an arbitrary curve), this setup does
TT.1> not support all possible shapes for each character, but will
TT.1> be sufficient for most practical usage.
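The abstraction of common shapes into radicals can be thought of as macro expansion. The Python sketch below is only an illustration: the table RADICALS and the function expand are hypothetical, the stroke names are taken from the two layouts quoted above, and all positioning attributes (beg, end, loc and so on) are ignored so that only the inventory of strokes is compared.

```python
# Hypothetical radical table: each radical is a list of strokes or radicals.
RADICALS = {
    "box": ["down", "ne_corner", "cross"],
    "box_with_two_dots": ["box", "\\_dot", "/_dot"],
    "two": ["cross", "cross"],
}


def expand(items):
    """Recursively replace radical names by their strokes, preserving order."""
    strokes = []
    for item in items:
        if item in RADICALS:
            strokes.extend(expand(RADICALS[item]))
        else:
            strokes.append(item)
    return strokes


# First layout, written out stroke by stroke (four_dots kept as one unit).
HEI_STROKES = ["down", "ne_corner", "cross", "\\_dot", "/_dot",
               "down", "cross", "cross", "four_dots"]

# Second layout, using radicals.
HEI_RADICALS = ["box_with_two_dots", "two", "down", "four_dots"]

expanded = expand(HEI_RADICALS)
```

Here expanded contains the same strokes as HEI_STROKES, although the listing order differs because the radical version names "two" before the vertical stroke.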
HM.2> On the contrary, font designers shouldn't be limited by the
HM.2> children's construction-set version of a Unicode that you think
HM.2> is superior to the current one. Why should they be?
By "font designer", I am not sure whether you are referring to someone
designing radicals or someone who designs whole characters. In any
case, what "limitation" are you referring to? It is true that not
all images of Chinese characters can be analyzed as radicals, but
not all images of English words can be analyzed as a horizontal
juxtaposition of letters either. E.g.,
http://www.commarts.com/CA/exhibit_d/101402/images/mod.gif
http://www.paceprints.com/contemporary/indiana_r/images/indiana-love_280-006-000.jpg
So what is the big deal?
TT.1> Most of the samples of <hei> in your reference
TT.1> are too stylized to be practical fonts. They are more like dingbats.
HM.2> "Your examples disprove my theory, so I will self-servingly complain
HM.2> that they are too stylized and therefore don't count."
I am not sure if you have demonstrated anything. It was merely your
assertion that "things are not that simple", whatever that means.
I readily admit that I have not proved anything either -- I just
outlined a rough idea of how things can be done. And I have not
been inconsistent, if that's what you mean by "self-serving".
TT.1> Still, most of them can be constructed by the setup I described
TT.1> above.
HM.2> No, they can't.
Yes they can.
This is getting nowhere. :-(
Tak
--
----------------------------------------------------------------+-----
Tak To takto@alum.mit.eduxx
--------------------------------------------------------------------^^
[taode takto ~{LU5B~}] NB: trim the xx to get my real email addr