Re: Inverted 2 and inverted 3 in Unicode?
From: Harlan Messinger (h.messinger_at_comcast.net)
Date: 09/17/04
- Next message: Ruud Harmsen: "Re: History of French"
- Previous message: Ruud Harmsen: "Re: History of French"
- In reply to: Tak To: "Re: Inverted 2 and inverted 3 in Unicode?"
- Next in thread: Tak To: "Re: Inverted 2 and inverted 3 in Unicode?"
- Reply: Tak To: "Re: Inverted 2 and inverted 3 in Unicode?"
- Messages sorted by: [ date ] [ thread ]
Date: Fri, 17 Sep 2004 16:11:21 -0400
"Tak To" <takto@alum.mit.edu.-> wrote in message
news:Z-2dnRn0Wp3SgdbcRVn-uA@comcast.com...
> Tak To wrote:
> TT.1> Likewise for hangul blocks. It is not too far-fetched to imagine that
> TT.1> computer equipment in the future can compose "on the fly" Chinese
> TT.1> "characters" (with a font of "radicals"),
>
> Harlan Messinger wrote:
> HM.2> A table that gave all the rules to combine radicals (including
> HM.2> positioning information, sizing information, modification information,
> HM.2> etc., for each radical) would be entirely font-specific, not
> HM.2> necessarily generalizable from character to character, and therefore
> HM.2> at least as complicated as a table of all the characters, so why would
> HM.2> you do that?
>
> TT.3> If by "rules" you mean detailed inter-radical spacing information then
>
> HM.4> yes;
>
> TT.3> if you mean the actual composition of radicals and rough relative position
> TT.3> then no. For example, the character <lin2> 林 U+6797 ("forest") could be
> TT.3> encoded as <wood radical><right><wood radical> regardless of which
> TT.3> font is being used. Stroke order (and thus radical order) is actually
> TT.3> standardized and taught to school children in PRC and Taiwan alike.
>
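The <wood radical><right><wood radical> idea quoted above is close to what Unicode calls an Ideographic Description Sequence, where U+2FF0 marks a left-to-right arrangement of the two components that follow it. A minimal Python sketch, assuming that operator is an acceptable stand-in for the "<right>" in the quoted proposal (the helper name describe_left_right is hypothetical, and such a sequence only describes a character; it is not rendered as the single glyph 林):

```python
import unicodedata

# U+2FF0 is the Ideographic Description Character for a left-to-right layout.
IDC_LEFT_TO_RIGHT = "\u2ff0"
WOOD = "\u6728"    # 木, the "wood" component
FOREST = "\u6797"  # 林, the character being described


def describe_left_right(left: str, right: str) -> str:
    """Hypothetical helper: build a description sequence 'left beside right'."""
    return IDC_LEFT_TO_RIGHT + left + right


ids = describe_left_right(WOOD, WOOD)
code_points = [f"U+{ord(c):04X}" for c in ids]

print(ids, code_points)
print(unicodedata.name(IDC_LEFT_TO_RIGHT))
```

The sequence says nothing about proportions or spacing, which is exactly the font-specific information the two posters disagree about.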
> HM.4> What method of defining all characters in terms of their components, no
> HM.4> matter how much sizing and positioning information were to be provided,
> HM.4> would suffice for the character for Mandarin hei(4) ("black") to be rendered
> HM.4> correctly in all the different fonts shown in Figure 12-4 at
> HM.4> http://www.newworldcider.com/COMM_590/article1.htm
> HM.4> ?
>
> I am not sure what you are looking for.
Then I'm not sure what you're looking for. I thought you were trying to
justify Unicode just listing radicals instead of characters, on the grounds
that the characters can then just be defined as combinations of radicals.
I've demonstrated that it's not as simple as you assert. I also don't
understand *why* you would want to do it that way.
> Laying out a Chinese character
> is similar to laying out a web page, so I imagine a reasonable setup would
> be that there will be a description of the character representable by some
> sort of markup language, a default "style***" defining a set of
> reasonably aesthetic proportions (e.g., that the "four_dots" radical
> should be half as tall as the component above it), and a collection ("font")
> of (scalable) radicals and strokes. These three components are more
> or less independent of each other. In addition, there should probably be
> a provision to override some of the parameters at the generic radical,
> generic character, or individual (occurrence of) character level (e.g.,
> via additional "stylesheets").
>
> A layout of the <hei> character will be something like:
>
> <obj name=#upper>
> <obj name=#box1>
> <obj name=#box>
> <name=#b1 stroke=down>
> <name=#b2 stroke=ne_corner beg=top(#b1) ratio=.5 >
> < stroke=cross beg=bot(#b1) end=bot(#b2) >
> </obj>
> <stroke=\_dot loc=left( in(#box)) >
> <stroke=/_dot loc=right(in(#box)) >
> </obj>
> <name=#v0 stroke=down beg=mid(top(#box1)) len=2*ht(#box) >
> <name=#h1 stroke=cross mid=3/4*ht(#v0) wd=narrower(#h2) >
> <name=#h2 stroke=cross mid=bot(#v0) wd=wider(#h1) >
> </obj>
> <stroke=four_dots loc=below(#upper) >
>
> Btw, this is the stroke order that kids are taught to write in.
As the font variety demonstrates, this is vastly insufficient: You can't
assume some fixed way in which characters will be built from their parts.
The fact that it will suffice to duplicate the way children are taught to
write the characters in school is irrelevant.
>
> In practice, the shape formed by the #box1 is probably common enough to
> be a radical by itself (box_with_two_dots); so are the two horizontal
> strokes (forming the character for "two"). These two will be abstracted
> out as radicals[*]. The layout will be something like:
>
> <obj name=#upper>
> <name=box radical=box_with_two_dots >
> <name=two radical=two loc=below(#box) >
> <stroke=down beg=mid_top(#box) end=mid_bot(#two) >
> </obj>
> <stroke=four_dots loc=below(#upper) >
>
> This would describe the layout of a generic <hei>.
>
> [*] Actually, <hei> is traditionally its own radical.
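
The separation proposed above (a character description, a default style, and per-use overrides) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Part, stack_vertically, layout_hei and DEFAULT_STYLE are hypothetical, the description is reduced to two vertically stacked parts, and nothing here addresses whether such a scheme can reproduce real font designs.

```python
from dataclasses import dataclass


@dataclass
class Part:
    name: str
    rel_height: float  # height relative to the first part in the stack


def stack_vertically(parts):
    """Divide a unit-height cell among parts in proportion to rel_height."""
    total = sum(p.rel_height for p in parts)
    boxes = {}
    top = 0.0
    for p in parts:
        height = p.rel_height / total
        boxes[p.name] = (top, top + height)  # (top edge, bottom edge)
        top += height
    return boxes


# Assumed default style, taken from the example proportion in the post:
# "four_dots" is half as tall as the component above it.
DEFAULT_STYLE = {"four_dots": 0.5}


def layout_hei(style=None):
    """Generic description of <hei>: an upper part with four_dots below it."""
    merged = {**DEFAULT_STYLE, **(style or {})}
    parts = [Part("upper", 1.0), Part("four_dots", merged["four_dots"])]
    return stack_vertically(parts)


default_boxes = layout_hei()
override_boxes = layout_hei({"four_dots": 0.25})  # a per-font "stylesheet"
print(default_boxes)
print(override_boxes)
```

The description in layout_hei never changes; only the style dictionary does, which is the independence the quoted proposal is relying on.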
>
> Just as HTML is not the same as free-format (e.g., it does not allow
> laying out text along an arbitrary curve), this setup does not
> support all possible shapes for each character, but will be sufficient
> for most practical usage.
On the contrary, font designers shouldn't be limited by the children's
construction-set version of a Unicode that you think is superior to the
current one. Why should they be?
> Most of the samples of <hei> in your reference
> are too stylized to be practical fonts. They are more like dingbats.
"Your examples disprove my theory, so I will self-servingly complain that
they are too stylized and therefore don't count."
> Still, most of them can be constructed by the setup I described above.
No, they can't.