Re: Best encoding for a Japanese web site to deliver?

From: Maciej Katafiasz (mnews2_at_wp.pl)
Date: 02/04/05


Date: Fri, 04 Feb 2005 02:52:24 +0100

On day Thu, 03 Feb 2005 20:26:39 -0500 looked the Lord upon his prophet,
and the prophet was numbered among those called Timmy Douglas. And thus
spake the prophet:

> Maciej Katafiasz <mnews2@wp.pl> writes:
>
>> I don't know about common Japanese practice and the rationale for it,
>> I admit, but I'd say just go with Unicode and be done with it
>> forever. It's really hard to find a browser that can't grok UTF-8
>> nowadays, and Unicode is the future, unlike all the broken,
>> complicated (ISO-2022) standards of the past.
>
> The problem with using a Unicode-based encoding is that the Unicode
> character set unifies Chinese characters (without specifying the
> language) and people whose Unicode font is a non-Japanese one will
> likely get an ugly display (perhaps mixed fonts) of characters.

Yes, that is true. It's partly unsolvable within Unicode, and partly the
fault of the Japanese themselves for not giving a jack when the CJK
Unicode mapping was being defined (we've just had quite a lengthy
discussion on this very topic on IRC, but I'll spare you the backlog). It
is somewhat analogous to some European arguing that Caroline minuscule is
not supported in Unicode -- it is and isn't at the same time, as Unicode
specifically excludes any appearance considerations. In fact, it's already
been hacked around a bit: there are some special markers intended to hint
whether the text being processed is Japanese or Chinese.
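
To make that concrete, here is a rough Python sketch. It assumes the
"special markers" are the Plane 14 language tag characters (U+E0001
followed by tag letters), which is my reading and not something the
standard forces on you; U+76F4 is just a commonly cited example of a
unified ideograph whose preferred glyph differs between Japanese and
Chinese fonts, and language_tag is a helper made up for illustration.

```python
import unicodedata

# U+76F4: one code point, drawn differently in Japanese and Chinese fonts
# (chosen here purely as an illustration).
ch = "\u76f4"

# The code point has a single name and a single UTF-8 byte sequence;
# nothing in it says which language the text is in.
name = unicodedata.name(ch)
utf8 = ch.encode("utf-8")


def language_tag(lang: str) -> str:
    """Hypothetical helper: U+E0001 LANGUAGE TAG plus Plane 14 tag letters."""
    return "\U000E0001" + "".join(chr(0xE0000 + ord(c)) for c in lang)


# The same ideograph, hinted as Japanese and as Chinese.
tagged_ja = language_tag("ja") + ch
tagged_zh = language_tag("zh") + ch

print(name)
print(utf8)
print([f"U+{ord(c):04X}" for c in tagged_ja])
```

The tagged strings differ only in the invisible prefix; the ideograph
itself is byte-for-byte identical. Unicode has long discouraged these tags
in favour of higher-level markup, so on a web page the lang attribute in
HTML is the more usual way to give the renderer the same hint.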

-- 
"Tautologizm to cos tautologicznego"
     Mathrick <mnews2@wp.pl>
      http://mathrick.org

