Re: How to change OE so the coding is right automatically



Paul Blay wrote:

Ah, found them.
news:pKqdnVBMivd2aSrZnZ2dnUVZ_rOdnZ2d@xxxxxxxxxxxx
That is in Unicode(UTF-7). As Jim would probably say, don't use
UTF-7 - it sucks.

I agree. That's why I didn't even think of using it until someone told
me he saw the Japanese text using UTF-7.

"I guess they use UTF-7 because it supports the older browsers."
"They" only use UTF-7 when you select UTF-7 (or sometimes when
you reply to a post in UTF-7). Your posts in misc.test were in
"iso-2022-jp" / Japanese(JIS) which is pretty much the default.
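
For anyone who wants to see the difference between these encodings on the wire, here is a small Python sketch. It is only an illustration using Python's own codec names ("utf-7", "iso2022_jp", "utf-8"), which I'm assuming correspond to OE's Unicode(UTF-7), Japanese(JIS) and Unicode(UTF-8) menu entries; it says nothing about how OE picks an encoding internally.

```python
text = "森の王"

# Python codec names, assumed here to match OE's menu labels.
for codec in ("utf-7", "iso2022_jp", "utf-8"):
    encoded = text.encode(codec)
    # UTF-7 and ISO-2022-JP stay within 7-bit bytes; UTF-8 does not.
    seven_bit = all(b < 0x80 for b in encoded)
    print(f"{codec:10} {len(encoded):2} bytes  7-bit only: {seven_bit}  {encoded!r}")
    # Each encoding round-trips the text when decoded with the same codec.
    assert encoded.decode(codec) == text
```

All three can carry the same Japanese text; trouble only starts when the reader decodes with a different charset than the one the poster encoded with.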

I'd kind of expected the technology would have advanced to the point that
everything would be handled automatically.

Newsreader software is in rather a dead-end of software development.
OE itself has 'advanced' from dangerously insecure and ludicrously buggy
to not overly secure and somewhat buggy but they have made no real
effort to improve its capabilities.

Yesterday, I downloaded something called 40tude Dialog at
http://www.40tude.com/dialog/. It boasts sophisticated Unicode
features. I've played around with it a little. It does what it says
it will do. However, because it is so sophisticated, I thought I'd
give it a thorough work-out at misc.test before I unleash potential
unforeseen phenomena on real newsgroups.

The problem seems to be at its worst when I create the Japanese text
directly in OE. If I use a word processor and cut and paste into OE
the problem is less severe.

That cut and paste should be different from direct input is more than
slightly odd. Given the way OE works the content should be byte-by-byte
identical whichever way you get the text in there.

That's what I would have thought.

I had no problems reading a response I'd created that way to somebody else's post.
However, he was not able to automatically decode the text. He finally did so using
ISO-2022-JP. However, I had created it in UTF-8 and when I switched to
ISO-2022-JP all of the Japanese and some of the English became garbled.
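
A rough way to reproduce this kind of garbling outside any newsreader is to encode with one charset and decode with the other. The Python sketch below is only that: the sample string is made up, and it does not claim to show what OE or the other poster's reader actually did (it also does not account for the English being affected).

```python
text = "Hello, 森の王"

utf8_bytes = text.encode("utf-8")
jis_bytes = text.encode("iso2022_jp")

# UTF-8 bytes read as ISO-2022-JP: bytes above 0x7F are not valid there,
# so a strict decode fails and a lenient one substitutes U+FFFD.
try:
    utf8_bytes.decode("iso2022_jp")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True
utf8_as_jis = utf8_bytes.decode("iso2022_jp", errors="replace")

# ISO-2022-JP bytes read as UTF-8: every byte is 7-bit, so the decode
# succeeds, but the Japanese shows up as escape sequences and ASCII noise.
jis_as_utf8 = jis_bytes.decode("utf-8")

print("strict decode failed:", strict_decode_failed)
print("UTF-8 read as JIS:   ", repr(utf8_as_jis))
print("JIS read as UTF-8:   ", repr(jis_as_utf8))
```

Either mismatch leaves the ASCII part of this sample readable and wrecks the Japanese, which is why the charset named in the post's headers has to match the bytes in the body.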

By "created in UTF-8" do you mean "Selected Unicode(UTF-8) as the format before
posting" ?

I'd really meant to use the word processor's feature to save a document
as plain text with UTF-8 encoding.

I have a related question that perhaps you can answer. You saw my post
in response to Ueshiba-san. It was the first time I had posted
complete Japanese paragraphs to sci.lang.jap. Because there are no
spaces between Japanese characters, neither Google Groups nor 40tude
Dialog broke up the lines. I thought maybe it would occur during
posting so I did a couple of test runs on misc.test. Each paragraph
still remained as a single long horizontal line. I finally forced in
manual line breaks before I sent it.
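
Forcing the breaks by hand can also be scripted. What follows is a minimal Python sketch, not anything Google Groups or 40tude Dialog does: the function names and the 72-column limit are my own assumptions, and it ignores the Japanese rules about which characters may not start or end a line.

```python
import unicodedata


def display_width(ch):
    # Wide (W) and fullwidth (F) characters take two columns in a
    # fixed-width display; everything else is counted as one.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def wrap_cjk(text, max_columns=72):
    """Split one paragraph into lines no wider than max_columns."""
    lines, current, used = [], [], 0
    for ch in text:
        width = display_width(ch)
        # Start a new line when the next character would not fit.
        if current and used + width > max_columns:
            lines.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += width
    if current:
        lines.append("".join(current))
    return lines


paragraph = "あ" * 50
for line in wrap_cjk(paragraph):
    print(line)
```

Because the break can fall between any two characters, this works on text without spaces, where a wrapper that only breaks at whitespace leaves the paragraph as one long line.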

BTW, when I did my tests, 「森の王」 came out perfectly in each of
them. I have no idea how it got garbled when I posted to s.l.j. It's
a hard enough struggle for me to compose the text to begin with, I
can't afford to be distracted by software glitches.

Phil Yff
