Re: Arabic cursive in Unicode




Danny wrote:
Andreas Prilop wrote:
On 21 Nov 2006, Danny wrote:

I'm trying to draw up a table of Arabic cursive characters for a text
editor: I want to take the raw data and translate into a sequence of
cursive variants.

What do you mean by "cursive characters", "cursive variants"?

I mean that I need to take the 'actual' characters (in the U+06xx
range) and convert them to the presentation forms (in the U+FBxx to
U+FExx range). This is in order to write my own Arabic rendering
engine (bypassing the OS).

What do you mean by "presentation forms"? All the "isolated" shapes of
letters can occur within or at the end of words -- the "four" shapes of
Arabic letters exist only because some letters can connect with letters
on both sides, and some can connect only with letters before them (to
their right).
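To make the "connects on both sides" versus "connects only to the letter before it" distinction concrete, here is a minimal Python sketch of the kind of table-driven conversion Danny describes. It is only an illustration under assumptions: the names build_form_table and shape are made up for this post, the table is derived from the compatibility decompositions in Python's unicodedata module, and the sketch ignores vowel marks, ligatures such as lam-alef, and any character that has no presentation form.

```python
import unicodedata

# Code-point ranges of the Arabic Presentation Forms-A and -B blocks.
PRESENTATION_RANGES = [(0xFB50, 0xFDFF), (0xFE70, 0xFEFF)]
FORM_TAGS = {"<isolated>", "<initial>", "<medial>", "<final>"}


def build_form_table():
    """Map each base letter to {tag: presentation-form character}."""
    table = {}
    for start, end in PRESENTATION_RANGES:
        for cp in range(start, end + 1):
            parts = unicodedata.decomposition(chr(cp)).split()
            # Keep single-letter forms only; ligatures decompose to 2+ letters.
            if len(parts) == 2 and parts[0] in FORM_TAGS:
                base = chr(int(parts[1], 16))
                table.setdefault(base, {})[parts[0]] = chr(cp)
    return table


FORMS = build_form_table()


def joins_forward(ch):
    """True if the letter can connect to the letter that follows it."""
    return "<initial>" in FORMS.get(ch, {})


def joins_backward(ch):
    """True if the letter can connect to the letter that precedes it."""
    return "<final>" in FORMS.get(ch, {})


def shape(text):
    """Replace each letter with its contextual presentation form (simplified)."""
    out = []
    for i, ch in enumerate(text):
        forms = FORMS.get(ch)
        if forms is None:
            out.append(ch)  # not a letter we have forms for; pass through
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        from_prev = joins_forward(prev_ch) and joins_backward(ch)
        to_next = joins_forward(ch) and joins_backward(next_ch)
        if from_prev and to_next:
            tag = "<medial>"
        elif from_prev:
            tag = "<final>"
        elif to_next:
            tag = "<initial>"
        else:
            tag = "<isolated>"
        out.append(forms.get(tag, ch))
    return "".join(out)


# beh + alef + beh: alef cannot connect forward, so the last beh is isolated.
print([f"U+{ord(c):04X}" for c in shape("\u0628\u0627\u0628")])
```

In the beh-alef-beh example the alef takes its final form and the second beh falls back to the isolated form, which is exactly the "isolated shape at the end of a word" case described above.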

Note that, back in Apple's WorldScript I, which was how you typed
Hebrew and Arabic back in System 7, the standard Arabic font (which
fitted into 256 characters minus control characters) accommodated
Arabic, Persian, and Urdu (certainly the three most used Arabic-written
languages) -- but a few of the vowel points were omitted, such as the
dagger alif and the wasla. It did the contextual forms automatically
and even handled the most common ligatures.

Then someone came up with the predecessor of OpenType, which was
called GX, which could do a very good job of imitating handwritten
text (so many ligatures were included), but the Arabic fonts that come
with Windows these days have regressed.

An example: the letter Alef Maksura (0649) exists in isolated and
final forms at code points FEEF and FEF0, but the initial and medial forms
are listed at FBE8 and FBE9 (with the complicated name ARABIC LETTER
UIGHUR KAZAKH KIRGHIZ ALEF MAKSURA INITIAL FORM). To me, this suggests
that standard Arabic includes only the first two forms, and the other
two only appear in a variant.

Never rely on the Unicode *names* for characters! They are never
changed and may be misleading. The most prominent example is the
"byte order mark" U+FEFF, which will be known forever as "zero-width
no-break space". Therefore, do not infer anything from the *name*
"alef maksura".
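The U+FEFF example is easy to check for yourself. A small Python sketch, assuming only the standard unicodedata module, prints the formal name that the character database still carries for the byte order mark:

```python
import unicodedata

bom = "\ufeff"  # the code point commonly used as the byte order mark

# The formal character name is frozen, whatever the character is used for now.
print(unicodedata.name(bom))
# The general category is a property, not part of the name.
print(unicodedata.category(bom))
```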

Sure - however, this is also defined as the initial form of character
0649, so the name matches the coding.
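The "defined as the initial form of 0649" part refers to the compatibility decomposition, which is data rather than a name. A short Python sketch (the helper describe is just an illustrative name) reads that mapping for the four code points mentioned above:

```python
import unicodedata

# Isolated, final, initial and medial forms cited in this thread.
ALEF_MAKSURA_FORMS = ["\ufeef", "\ufef0", "\ufbe8", "\ufbe9"]


def describe(ch):
    """Return (form tag, base code point) from the compatibility decomposition."""
    tag, base = unicodedata.decomposition(ch).split()
    return tag, "U+" + base


for ch in ALEF_MAKSURA_FORMS:
    print(f"U+{ord(ch):04X}", *describe(ch))
```

All four entries point back to the same base character, regardless of which block they sit in or how their names read.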


Second, do not rely on "compatibility characters" such as "Arabic
presentation forms". They exist mainly for compatibility with
older character sets. Never use them.

Huh? Now I'm confused. These presentation forms are the actual
characters displayed on screen when viewing Arabic, aren't they? If
not, where in the font are these presentation forms stored?

Anyway: fortunately in my case I'm including the font outlines in my
application, so I can work with the glyphs as they are stored in the
font; I don't need to worry about future-proofing.
