Re: Arabic cursive in Unicode




Andreas Prilop wrote:
> On 21 Nov 2006, Danny wrote:

>> I'm trying to draw up a table of Arabic cursive characters for a text
>> editor: I want to take the raw data and translate into a sequence of
>> cursive variants.

> What do you mean by "cursive characters", "cursive variants"?

I mean that I need to take the 'actual' characters (in the 06 range)
and convert them to the presentation forms (in the F range). This is in
order to make my own Arabic rendering engine (bypassing the OS).
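
A minimal sketch of that conversion in Python, for illustration only. It assumes the compatibility decompositions in the standard `unicodedata` module (entries such as `<final> 0649`) are an acceptable source for the table. The function names `build_form_table` and `shape` are hypothetical, and the joining test (a letter connects forward if it has an initial form, backward if it has a final form) is a simplification: it ignores vowel marks, lam-alef ligatures and letters that have no presentation forms.

```python
import unicodedata

FORMS = ("isolated", "final", "initial", "medial")

def build_form_table():
    """Map each base letter (06 range) to its presentation forms (F range)."""
    table = {}
    # Arabic Presentation Forms-A and Arabic Presentation Forms-B
    for block in (range(0xFB50, 0xFE00), range(0xFE70, 0xFF00)):
        for cp in block:
            parts = unicodedata.decomposition(chr(cp)).split()
            # Keep only single-letter mappings such as "<final> 0649";
            # ligatures decompose to several code points and are skipped.
            if len(parts) != 2:
                continue
            tag, base = parts[0].strip("<>"), int(parts[1], 16)
            if tag in FORMS and 0x0600 <= base <= 0x06FF:
                table.setdefault(base, {})[tag] = cp
    return table

def shape(text, table):
    """Replace each base letter with the form its neighbours call for."""
    def joins_forward(ch):   # can connect to the following letter
        return "initial" in table.get(ord(ch), {})

    def joins_backward(ch):  # can connect to the preceding letter
        return "final" in table.get(ord(ch), {})

    out = []
    for i, ch in enumerate(text):
        forms = table.get(ord(ch))
        if not forms:
            out.append(ch)   # not a letter we have forms for: pass through
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        before = bool(prev) and joins_forward(prev) and joins_backward(ch)
        after = bool(nxt) and joins_forward(ch) and joins_backward(nxt)
        if before and after:
            form = "medial"
        elif before:
            form = "final"
        elif after:
            form = "initial"
        else:
            form = "isolated"
        out.append(chr(forms.get(form, ord(ch))))
    return "".join(out)

table = build_form_table()
shaped = shape("\u0643\u062A\u0628", table)   # kaf, teh, beh
print(" ".join(f"{ord(c):04X}" for c in shaped))
```

With this table, 0649 picks up its initial and medial forms from FBE8 and FBE9 and its isolated and final forms from FEEF and FEF0. Whether the text in question should ever use the first two is a separate matter that the table cannot decide.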


>> An example: the letter Alef Maksura (0649) exists in an isolated and
>> final form at points FEEF and FEF0, but the initial and medial forms
>> are listed at FBE8 and FBE9 (with the complicated name ARABIC LETTER
>> UIGHUR KAZAKH KIRGHIZ ALEF MAKSURA INITIAL FORM). To me, this suggests
>> that standard Arabic includes only the first two forms, and the other
>> two only appear in a variant.

> Never rely on the Unicode *names* for characters! They are never
> changed and may be misleading. The most prominent example is the
> "byte order mark" U+FEFF, which will be known forever as "zero-width
> no-break space". Therefore, do not infer anything from the *name*
> "alef maksura".

Sure - however, this is also defined as the initial form of character
0649, so the name matches the coding.
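
For anyone who wants to check this, a small sketch using Python's standard `unicodedata` module prints the name next to the compatibility decomposition for the four code points mentioned above. The helper name `describe` is hypothetical; the output depends on the Unicode data shipped with the Python build.

```python
import unicodedata

def describe(cp):
    """Return (decomposition, name) for a code point."""
    ch = chr(cp)
    return unicodedata.decomposition(ch), unicodedata.name(ch)

# The four presentation forms discussed for base letter 0649
for cp in (0xFEEF, 0xFEF0, 0xFBE8, 0xFBE9):
    decomposition, name = describe(cp)
    print(f"U+{cp:04X}  {decomposition:<16} {name}")

# The byte order mark keeps its original formal name
print(unicodedata.name("\uFEFF"))
```

The decomposition field, not the name, is what ties FBE8 and FBE9 to 0649.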


> Second, do not rely on "compatibility characters" such as "Arabic
> presentation forms". They exist mainly for compatibility with
> older character sets. Never use them.

Huh? Now I'm confused. These presentation forms are the actual
characters displayed on screen when viewing Arabic, aren't they? If
not, where in the font are these presentation forms stored?

Anyway: fortunately in my case I'm including the font outlines in my
application, so I can work with the glyphs as they are stored in the
font; I don't need to worry about future-proofing.

Danny
