Re: Why (or not) use single letter variable names?

From: Jallan (jallan_at_smrtytrek.com)
Date: 12/23/04


Date: 23 Dec 2004 10:28:14 -0800

Herman Rubin wrote:

> If you can recognize the letters, you could read
> mathematics written with them as symbols, even if you
> did not know any of the names. I could do this with
> Russian letters, and I do not know their names.

and

> As I said, I do not need to name them. I would not
> have any problem with using Russian letters for
> symbols, but I would have a much greater problem
> if the TeX representations for them were used.

The question of which characters ought to be used for creating variable
identifiers has been a Unicode issue for some time, and a Unicode
technical document treating this issue has been updated several times.

Oddities appear, such as characters which do not correspond one-to-one
between uppercase and lowercase. This means that in a computer language
which is case-insensitive, _ß_ in lowercase should match _SS_ in
uppercase, which in turn means that _ß_ in lowercase must also match
_ss_ in lowercase: variables fooß, FOOSS, and fooss (and FoOsS and so
forth) should be treated as identical.
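
To make the matching rule concrete, here is a minimal sketch in Python. It assumes that full Unicode case folding is the comparison a case-insensitive language would use; the helper name same_identifier is hypothetical, and str.casefold() is Python's built-in case folding, not something any particular language standard prescribes.

```python
def same_identifier(a: str, b: str) -> bool:
    """Compare two names case-insensitively using full Unicode case folding.

    Hypothetical helper for illustration only. casefold() maps the
    lowercase sharp s to "ss", so all the spellings below collapse
    to the same folded string.
    """
    return a.casefold() == b.casefold()


names = ["fooß", "FOOSS", "fooss", "FoOsS"]
folded = {name.casefold() for name in names}

# Every spelling folds to one key, so a case-insensitive symbol table
# keyed on the folded form would treat them as a single variable.
print(folded)
```

Simple lowercasing would not be enough here, because lower() leaves ß alone and would keep fooß distinct from fooss.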

The current technical annex covering this appears at
http://www.unicode.org/reports/tr31/tr31-3.html . It reads in part:

<< The formal syntax provided here is intended to capture the general
intent that an identifier consists of a string of characters that
begins with a letter or an ideograph, and then includes any number of
letters, ideographs, digits, or underscores. Each programming language
standard has its own identifier syntax; different programming languages
have different conventions for the use of certain characters from the
ASCII range ($, @, #, _) in identifiers. To extend such a syntax to
cover the full behavior of a Unicode implementation, implementers need
only combine these specific rules with the syntax provided here. >>
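
The general intent quoted above can be sketched in a few lines of Python. This is an approximation under stated assumptions, not the annex's formal syntax: it treats Unicode general categories L* and Nl as "letter or ideograph" and Nd as "digit", it ignores combining marks and the language-specific ASCII characters ($, @, #), and the function names are hypothetical.

```python
import unicodedata


def is_start(ch: str) -> bool:
    # Assumption: letters and ideographs are general categories L* and Nl.
    cat = unicodedata.category(ch)
    return cat.startswith("L") or cat == "Nl"


def is_continue(ch: str) -> bool:
    # Letters, ideographs, decimal digits (Nd), or the underscore.
    return is_start(ch) or unicodedata.category(ch) == "Nd" or ch == "_"


def is_identifier(s: str) -> bool:
    # Begins with a letter or ideograph, then any number of
    # letters, ideographs, digits, or underscores.
    return bool(s) and is_start(s[0]) and all(is_continue(c) for c in s[1:])


for candidate in ["fooß", "x1_y", "变量", "1x", "a$b"]:
    print(candidate, is_identifier(candidate))
```

A language that allows a leading underscore or a $ in names would add those characters to is_start or is_continue, which is the "combine these specific rules" step the annex describes.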

Full Unicode letters and ideographs in C and similar languages
could be used to make such languages even more difficult to read, even
for programmers who recognize non-Latin letters and ideographs. The
letter U+01C0 LATIN LETTER DENTAL CLICK is identical in
appearance to the ASCII | symbol, and U+01C3 LATIN LETTER RETROFLEX
CLICK is identical in appearance to the ASCII ! symbol. Also, many
uppercase Greek and Cyrillic characters are identical in appearance to
particular Latin characters and to each other. The occasional confusion
now between 1 and l (one and "el") in reading code would be greatly
extended.
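
One way to see the problem is to print code points and names, since the glyphs themselves give nothing away. The sketch below uses Python's unicodedata module; the helper name describe is hypothetical, and Python's own identifier check (str.isidentifier) is used only as a stand-in for a language that accepts Unicode letters in names.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for each character in text.

    Hypothetical helper for illustration only.
    """
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# ASCII punctuation next to the click letters that resemble it.
for pair in ["|\u01c0", "!\u01c3"]:
    print(describe(pair))

# Latin, Greek, and Cyrillic capitals that look alike.
print(describe("A\u0391\u0410"))

# The click letter is a letter, so it is accepted inside a name,
# while the ASCII exclamation mark is not.
print("x\u01c3".isidentifier(), "x!".isidentifier())
```

In such a language, a name ending in U+01C3 would read on screen like an expression ending in the ! operator.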

Jallan


