Quote should not terminate tokens

Haskell allows the single-quote character in identifiers, so you can use variable names with primes, like x'. This is so convenient (especially for naming “revised” versions of variables, which seems to happen a lot when assignment isn't available) that I've been missing it in Clojure recently.

There's no reason lisps can't have this. Common Lisp supports nonterminating macro characters — characters that have special meaning when they appear on their own, but not when they appear in the middle of a token. Like many of CL's features (generic functions, anyone?) this isn't used much in the standard library; by default there's only one nonterminating macro character, #, and that's only for historical reasons (compatibility with Maclisp, I think). But it's easy to make new ones, which solves the x' problem in one line:

CL-USER> (set-macro-character #\' (get-macro-character #\') t)
T
CL-USER> '(x x' y y')
(X |X'| Y |Y'|)

(T as the third argument of set-macro-character means “nonterminating”.) Note that quote still works. Symbols with primes print funny, because the printer doesn't realize nonterminating macro characters don't have to be escaped, but they work fine; you can name variables with primes to your heart's content.
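
The REPL line above changes the global readtable for the whole image. If you'd rather opt in per call, a more contained variant (only a sketch) is to copy the standard readtable and bind `*readtable*` around the read. The names `*prime-readtable*` and `read-with-primes` are made up for this illustration; everything else is standard CL.

```lisp
;; Hypothetical name: a private readtable in which ' is non-terminating.
(defparameter *prime-readtable*
  (let ((rt (copy-readtable nil)))   ; NIL means "copy the standard readtable"
    ;; Reuse the standard quote reader function, but pass T as the
    ;; third argument so ' no longer ends a token it appears inside.
    (set-macro-character #\' (get-macro-character #\' rt) t rt)
    rt))

;; Hypothetical helper: read one form from STRING using the private table.
;; The global readtable is left untouched.
(defun read-with-primes (string)
  (let ((*readtable* *prime-readtable*))
    (read-from-string string)))

;; (read-with-primes "(x x' y)") reads a list whose second element is
;; the symbol named "X'"; (read-with-primes "'s") still reads as (QUOTE S).
```

The same binding of `*readtable*` works around `load` or `compile-file`, so a single file can use primes without affecting code read elsewhere.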

This should be the default in any new lisp: ' should not terminate tokens. Neither should any macro character except ), ], } and ;, really — termination only matters when you leave out spaces between tokens, and who does that?
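
To see what "terminating" actually changes, here is a small sketch that reads the same text twice: once with a fresh copy of the standard readtable, and once with a copy in which ' is non-terminating. `read-both-ways` is a hypothetical helper written for this comparison, not part of any library.

```lisp
;; Hypothetical helper: return a two-element list holding STRING as read
;; by the standard readtable and by one where ' is non-terminating.
(defun read-both-ways (string)
  (let ((standard (copy-readtable nil))
        (primed   (copy-readtable nil)))
    (set-macro-character #\' (get-macro-character #\' primed) t primed)
    (list (let ((*readtable* standard)) (read-from-string string))
          (let ((*readtable* primed))   (read-from-string string)))))

;; With the space left out, the two readers disagree:
;;   (read-both-ways "(x'y)")
;;   standard: a list of X and (QUOTE Y)
;;   primed:   a list of the single symbol named "X'Y"
;;
;; With the space present, they agree:
;;   (read-both-ways "(x 'y)")
;;   both: a list of X and (QUOTE Y)
```

So the only inputs that read differently are those with no whitespace before the quote, which is the case the paragraph above argues nobody writes.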

6 comments:

  1. I respectfully disagree. I think it's cleaner when

    'S

    is syntactically equivalent to

    (quote S)

    as it is in Scheme (at least in R3RS, anyway). ' should therefore be as much a token terminator as (.

    The usual Lisp convention of using numbers seems just as good, if not better because of the conciseness.

    x'''' =?= x4

  2. Clojure 1.3 lets you use single-quote characters in names.

  3. Lisps usually use *, not ', which works just as well but plays nicer with ' as quote operator.

    Further, 1.3 adds some features in this direction. Certainly, at least, +' becomes legal.

  4. Cool, Clojure 1.3 added this nine months ago! (Although the motivation seems to be providing names for alternative arithmetic operators because the normal ones no longer have implicit bignum overflow, which is alarming.) Maybe I should use the bleeding-edge version instead of the release.

    I don't really like * as a substitute for prime, because its other meanings distract me — especially the "sequential variant" meaning it has acquired from let* and do*.

    Losing universal equivalence between 'S and (quote S) isn't much of a problem, because it matters only in a context (missing space after a token) where S wouldn't necessarily read correctly anyway: (xS) doesn't read as (x S), so it's OK if (x'S) doesn't read correctly either. They remain equivalent in all sexprs produced by the printer, and all reasonable ones produced by humans. On the other hand, the same argument also implies left-paren could safely be nonterminating, but that seems scary to me (and I forgot to include left-parens in the list above).

  5. Also the Clojure compiler reads Unicode, so you could hook up (for example) M-' to the actual Unicode "prime" character - I think it is a valid symbol character.

  6. Besides #\# there is also #\. which is even stranger. It can be anywhere in a token *except* that a token containing only dots is illegal when not making up a proper dotted list.

