Purely algebraic abstractions

Any abstraction expresses something all its instances share. Usually this is semantic: all instances have some common meaning. They represent the same values, or perform the same computation, or support the same operations. When you know a construct is an instance of a certain abstraction, you know something about what it means.

Some abstractions are different. Their instances have no meaning in common, only algebraic properties. These are purely algebraic abstractions. Such an abstraction tells what transformations are valid on its instances' expressions, but says nothing about what they mean.

The classic algebraic abstractions are (of course) those of abstract algebra: groups and rings and fields and such. They abstract the properties necessary for algebraic transformations, and nothing else. If you know that your objects and their operators form a ring, you can manipulate formulae and even prove theorems about them, without knowing anything about what they mean.

In contrast, most abstractions in computing focus on meaning, and express algebraic properties only incidentally. If you have a java.util.Collection or a java.util.Map, you know you can add and remove items, test whether they're there, and iterate over them — but do you know any algebraic properties? Even the most basic properties are broken by unusual collections like caches or Bloom filters. They're semantically legitimate collections, and their algebraic properties are unreliable because they're irrelevant.

(Algebraic abstractions are not entirely reliable either, because most of their computational incarnations don't quite satisfy their axioms. Reflection and other debugging features can often detect differences between supposedly equivalent objects. Limitations of memory and time create edge cases where expressions equivalent in denotation are different in operation. Floating-point arithmetic makes a sport of breaking nearly every algebraic property that could be expected of it. But similar problems afflict nearly all attempts to reason about abstractions; they're not specific to algebraic abstractions, and they don't make them useless. The equivalences generally hold for the properties we care to preserve, so they're correct in practice though not in theory.)

Haskell typeclasses

Most Haskell typeclasses have semantic content: Show and Num are about operations with the same meaning for all their instances; Eq expects algebraic properties (reflexivity and transitivity) but still defines the meaning of ==. There are a small but increasing number of purely algebraic typeclasses: the Prelude has Monoid, Functor, Applicative and Monad, whose instances have nothing in common but algebraic equivalences.

This is why monads are so hard to learn. Each student of Haskell asks what monads mean, and invents a variety of wrong answers (typically semantic generalizations of IO actions), because they're sure that such an important abstraction must be meaningful, and have never heard of algebraic abstraction. Eventually they learn to use monads without asking what they mean, because monads don't mean anything.

This is a sore point among Haskellers. It will get more sore, because Haskell is gaining more algebraic abstractions. Applicative is in the Prelude now!

Haskell Prime numbers

Haskell Prime is a collection of ideas for future versions of Haskell, including a proposal to generalize the numeric typeclasses by removing their semantic content, replacing Num and most of its subclasses with purely algebraic classes:

(+) :: AbelianGroup a ⇒ a → a → a
(*) :: Ring a ⇒ a → a → a
(/) :: DivisionRing a ⇒ a → a → a
mod :: EuclideanDomain a ⇒ a → a → a

(A division ring is like a field except that multiplication is not necessarily commutative.)

This makes the numeric operations maximally general, at the cost of making them meaningless. It also gives mundane code types (and type errors) that make sense to mathematicians and no one else:

factorial :: (Ring a, Ord a) ⇒ a → a
sum :: AbelianGroup a ⇒ [a] → a

(I'm not sure why + is on AbelianGroup instead of something more general like Magma. Maybe it's to comply with users' expectation that + be associative and commutative.)

This proposal brings the straightforward clarity of Monad to arithmetic, an area where Haskell has long suffered from comprehensibility bordering on practicality.

I'm not sure algebraic abstractions are always a bad idea for programming languages, but the difficulty of Monad suggests they're hazardous.

A semantic abstraction tells you what its instances mean. An algebraic abstraction only tells you what transformations preserve that meaning. That's enough for optimization, but not for understanding.

4 comments:

  1. The correct term for communicating purely algebraic abstractions like monads to software people (rather than maths people) isn't abstraction; it's design pattern. That gets people into the right frame of mind, away from specific applications, and out into meta talk about the shape of a design. The fact that design patterns have cropped up in two different guises in different approaches to computation suggests that there's something there, and that it's probably worth formalizing.

    ReplyDelete
  2. Are you sticking with such low level concepts as division rings to pander to the hacking instincts of programmers? Obviously a properly grounded mathematical framing in terms of homology theory, topos, and Grothendeik primes would be far more rigorous. If we treat e.g. Haskell maps as frictionless points in co-functor generalizations of Jordon-Holder, we could put the whole enterprise of programming on a solid footing.

    ReplyDelete
  3. Well they need to make three versions of Haskell, one to pander to mathmeticians & category theorists, one for non-programmers, and one for perl hackers.

    ReplyDelete
  4. Algebra is more than dumb symbol pushing, and algebraic abstractions have (often geometric!) meaning. For example, the meaning of a group is its various group actions on spaces, the meaning of a commutative ring is its prime spectrum, and the meaning of an R-module is the étale space of the corresponding quasicoherent Spec(R)-module.

    As for “why not Magma?”, well, generality for the sake of generality is never the goal. Magma is general enough that any type with any binary operation is a valid instance of it, but the counterpart is that knowing that a binary operation is the binary operation of a magma tell you nothing! However, knowing that a binary operation is the binary operation of an Abelian group gives you, in exchange for a modest amount of proving effort, a wealth of legitimate meaning-preserving program transformations.

    ReplyDelete

It's OK to comment on old posts.