June 15, 2008
Threads are Evil

The Magician of the Ivory Tower brought his latest invention for the master programmer to examine.

I followed a link labeled "Threads are evil" to this paper (pdf, few hundred k) by the Edward Lee, who is no less than the chair of the EECS department at the University of California, Berkeley.

The thrust of the paper (which is more than two years old) is that threads are a bad model for parallel execution because they make it hard to prove things and reason about programs. I'm not trying to make him sound unreasonable, but there it is. Actual excerpt:

That is, given a pair of multithreaded programs (p1, p2) and another pair (p1', p2'), when are these two pairs equivalent? A reasonable extension of the basic theory defines them to be equivalent if all interleavings halt for the same initial state and yield the same final state. The enormous number of possible interleavings makes it extremely difficult to reason about such equivalence except in trivial cases (where, for example, the state B** is partitioned so that the two programs are unaffected by each others’ partition)

Gotta love the theo-comp notation -- B** having been previously defined, you understand, as the union of B* and Bω. Anyway, as the paper unfolds, it turns out that the good doctor Lee believes that the cure is moving towards a different set of abstractions for concurrency -- expressing the meeting points between concurrent computations using "coordination languages", orthogonal to programming languages. A few examples (and a notation) are presented in the text of the paper, where we learn that Dr. Lee's group has worked for at least a decade on a system that models concurrency and on a graphical notation for expressing coordination.

It would be easier to accept what the ivory-tower crowd want to say if there didn't seem to be a profound conflict of interest at work. I'm not saying that a (self-described) concurrency expert shouldn't make pronouncements about concurrency. But he shouldn't be surprised when ordinary grunt programmers ignore his advice. And lamenting the penetration of academic languages like E and Erlang is, well, just silly.

The middle of the paper is spent giving an example of the Observer pattern in Java, and showing some of the problems that can occur when moving to a multithreaded system. But the example is not convincing -- or at least, I don't see why I would necessarily design a system where the listener routines perform complex computations and acquire locks in the trigger's thread, which is where the concern about deadlocks arises from. If deadlocks were a problem, I'd switch to a message-based architecture where the only thing done inside mutexes is list manipulation.

What's worse, his proposed solution is not convincing either -- he draws a pretty colored diagram and explains that in the coordination language of Ptolemy II, the Observer pattern is handled by a Rendezvous directory. Well, OK, but since he has defined the two value producers, value consumer, and observer as four separate otherwise noninteracting threads, I don't see where the deadlock risk arises from.

It's unfortunate because the message he is trying to deliver -- threads are hard -- is a good one, and one that needs to be heard and handled before the concurrency problems he describes become more commonplace. (I estimate, when quad-processor chips are standard on desktops, which will be around 2015 by my guess.) But expecting a significant fraction of the world's programmers to transform their mental models of programming and computation to accept a new symbolism is ... unlikely to be the solution.

Posted by Sam at 07:59 PM