The three ingredients to making performance engineering easier and more fun
Given two "alternatives" X and Y, your next question makes all the difference.
Mike Spear (Lehigh) gave a great talk at June’s Fastcode Seminar about a renewed focus on the structure of concurrent data structures. He began by acknowledging the overlap between his talk and the previous seminar by Guy Blelloch, which was also about concurrent data structures.
Separating mechanism from policy
In an earlier post, I promoted Mike’s seminar as a talk about “separating mechanism from policy.” So I was surprised when Mike talked about software transactional memory (STM) without mentioning “separating” or “policy”—two key words in the title of the paper behind his talk!
I shared my surprise with Mike, and he explained:
Regarding that paper, “separating policy from mechanism” is an idea that goes back to the 70s, where the “policies” usually have to do with resource allocation. The main idea is that the low-level techniques for determining who has authorization to use a hardware resource can be decoupled from the high-level decisions about how they use that resource. As an example, you could imagine that there is a mechanism that gives me the CPU from time to time, that takes the CPU from me, and that lets me yield the CPU voluntarily. The high level policy for deciding when to give me the CPU and when to take it should be decoupled from the low-level mechanism. This would let the OS give one program frequent, small slices so it can keep an audio stream running, while giving another program infrequent but big slices so it can do a batch computation that would benefit from cache locality.
It is the case that our formulation of exoTM allows for STM, STMCAS, and locking algorithms to operate simultaneously, all using the low-level synchronization afforded by exoTM. So it felt to us like a neat analogy to “separation of policy and mechanism.” Sadly, across many conversations, I've found that nobody really likes this explanation. So in the end, I think the best way to think about it is that we split TM into two parts: a low-level synchronization component, and a high-level API. Then we created other APIs that are compatible with the low-level synchronization component.
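To make the "two parts" concrete, here is a minimal sketch in C++ of one low-level synchronization mechanism serving two different high-level policies. This is not exoTM's actual API (see the paper for that), and every name here is hypothetical. The mechanism is a versioned lock word; one API uses it pessimistically (lock-style updates), the other optimistically (transaction-style reads), and because both go through the same mechanism they can coexist on the same data.

```cpp
// Sketch: one low-level mechanism, two high-level policies.
// Hypothetical names throughout; not exoTM's API.
#include <atomic>
#include <cstdint>
#include <cstdio>

// --- Mechanism: a versioned lock word (sequence-lock style; odd = locked). ---
struct VersionedLock {
    std::atomic<uint64_t> word{0};

    uint64_t begin_read() const {       // snapshot the current version
        return word.load(std::memory_order_acquire);
    }
    bool validate(uint64_t v) const {   // still unlocked and unchanged?
        return v % 2 == 0 && word.load(std::memory_order_acquire) == v;
    }
    bool try_lock() {                   // even -> odd via CAS claims the lock
        uint64_t v = word.load(std::memory_order_relaxed);
        return v % 2 == 0 &&
               word.compare_exchange_strong(v, v + 1, std::memory_order_acquire);
    }
    void unlock() {                     // odd -> next even version
        word.fetch_add(1, std::memory_order_release);
    }
};

// --- Policy 1: a pessimistic, lock-style update over the mechanism. ---
void locked_increment(VersionedLock& lk, std::atomic<int>& value) {
    while (!lk.try_lock()) { /* spin */ }
    value.store(value.load(std::memory_order_relaxed) + 1,
                std::memory_order_relaxed);
    lk.unlock();
}

// --- Policy 2: an optimistic, transaction-style read over the SAME mechanism. ---
int optimistic_read(const VersionedLock& lk, const std::atomic<int>& value) {
    for (;;) {
        uint64_t v = lk.begin_read();
        int snapshot = value.load(std::memory_order_relaxed);  // speculative read
        if (lk.validate(v)) return snapshot;  // "commit" if nothing changed
    }                                         // otherwise retry
}

int main() {
    // Single-threaded demo; in a real program the two policies would be
    // exercised concurrently from different threads.
    VersionedLock lk;
    std::atomic<int> counter{0};
    locked_increment(lk, counter);
    std::printf("%d\n", optimistic_read(lk, counter));  // prints 1
}
```

The point of the sketch is the shape, not the details: once the version-word protocol is fixed, you can layer lock-based, STM-style, or STMCAS-style APIs on top of it, and they interoperate because they all agree on the mechanism.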
Mike’s experience reminds me of theorist Robin Cockett, who wrote:
Returning to Mike’s actual talk, the key point is that recent innovations in synchronization enable programmers to stop worrying about the limitations of their synchronization mechanism and return their attention to where it belongs: the structure of concurrent data structures. You can listen to Mike’s recorded presentation here.
Task-parallel technology for software performance engineering
IEEE-HPEC is a virtual conference taking place September 15-19, and I am organizing a special session, Fastcode@HPEC, about task-parallel technology for software performance engineering (SPE). The session features talks by key contributors to several leading platforms for SPE (for a taste of what task-parallel code looks like, see the sketch after the speaker list):
Tsung-Wei Huang (U. Wisconsin): Taskflow
Hartmut Kaiser (LSU): HPX
I-Ting Angelina Lee (WUSTL): OpenCilk
Tim Mattson: OpenMP
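If "task-parallel" is new to you, here is the flavor of programming these platforms support, sketched with OpenMP's task construct (one of the technologies above). The example is mine, not from the session materials:

```cpp
// Parallel Fibonacci with OpenMP tasks -- an illustrative sketch.
// Compile with, e.g., g++ -fopenmp fib.cpp
#include <cstdio>

long fib(long n) {
    if (n < 2) return n;           // serial base case (a real program would
                                   // also use a cutoff for small n, since
                                   // tiny tasks cost more than they save)
    long x, y;
    #pragma omp task shared(x)     // child task: the runtime may run it
    x = fib(n - 1);                // on another worker thread
    #pragma omp task shared(y)
    y = fib(n - 2);
    #pragma omp taskwait           // join both children before combining
    return x + y;
}

int main() {
    long result;
    #pragma omp parallel           // create the worker team...
    #pragma omp single             // ...but let one thread seed the root task
    result = fib(20);
    std::printf("fib(20) = %ld\n", result);
}
```

Each platform expresses this spawn-and-join pattern differently (cilk_spawn and cilk_sync in OpenCilk, explicit task graphs in Taskflow, futures in HPX), which is exactly why "how does X compare to Y?" deserves a careful answer.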
We invite you to attend our virtual session. We promise to give you guidance for asking “how does X compare to Y?” and to help you think about “how do X, Y, and Z make my job easier and more fun?” Click here to learn more, and we hope to see you in September!