February, 1991 | JOHN M. MELLOR-CRUMMEY and MICHAEL L. SCOTT
This paper addresses the issue of synchronization contention in shared-memory multiprocessors, particularly focusing on spin locks and barriers. The authors argue that traditional busy-wait synchronization techniques, which often lead to significant memory and interconnect contention, can be improved to avoid these issues. They propose a new scalable algorithm for spin locks that generates constant remote references per lock acquisition, independent of the number of processors. This algorithm requires only a constant amount of space per lock and does not need special hardware support beyond basic atomic operations. Additionally, they present a scalable barrier algorithm that also generates constant remote references per processor reaching the barrier. The paper compares the performance of these scalable algorithms with other software approaches on both a Sequent Symmetry and a BBN Butterfly, concluding that synchronization contention need not be a significant performance bottleneck in large-scale shared-memory multiprocessors. The authors also discuss the implications of their findings for the design of special-purpose synchronization hardware and "dance hall" architectures.This paper addresses the issue of synchronization contention in shared-memory multiprocessors, particularly focusing on spin locks and barriers. The authors argue that traditional busy-wait synchronization techniques, which often lead to significant memory and interconnect contention, can be improved to avoid these issues. They propose a new scalable algorithm for spin locks that generates constant remote references per lock acquisition, independent of the number of processors. This algorithm requires only a constant amount of space per lock and does not need special hardware support beyond basic atomic operations. Additionally, they present a scalable barrier algorithm that also generates constant remote references per processor reaching the barrier. The paper compares the performance of these scalable algorithms with other software approaches on both a Sequent Symmetry and a BBN Butterfly, concluding that synchronization contention need not be a significant performance bottleneck in large-scale shared-memory multiprocessors. The authors also discuss the implications of their findings for the design of special-purpose synchronization hardware and "dance hall" architectures.