The Input/Output Complexity of Sorting and Related Problems

The Input/Output Complexity of Sorting and Related Problems

September 1988 | ALOK AGGARWAL and JEFFREY SCOTT VITTER
This paper presents tight upper and lower bounds for the number of I/Os required for five sorting-related problems: sorting, FFT, permutation networks, permuting, and matrix transposition. The bounds are valid in both worst-case and average-case scenarios, and in several cases, the constant factors match. The secondary storage is modeled as a magnetic disk capable of transferring P blocks of B records per time unit. The paper introduces two optimal algorithms for these problems, which are variants of merge sorting and distribution sorting. For P = 1, the standard merge sorting algorithm is shown to be optimal for external sorting. The sorting algorithms use the same number of I/Os as the permutation phase of key sorting, except when the internal memory is extremely small. The paper also provides a simpler derivation of Hong and Kung's lower bound for the FFT when B = P = O(1). The paper defines the five problems and presents three main theorems that characterize the I/O complexity for these problems. The first theorem provides bounds for sorting and FFT, the second for permuting, and the third for matrix transposition. The theorems show that the I/O complexity depends on the parameters N (number of records), M (internal memory size), B (block size), and P (number of concurrent block transfers). The bounds are derived using a combination of lower bound proofs and upper bound algorithms. The paper also discusses the implications of these results for the performance of sorting algorithms and the relationship between the five problems. The results show that the permutation phase of key sorting typically requires as many I/Os as general sorting. The paper also provides a detailed proof of the lower bounds for the FFT and permutation networks, and discusses the implications of these results for the design of efficient sorting algorithms.This paper presents tight upper and lower bounds for the number of I/Os required for five sorting-related problems: sorting, FFT, permutation networks, permuting, and matrix transposition. The bounds are valid in both worst-case and average-case scenarios, and in several cases, the constant factors match. The secondary storage is modeled as a magnetic disk capable of transferring P blocks of B records per time unit. The paper introduces two optimal algorithms for these problems, which are variants of merge sorting and distribution sorting. For P = 1, the standard merge sorting algorithm is shown to be optimal for external sorting. The sorting algorithms use the same number of I/Os as the permutation phase of key sorting, except when the internal memory is extremely small. The paper also provides a simpler derivation of Hong and Kung's lower bound for the FFT when B = P = O(1). The paper defines the five problems and presents three main theorems that characterize the I/O complexity for these problems. The first theorem provides bounds for sorting and FFT, the second for permuting, and the third for matrix transposition. The theorems show that the I/O complexity depends on the parameters N (number of records), M (internal memory size), B (block size), and P (number of concurrent block transfers). The bounds are derived using a combination of lower bound proofs and upper bound algorithms. The paper also discusses the implications of these results for the performance of sorting algorithms and the relationship between the five problems. The results show that the permutation phase of key sorting typically requires as many I/Os as general sorting. The paper also provides a detailed proof of the lower bounds for the FFT and permutation networks, and discusses the implications of these results for the design of efficient sorting algorithms.
Reach us at info@study.space