[slides] Parallel database systems: the future of high performance database systems

The article discusses the evolution and success of parallel database systems, which have become more than research curiosities. Initially, specialized hardware such as CCD memories and optical disks was explored, but it failed to meet expectations. However, the widespread adoption of the relational data model in the late 1980s, combined with the suitability of relational queries for parallel execution, led to the development of highly parallel machines by companies such as Teradata and Tandem and by a number of startups. These systems are built from conventional processors, memories, and disks, and have emerged as major consumers of highly parallel architectures.

The key to their success is the shared-nothing architecture, in which processors communicate through message-based client-server operating systems and high-speed networks. This design minimizes interference between processors and allows incremental growth, so it can scale to hundreds or even thousands of processors. The article also discusses the challenges of achieving linear speedup and scaleup, and the techniques used to address them: data partitioning, pipelined and partitioned parallelism, and specialized parallel relational operators.

Several systems are highlighted, including Teradata, Tandem NonStop SQL, Gamma, and Bubba, each demonstrating near-linear speedup and scaleup on relational queries and transaction processing workloads. The Super Database Computer (SDC) project at the University of Tokyo is noted for its unique hardware and software approach, and other prototypes such as XPRS, Volcano, and Arbre are also mentioned. The article concludes by comparing the performance and price of these parallel database systems with those of traditional mainframes, challenging Grosch's law, which holds that larger machines are more cost-effective.
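The data partitioning and speedup/scaleup ideas mentioned in the summary can be illustrated with a small sketch. The Python below is not taken from the article: the function names are hypothetical, the three partitioning strategies (round-robin, hash, and range) are the ones commonly described for shared-nothing systems, and the two metrics are assumed to be the usual ratios of elapsed times on a small system versus a larger one.

```python
import bisect
from typing import Any, Callable, Sequence


def round_robin_partition(rows: Sequence[Any], n: int) -> list[list[Any]]:
    """Send the i-th row to partition i mod n (spreads load evenly)."""
    parts: list[list[Any]] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows: Sequence[Any], n: int,
                   key: Callable[[Any], Any]) -> list[list[Any]]:
    """Send each row to the partition chosen by hashing its key."""
    parts: list[list[Any]] = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)
    return parts


def range_partition(rows: Sequence[Any], boundaries: Sequence[Any],
                    key: Callable[[Any], Any]) -> list[list[Any]]:
    """Partition j holds rows with boundaries[j-1] <= key < boundaries[j].

    `boundaries` must be sorted; it yields len(boundaries) + 1 partitions.
    """
    parts: list[list[Any]] = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect.bisect_right(boundaries, key(row))].append(row)
    return parts


def speedup(small_system_time: float, big_system_time: float) -> float:
    """Same problem on both systems; linear speedup on an N-times
    larger system would give a value of N."""
    return small_system_time / big_system_time


def scaleup(small_time_small_problem: float,
            big_time_big_problem: float) -> float:
    """N-times larger problem on an N-times larger system; linear
    scaleup would give a value of 1.0."""
    return small_time_small_problem / big_time_big_problem
```

In this sketch each partition stands in for the disk attached to one processor. Round-robin suits full sequential scans, hash partitioning helps lookups and joins on the partitioning key, and range partitioning keeps rows with nearby key values together.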

Parallel Database Systems: The Future of High Performance Database Systems

June 1992 / Vol. 35, No. 6 | David DeWitt and Jim Gray