June 30, 1999 | Patrick R. Amestoy, Iain S. Duff, Jean-Yves L'Excellent and Jacko Koster
This paper presents a fully asynchronous multifrontal solver for large sparse linear systems on distributed memory computers. The solver, called MUMPS (MUltifrontal Massively Parallel Solver), handles a wide range of problems, including symmetric positive definite, general symmetric, and unsymmetric matrices, which may be rank deficient. It combines a multifrontal approach with dynamic distributed task scheduling, which accommodates numerical pivoting and allows computational tasks to migrate to lightly loaded processors. Large computational tasks are divided into subtasks to enhance parallelism, and asynchronous communication is used throughout the solution process to efficiently overlap communication with computation.
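The load-balancing idea behind migrating tasks to lightly loaded processors can be illustrated with a toy greedy mapping. This is a sketch only, not MUMPS's actual scheduler (which makes its decisions dynamically at run time based on load information exchanged between processors); the function name and the static cost model are invented for the example:

```python
# Illustrative only: a greedy least-loaded assignment sketch, not
# MUMPS's actual dynamic scheduler. Each task is mapped to the
# processor with the smallest accumulated work estimate.

def assign_tasks(task_costs, num_procs):
    """Map each task name to a processor index, greedily balancing load."""
    loads = [0.0] * num_procs          # estimated work per processor
    mapping = {}
    # Assign heavier tasks first for a tighter balance.
    for task in sorted(task_costs, key=task_costs.get, reverse=True):
        proc = min(range(num_procs), key=loads.__getitem__)
        mapping[task] = proc
        loads[proc] += task_costs[task]
    return mapping, loads

# Four tasks with estimated costs, mapped onto two processors.
costs = {"t1": 8.0, "t2": 5.0, "t3": 4.0, "t4": 3.0}
mapping, loads = assign_tasks(costs, 2)
```

A static mapping like this is what dynamic scheduling improves on: when actual costs deviate from estimates (for example because numerical pivoting changes the work at a node), a dynamic scheduler can reassign upcoming tasks instead of committing to the initial mapping.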
The solver is tested on matrices provided by industrial partners in the PARASOL project, with results reported on an SGI Origin 2000 and an IBM SP2. It is implemented in Fortran 90 and uses MPI for message passing, along with BLAS, LAPACK, BLACS, and ScaLAPACK subroutines. The solver accepts both assembled and elemental input matrix formats, can determine the rank and a null-space basis for rank-deficient matrices, and can return a Schur complement matrix.
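The two input formats can be sketched as follows. The helper below is hypothetical, not part of the MUMPS interface: it expands an elemental input, where each element is given as a list of global variable indices plus a small dense matrix, into the assembled coordinate (i, j, value) form, summing entries where elements overlap:

```python
import numpy as np

# Hypothetical sketch (not the MUMPS API): expand an elemental input
# into assembled coordinate form, summing overlapping entries.

def elemental_to_assembled(elements):
    """elements: list of (global_indices, dense_element_matrix) pairs."""
    entries = {}
    for idx, mat in elements:
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                entries[(i, j)] = entries.get((i, j), 0.0) + mat[a, b]
    rows, cols, vals = zip(*((i, j, v) for (i, j), v in sorted(entries.items())))
    return list(rows), list(cols), list(vals)

# Two overlapping 2x2 elements on variables (0, 1) and (1, 2);
# variable 1 is shared, so its diagonal entry is summed.
e1 = ([0, 1], np.array([[4.0, 1.0], [1.0, 4.0]]))
e2 = ([1, 2], np.array([[4.0, 1.0], [1.0, 4.0]]))
rows, cols, vals = elemental_to_assembled([e1, e2])
```

Accepting elemental input directly, as MUMPS does, avoids this explicit expansion: element matrices can be assembled straight into the frontal matrices during factorization.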
The solver uses a multifrontal approach, in which all elimination operations take place within dense submatrices called frontal matrices. The factorization of the sparse matrix is described by an assembly tree, with each node corresponding to the computation of a Schur complement. Dynamic scheduling strategies distribute tasks among the processors and handle numerical pivoting. Three sources of parallelism are exploited: type 1 parallelism comes from independent subtrees of the assembly tree; type 2 parallelism comes from a block partitioning of the rows of large frontal matrices across several processors; and type 3 parallelism comes from a 2D block cyclic distribution of the root node.
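The per-node operation can be sketched in dense linear algebra: eliminating the fully summed variables of a frontal matrix yields the Schur complement (contribution block) that is passed to the parent node. The minimal sketch below, with invented names, omits numerical pivoting and the assembly of child contributions for brevity:

```python
import numpy as np

# Minimal dense sketch of the per-node multifrontal operation:
# eliminate the k fully summed variables of a frontal matrix F and
# return the Schur complement passed to the parent node.
# (Pivoting and assembly of child contribution blocks are omitted.)

def eliminate_node(F, k):
    """Return the Schur complement after eliminating the leading k variables."""
    F11 = F[:k, :k]   # fully summed block (factorized at this node)
    F12 = F[:k, k:]
    F21 = F[k:, :k]
    F22 = F[k:, k:]
    # Schur complement F22 - F21 * F11^{-1} * F12, via a direct solve.
    return F22 - F21 @ np.linalg.solve(F11, F12)

# 3x3 frontal matrix with 2 fully summed variables; the remaining
# 1x1 contribution block is sent to the parent node.
F = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
S = eliminate_node(F, 2)   # 1x1 Schur complement
```

In type 2 parallelism the rows of F below the fully summed block are partitioned across several processors, each of which computes its share of the update to the Schur complement; at the root node (type 3 parallelism), the whole frontal matrix is instead factorized with a 2D block cyclic distribution.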
The solver is tested on a variety of problems from computational fluid dynamics, structural mechanics, and other application areas. The results show that it achieves good performance on both assembled and elemental input formats, and that nested dissection based orderings outperform minimum degree based orderings. Memory usage scales well with the number of processors, and memory requirements are further reduced by distributing the input matrix across the processors. Overall, the results demonstrate that the solver can efficiently solve large sparse linear systems on distributed memory computers.