A Survey on Error-Bounded Lossy Compression for Scientific Datasets

3 Apr 2024 | SHENG DI, JINYANG LIU, KAI ZHAO, XIN LIANG, ROBERT UNDERWOOD, ZHAORUI ZHANG, MILAN SHAH, YAFAN HUANG, JIAJUN HUANG, XIAODONG YU, CONGRONG REN, HANQI GUO, GRANT WILKINS, DINGWEN TAO, JIANNAN TIAN, SIAN JIN, ZIZHE JIAN, DAOCE WANG, MD HASANUR RAHMAN, BOYUAN ZHANG, JON C. CALHOUN, GUANPENG LI, KAZUTOMO YOSHII, KHALID AYED ALHARTH, FRANCK CAPPELLO
This paper provides a comprehensive survey of error-bounded lossy compression techniques for scientific datasets, addressing the significant challenges of data storage and transfer in scientific simulations and advanced instruments. The authors, from various institutions across the United States and China, outline the key contributions of their work:

1. **Taxonomy of Lossy Compression Models**: The paper categorizes lossy compression into six classic models, each with distinct advantages and disadvantages in terms of time complexity and reconstruction quality.
2. **Survey of Compression Components**: It details over 10 commonly used compression components and modules, such as predictors, bit truncation, quantization, wavelet transform, Tucker decomposition, and autoencoders.
3. **State-of-the-Art Compressors**: The paper reviews over 30 state-of-the-art error-controlled lossy compressors, analyzing how they combine various compression modules in their designs.
4. **Application of Lossy Compression**: It discusses the application of lossy compression in over 10 modern scientific applications and distributed use cases.

The authors emphasize the importance of error-bounded lossy compression in reducing storage and memory footprints, accelerating simulations, and improving I/O performance. They also highlight the potential of deep learning techniques in achieving very high compression ratios while maintaining acceptable reconstruction quality. The paper is structured into several sections, covering compression model taxonomies, modular lossy compression techniques, general-purpose compressors, and customized compressors for specific applications.
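To make the term "error-bounded" concrete, the following is a minimal sketch of how two of the components named in the summary, a predictor and quantization, can be combined so that every reconstructed value stays within a user-chosen absolute error bound. It is an illustration only and is not taken from the paper or from any specific compressor: the previous-value predictor, the function names `compress` and `decompress`, and the sample data are all assumptions made for this example.

```python
import math

def compress(values, error_bound):
    """Quantize prediction residuals into integer codes (illustrative sketch)."""
    step = 2.0 * error_bound  # bin width: rounding to a bin center errs by at most step / 2
    codes = []
    prev = 0.0  # predictor state: the last *reconstructed* value, not the original
    for x in values:
        code = round((x - prev) / step)  # quantized residual against the prediction
        codes.append(code)
        prev = prev + code * step  # track exactly what the decompressor will rebuild
    return codes

def decompress(codes, error_bound):
    """Rebuild values by replaying the same predictor on the integer codes."""
    step = 2.0 * error_bound
    out = []
    prev = 0.0
    for code in codes:
        prev = prev + code * step
        out.append(prev)
    return out

# Hypothetical smooth 1D signal standing in for a scientific field.
data = [math.sin(0.01 * i) for i in range(1000)]
eb = 1e-3  # assumed absolute error bound

codes = compress(data, eb)
recon = decompress(codes, eb)
max_err = max(abs(a - b) for a, b in zip(data, recon))
print("max abs error:", max_err, "bound:", eb)
```

Because the predictor runs on reconstructed values rather than original ones, the quantization error does not accumulate from point to point. For smooth data the integer codes cluster near zero, which is what a later lossless encoding stage would exploit; that stage is omitted here. A production compressor would also guard against floating-point rounding pushing a value marginally past the bound.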