April 2024 | Sheng Di, Jinyang Liu, Kai Zhao, Xin Liang, Robert Underwood, Zhaorui Zhang, Milan Shah, Yafan Huang, Jiajun Huang, Xiaodong Yu, Congrong Ren, Hanqi Guo, Grant Wilkins, Dingwen Tao, Jiannan Tian, Sian Jin, Zizhe Jian, Daoce Wang, Md Hasanur Rahman, Boyuan Zhang, Jon C. Calhoun, Guanpeng Li, Kazutomo Yoshii, Khalid Ayed Alharthi, Franck Cappello
A Survey on Error-Bounded Lossy Compression for Scientific Datasets
This paper provides a comprehensive survey of error-bounded lossy compression techniques for scientific datasets. Error-bounded lossy compression effectively reduces data storage and transfer costs while maintaining high data fidelity. The paper summarizes six classic compression models, surveys 10+ commonly used compression components, and presents 10+ state-of-the-art compressors. It also discusses 10+ modern scientific applications and use cases. The key contributions include a taxonomy of lossy compression models, a survey of compression components, a survey of compressors, and a survey of scientific applications. The paper also discusses the pros and cons of different compression models and techniques, and how they are used in different parallel and distributed use cases. The paper concludes with a discussion of future work.
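To make the term "error-bounded" concrete, the following is a minimal sketch of one common way to enforce an absolute error bound: uniform quantization with bins of width twice the bound. It is a generic illustration, not code from the surveyed paper or from any particular compressor; the function names `quantize` and `reconstruct`, the sample data, and the bound `eb` are all hypothetical.

```python
# Illustrative sketch only: uniform scalar quantization under an absolute
# error bound. Names and data are assumptions, not taken from the paper.

def quantize(values, error_bound):
    """Map each value to the integer index of a bin of width 2 * error_bound."""
    step = 2.0 * error_bound
    return [round(v / step) for v in values]


def reconstruct(codes, error_bound):
    """Recover approximate values from the bin indices (bin centers)."""
    step = 2.0 * error_bound
    return [c * step for c in codes]


data = [0.12, 0.16, 0.19, 1.03, 1.04, -0.52]
eb = 0.05  # user-chosen absolute error bound (assumed value)

codes = quantize(data, eb)          # small integers, easier to encode losslessly
restored = reconstruct(codes, eb)   # lossy reconstruction

# Each restored value lies within eb of its original, up to floating-point rounding.
max_error = max(abs(a - b) for a, b in zip(data, restored))
print(codes, max_error)
```

Because every value is rounded to the nearest bin center and each bin is `2 * eb` wide, the pointwise error cannot exceed `eb`. The resulting integer codes tend to repeat for smooth data, which is what a subsequent lossless encoding stage would exploit.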