15 November 2016 | Dari Kimanius, Björn O Forsberg, Sjors HW Scheres, Erik Lindahl
This paper presents GPU-accelerated cryo-EM structure determination in RELION-2. The implementation speeds up the key steps of the cryo-EM workflow: image classification, high-resolution refinement, and template-based particle selection. It is written in CUDA and relies on libraries such as cuFFT and CUB/thrust for efficient computation. The regularized likelihood optimization algorithm, which is computationally intensive, is accelerated by treating the multiple reference maps, translations, and orientations as parallel tasks, a formulation that maps well onto GPUs. The semi-automated, template-based particle picking algorithm is GPU-accelerated as well. Single-precision arithmetic reduces memory requirements and improves GPU performance without compromising the resolution of the final structures. The paper also discusses low-pass filtering of micrographs, which shrinks the FFT grids and all subsequent computations and gives significant acceleration without loss of quality. Compared with the CPU version, the GPU-accelerated version processes large datasets much faster, with performance gains of up to two orders of magnitude. As a result, high-resolution structures can be determined in a matter of days on a single workstation, which makes the method accessible to researchers without access to large computing clusters.
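To illustrate what treating reference maps, translations, and orientations as parallel tasks can look like, here is a minimal Python sketch. It is not RELION-2's CUDA code: the scoring function (a squared difference under a Gaussian noise model with a made-up `sigma2`), the omission of priors and CTF, and the names `unflatten` and `posterior_weights` are all assumptions for illustration. Each flat task index plays the role a GPU thread or block index would, and since no task depends on another, all of them could run concurrently.

```python
import math

def unflatten(task, n_orient, n_trans):
    """Map a flat task index to (reference, orientation, translation) indices."""
    ref, rest = divmod(task, n_orient * n_trans)
    orient, trans = divmod(rest, n_trans)
    return ref, orient, trans

def posterior_weights(image, projections, sigma2=1.0):
    """Score every (reference, orientation, translation) combination independently.

    projections[ref][orient][trans] is a list of pixel values with the same
    length as image (a stand-in for a rotated, shifted reference projection).
    """
    n_ref = len(projections)
    n_orient = len(projections[0])
    n_trans = len(projections[0][0])
    n_tasks = n_ref * n_orient * n_trans

    # Each iteration is independent of all others; on a GPU these would be
    # spread over threads instead of executed in a serial loop.
    log_w = []
    for task in range(n_tasks):
        r, o, t = unflatten(task, n_orient, n_trans)
        proj = projections[r][o][t]
        sq_diff = sum((x - p) ** 2 for x, p in zip(image, proj))
        log_w.append(-sq_diff / (2.0 * sigma2))

    # Normalise, subtracting the maximum for numerical stability.
    # This reduction is the only step that needs every task's result.
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]
```

In this toy form the per-task work is a single sum over pixels, so the three nested search dimensions collapse into one flat index range, which is the shape of problem that GPUs handle well.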
The results demonstrate that the GPU-accelerated version of RELION-2 can achieve high-resolution structures with comparable accuracy to the CPU version, while significantly reducing computational time and resource requirements. The implementation also includes a complete workflow for the β-galactosidase dataset, showing the effectiveness of the GPU-accelerated approach in real-world applications. The paper concludes that the GPU implementation of RELION-2 represents a significant advancement in cryo-EM structure determination, enabling faster and more efficient processing of large datasets.
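The low-pass filtering point can be checked with simple arithmetic. Once a micrograph is low-pass filtered to a resolution d, Fourier components beyond 1/d carry no signal, so the image fits on a grid whose Nyquist limit (twice the pixel size) equals d. The Python sketch below is illustrative only: the function names, the rounding up to an even size, the n² log n cost model, and the example numbers are assumptions, not RELION-2's exact rules.

```python
import math

def downscaled_size(n_pixels, pixel_size, lowpass):
    """Smallest even grid size that still holds data low-passed to `lowpass` (in Å)."""
    # The original grid resolves down to 2 * pixel_size; never upscale.
    scale = min(1.0, 2.0 * pixel_size / lowpass)
    size = math.ceil(n_pixels * scale)
    return size + (size % 2)  # round up to an even size

def fft_cost(n):
    """Rough operation count of a 2D FFT on an n x n grid: n^2 * log2(n^2)."""
    return n * n * math.log2(n * n)

def estimated_speedup(n_pixels, pixel_size, lowpass):
    """Ratio of FFT cost on the full grid to FFT cost on the downscaled grid."""
    small = downscaled_size(n_pixels, pixel_size, lowpass)
    return fft_cost(n_pixels) / fft_cost(small)
```

For a hypothetical 4096 × 4096 micrograph at 1.0 Å per pixel, low-pass filtered to 20 Å, `downscaled_size(4096, 1.0, 20.0)` gives a 410-pixel grid, and the estimated FFT cost drops by more than a factor of 100. Real gains depend on the rest of the pipeline, not only on the FFT.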