Roary: rapid large-scale prokaryote pan genome analysis

2015 | Andrew J. Page, Carla A. Cummins, Martin Hunt, Vanessa K. Wong, Sandra Reuter, Matthew T.G. Holden, Maria Fookes, Daniel Falush, Jacqueline A. Keane and Julian Parkhill
Roary is a tool for rapid, large-scale pan genome analysis of prokaryotes. It builds pan genomes for thousands of prokaryotic samples on a standard desktop computer without compromising accuracy. Roary combines clustering with BLAST to identify core and accessory genes. It is implemented in Perl and is freely available under the open source GPLv3 license. On a single CPU, Roary can process 1000 isolates in 4.5 hours using 13 GB of RAM. Compared with other pan genome tools such as PanOCT, LS-BSR, and PGAP, it is more efficient in both running time and memory usage. It accurately identifies all clusters on simulated data and performs well on real data, including Salmonella enterica serovar Typhi. Roary also scales in multi-processor environments, achieving a 3.7X speedup with 8 CPUs and GNU Parallel. On large real datasets, it identifies a large number of core genes even when the pan genome is open and highly varied.
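To make the core/accessory distinction concrete, here is a minimal Python sketch. It is not Roary's implementation (Roary is written in Perl and works from clustered protein sequences). The function name `classify_genes`, the toy isolate and gene names, and the 0.99 core threshold are all assumptions chosen for illustration. The sketch assumes the genes have already been grouped into clusters, so only the presence or absence of each cluster per isolate remains to be counted.

```python
from collections import defaultdict


def classify_genes(presence, core_threshold=0.99):
    """Split gene clusters into core and accessory sets.

    presence: dict mapping isolate name -> set of gene cluster names.
    core_threshold: fraction of isolates that must carry a cluster for it
        to count as core (an illustrative assumption, not a fixed rule).
    """
    n_isolates = len(presence)
    counts = defaultdict(int)

    # Count how many isolates carry each gene cluster.
    for genes in presence.values():
        for gene in genes:
            counts[gene] += 1

    # Core clusters are present in (nearly) all isolates; the rest are accessory.
    core = {g for g, c in counts.items() if c / n_isolates >= core_threshold}
    accessory = set(counts) - core
    return core, accessory


# Hypothetical toy data: three isolates and four gene clusters.
isolates = {
    "isolate_A": {"gyrA", "recA", "blaTEM"},
    "isolate_B": {"gyrA", "recA"},
    "isolate_C": {"gyrA", "recA", "tetA"},
}

core, accessory = classify_genes(isolates)
print(sorted(core))       # ['gyrA', 'recA']
print(sorted(accessory))  # ['blaTEM', 'tetA']
```

In this toy example the pan genome is the union of both sets (four clusters), of which two are shared by every isolate. Lowering `core_threshold` would tolerate clusters missing from a few isolates, which can matter when assemblies are incomplete.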