Versatile genome assembly evaluation with QUAST-LG

2018 | Alla Mikheenko, Andrey Prjibelski, Vladislav Saveliev, Dmitry Antipov and Alexey Gurevich
QUAST-LG is a tool for evaluating genome assemblies, particularly those of large eukaryotic genomes. It extends the original QUAST to handle large genomic sequences and introduces new quality metrics. The tool compares assemblies against a reference genome and computes metrics covering completeness, correctness, and contiguity. It also introduces the concept of an upper bound assembly, which estimates the theoretical limits of assembly correctness and completeness given the genome's complexity and the read coverage. QUAST-LG can evaluate mammalian-sized assemblies in a few hours on a regular server. Beyond the speed improvements, its new metrics account for eukaryotic genome features such as transposon abundance: it includes a method to detect discrepancies caused by transposable elements (TEs) and uses k-mer-based statistics to assess assembly quality. QUAST-LG is primarily a reference-based tool; it includes reference-free analysis as part of its pipeline, but that is not its main purpose.
Compared with conventional QUAST, QUAST-LG shows improved performance, especially on large genomes. The authors also use it to evaluate various genome assembly tools on six eukaryotic datasets, including yeast, worm, fruit fly, and human genomes. The results show that no single assembler performs best on all metrics, and the best assembly varies with the dataset. The reported evaluation includes metrics such as N50, NGA50, and BUSCO completeness.
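The contiguity metrics named above can be illustrated with a small sketch. This is not QUAST-LG's code: the function name `nx50` and the toy lengths are hypothetical, and the sketch assumes the usual definitions, where N50 is measured against the assembly length, NG50 against the reference length, and NGA50 is NG50 computed on aligned blocks (contigs split at misassembly breakpoints) rather than on whole contigs.

```python
from typing import Iterable, Optional


def nx50(lengths: Iterable[int], total: Optional[int] = None) -> Optional[int]:
    """Return the length L such that pieces of length >= L cover at least half of `total`.

    With total=None the sum of `lengths` is used (N50).
    Passing the reference genome length gives NG50, and passing
    aligned-block lengths together with the reference length gives NGA50.
    """
    ordered = sorted(lengths, reverse=True)
    target = (sum(ordered) if total is None else total) / 2
    covered = 0
    for length in ordered:
        covered += length
        if covered >= target:
            return length
    # The pieces cover less than half of `total`, so the metric is undefined.
    return None


# Toy data (made up for illustration, not taken from the paper).
contigs = [80, 70, 50, 40, 30, 20]
reference_length = 400
# Assumption: the 80-long contig has one misassembly and is split into 45 + 35.
aligned_blocks = [45, 35, 70, 50, 40, 30, 20]

print("N50  :", nx50(contigs))                           # 70
print("NG50 :", nx50(contigs, reference_length))         # 50
print("NGA50:", nx50(aligned_blocks, reference_length))  # 40
```

In this toy case the misassembled contig inflates N50 and NG50, while NGA50 drops once that contig is broken at the breakpoint, which is why alignment-aware metrics are useful when comparing assemblers.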