LUMPY: A probabilistic framework for structural variant discovery

Ryan M Layer, Aaron R Quinlan*, Ira M Hall*
LUMPY is a probabilistic framework for structural variant (SV) discovery that integrates multiple read alignment signals, including read-pair, split-read, and read-depth, across multiple samples. It maps each SV signal to a common abstract representation, a breakpoint probability distribution, so that the signals can be integrated jointly. This approach makes signal integration simple and natural, yields a probabilistic measure of breakpoint position, and can be extended to new signals as sequencing technologies evolve. Integrating read-pair and split-read signals improves sensitivity over existing methods, especially in low-coverage or heterogeneous tumor samples, and LUMPY performs well on both simulated and real data. Evaluated against GASVPro, DELLY, and PINDEL, it shows superior sensitivity and a lower false discovery rate (FDR) in most cases. Incorporating prior knowledge, such as known variants and parental genomes, further improves detection of low-frequency variants in heterogeneous tumors, which demonstrates LUMPY's utility in cancer genomics and other applications that require detecting such variants. LUMPY also reports probabilistic breakpoint intervals, which allows comparison across studies. The framework is flexible enough to accommodate new evidence types and is implemented as an open-source C++ package.
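
To make the idea of a shared breakpoint probability distribution concrete, here is a minimal Python sketch of how two pieces of evidence for the same breakpoint might be merged. It is an illustration only, not LUMPY's C++ implementation: the summary does not state the combination rule, so the sketch assumes a simple sum followed by renormalization. The names `BreakpointInterval`, `combine`, and `most_likely_position` are hypothetical, and the positions and weights are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BreakpointInterval:
    """Hypothetical container: a genomic interval plus per-position probabilities."""
    start: int          # first genomic position covered (inclusive)
    probs: List[float]  # probs[i] is the probability of a breakpoint at start + i

    @property
    def end(self) -> int:
        # Last genomic position covered (inclusive).
        return self.start + len(self.probs) - 1


def normalize(weights: List[float]) -> List[float]:
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def combine(a: BreakpointInterval, b: BreakpointInterval) -> BreakpointInterval:
    """Merge two overlapping intervals.

    Assumption (not taken from the source): probabilities are summed
    position by position over the union of the intervals, then renormalized.
    """
    if a.end < b.start or b.end < a.start:
        raise ValueError("intervals do not overlap")
    start = min(a.start, b.start)
    end = max(a.end, b.end)
    summed = [0.0] * (end - start + 1)
    for interval in (a, b):
        for i, p in enumerate(interval.probs):
            summed[interval.start - start + i] += p
    return BreakpointInterval(start, normalize(summed))


def most_likely_position(interval: BreakpointInterval) -> int:
    """Return the genomic position with the highest probability."""
    best = max(range(len(interval.probs)), key=interval.probs.__getitem__)
    return interval.start + best


# Made-up evidence: a read-pair signal is broad and imprecise,
# while a split-read signal is narrow and sharply peaked.
read_pair = BreakpointInterval(100, normalize([1.0] * 10))
split_read = BreakpointInterval(104, normalize([1.0, 3.0, 1.0]))

combined = combine(read_pair, split_read)
print(combined.start, combined.end, most_likely_position(combined))
```

In this toy case the merged distribution spans the wider read-pair interval but peaks where the split-read evidence is concentrated, which is the intuition behind representing every signal in the same probabilistic form.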