MS-GF+ makes progress towards a universal database search tool for proteomics

31 Oct 2014 | Sangtae Kim & Pavel A. Pevzner
MS-GF+ is a universal database search tool for proteomics that outperforms existing tools in sensitivity and versatility. It is designed to handle diverse types of tandem mass spectra from a range of MS instrument configurations and experimental protocols. The tool uses a probabilistic model to derive its scoring parameters automatically from thousands of peptide-spectrum matches (PSMs), so it adapts to different spectral data sets without prior knowledge of the type of spectra. MS-GF+ significantly increases the number of identified peptides compared with commonly used methods. Although it is not designed for any particular experimental setup, it improves on the performance of tools built specifically for those applications. It uses a simple dot-product scoring model, in contrast to the many database search and rescoring tools that rely on sophisticated scoring functions. It computes rigorous E-values using the generating function approach, which allows for more accurate identification of peptides. MS-GF+ can also take advantage of high-precision product ion peaks, which is crucial for accurate peptide identification.
The tool was benchmarked against popular tools such as Mascot + Percolator, SEQUEST, and OMSSA using diverse spectral data sets, including those from human, yeast, mouse, and Schizosaccharomyces pombe. MS-GF+ identified significantly more PSMs than these tools, especially for data sets with unusual fragmentation propensities. It also performed well for phosphopeptides and ubiquitinated peptides, demonstrating its ability to handle different types of post-translational modifications. MS-GF+ is also effective for identifying peptides produced by a new protease, α-LP, which has cleavage specificities somewhat orthogonal to trypsin.
The tool was able to identify a large number of PSMs from these data sets, even when the search space was large and no enzyme was specified. The performance of MS-GF+ was further improved when specific scoring parameters for α-LP were used. The running time of MS-GF+ is similar to that of Mascot + Percolator for LL, HL, and HH spectra. The tool is freely available and can be integrated with various other proteomics analysis tools. Overall, MS-GF+ represents a significant advancement in the field of proteomics, offering a universal database search tool that is sensitive, versatile, and effective for a wide range of spectral data sets.
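The dot-product scoring and generating-function ideas mentioned above can be illustrated with a toy sketch. This is not the MS-GF+ implementation. It assumes integer residue masses, a uniform amino-acid probability, and a spectrum that has already been converted to one score per integer mass. All names here (`TOY_MASSES`, `dot_product_score`, `score_distribution`, `spectral_probability`) are hypothetical. The dynamic program computes the exact score distribution over every peptide with the same parent mass, and the tail of that distribution is the kind of probability from which an E-value could be derived (that conversion is not shown).

```python
from collections import defaultdict

# Toy integer residue masses (an assumption; real residue masses are non-integer).
TOY_MASSES = {"G": 2, "A": 3, "S": 4}


def prefix_masses(peptide, masses=TOY_MASSES):
    """Masses of all proper prefixes of the peptide (a toy fragment ladder)."""
    total, out = 0, []
    for residue in peptide[:-1]:
        total += masses[residue]
        out.append(total)
    return out


def dot_product_score(peptide, scored_spectrum, masses=TOY_MASSES):
    """Dot product of the peptide's 0/1 prefix-mass vector with the spectrum vector."""
    return sum(scored_spectrum[m] for m in prefix_masses(peptide, masses))


def score_distribution(scored_spectrum, parent_mass, masses=TOY_MASSES):
    """Map each score to the total probability of peptides of parent_mass with that score."""
    p = 1.0 / len(masses)  # uniform residue probability (an assumption)
    dist = [defaultdict(float) for _ in range(parent_mass + 1)]
    dist[0][0] = 1.0  # the empty prefix has score 0
    for m in range(1, parent_mass + 1):
        # The parent mass itself is not a fragment, so it contributes no score.
        node = scored_spectrum[m] if m < parent_mass else 0
        for residue_mass in masses.values():
            if residue_mass <= m:
                for score, prob in dist[m - residue_mass].items():
                    dist[m][score + node] += prob * p
    return dict(dist[parent_mass])


def spectral_probability(peptide, scored_spectrum, masses=TOY_MASSES):
    """Probability mass of all same-mass peptides scoring at least as well as peptide."""
    parent = sum(masses[r] for r in peptide)
    score = dot_product_score(peptide, scored_spectrum, masses)
    dist = score_distribution(scored_spectrum, parent, masses)
    return sum(prob for s, prob in dist.items() if s >= score)


# One score per integer mass 0..9; the values are made up for illustration.
spectrum = [0, 0, 3, 0, 1, 2, 0, 0, 0, 0]
print(dot_product_score("GAS", spectrum))  # 5
print(spectral_probability("GAS", spectrum))
```

In this toy setting, 11 peptides have parent mass 9, and only two of them ("GAS" and "GAGG") reach a score of 5, so the tail probability is 1/27 + 1/81 = 4/81. The table is filled once per spectrum and never enumerates peptides explicitly, so the cost grows with the parent mass and the score range, not with the number of candidate peptides.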