Open Mass Spectrometry Search Algorithm

Lewis Y. Geer*, Sanford P. Markey‡, Jeffrey A. Kowalak‡, Lukas Wagner‡, Ming Xu‡, Dawn M. Maynard‡, Xiaoyu Yang‡, Wenyao Shi‡, Stephen H. Bryant‡
The Open Mass Spectrometry Search Algorithm (OMSSA) is a probability-based algorithm for identifying peptides from MS/MS spectra in proteomics experiments. Like BLAST, it uses classical hypothesis testing based on an explicit statistical model, in this case to calculate the probability of a match between observed peptide fragments and those calculated from a sequence search library. OMSSA is designed to be faster than existing algorithms and lets users set thresholds to minimize false positives.

OMSSA processes MS/MS spectra by filtering noise, extracting m/z values, and comparing them to calculated m/z values derived from peptides produced by in silico digestion of a protein sequence library. It then statistically scores the matches. The algorithm includes steps for noise filtering, precursor charge determination, and mass ladder calculation. It uses a Poisson distribution model to calculate the significance of matches, taking into account the probability of random matches. Hits are ranked by E-value, with lower E-values indicating more significant matches. A rescoring step adjusts the noise threshold to improve sensitivity.

OMSSA was validated against Mascot, a commonly used probability-based search algorithm, using standard protein cocktails at different concentrations and two different sets of spectra. OMSSA matched more spectra from a standard protein cocktail than Mascot, showed better sensitivity and specificity, especially for the 100 fmol dataset, and was faster in searching large datasets. Receiver operating characteristic (ROC) analysis confirmed that OMSSA is efficient, sensitive, and specific for matching MS/MS peptide spectra.

The algorithm is implemented in C++ and is part of the NCBI C++ toolkit, allowing it to run on various operating systems.
OMSSA is available for public use and supports multiple file input formats.
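
To make the scoring idea in the summary concrete, the following is a minimal Python sketch of the general technique: digest a protein in silico, compute a b/y fragment mass ladder, count fragment matches against observed peaks, and turn the match count into a Poisson tail probability and an E-value. It is not OMSSA's implementation. The function names, the 0.5 m/z tolerance, the 2000 m/z range, and in particular the formula used for the Poisson mean are illustrative assumptions, and noise filtering, charge determination, and rescoring are omitted.

```python
import math

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def fragment_mz(peptide):
    """Singly charged b and y ion m/z values (the 'mass ladder')."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(peptide)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion: N-terminal prefix
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion: C-terminal suffix
    return ions


def count_matches(observed, theoretical, tol=0.5):
    """Number of theoretical ions with an observed peak within +/- tol."""
    return sum(any(abs(o - t) <= tol for o in observed) for t in theoretical)


def poisson_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu), summed directly to avoid cancellation.

    Sums 200 terms upward from k, which is adequate for modest mu.
    """
    if mu <= 0.0:
        return 1.0 if k == 0 else 0.0
    total = 0.0
    for x in range(k, k + 200):
        total += math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))
    return min(1.0, total)


def e_value(observed, peptide, n_candidates, tol=0.5, mz_range=2000.0):
    """Expected number of candidates that match at least this well by chance."""
    theoretical = fragment_mz(peptide)
    k = count_matches(observed, theoretical, tol)
    # ASSUMPTION: chance that one theoretical ion hits some observed peak is
    # the fraction of the m/z range covered by tolerance windows.
    p_random = min(1.0, 2.0 * tol * len(observed) / mz_range)
    mu = len(theoretical) * p_random
    return n_candidates * poisson_tail(k, mu)


if __name__ == "__main__":
    spectrum = fragment_mz("SAMPLER")  # idealized, noise-free spectrum
    for candidate in ("SAMPLER", "GVWHFTK"):
        print(candidate, e_value(spectrum, candidate, n_candidates=1000))
```

In this sketch a candidate that explains many peaks gets a very small E-value, while an unrelated candidate gets an E-value close to the number of candidates searched. That mirrors the ranking rule described above, where lower E-values indicate more significant matches.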