TY - JOUR
T1 - Predicting the number of sulfur atoms in peptides and small proteins based on the observed aggregated isotope distribution
AU - Claesen, Jürgen
AU - Valkenborg, Dirk
AU - Burzykowski, Tomasz
N1 - Publisher Copyright: © 2021 The Authors. Rapid Communications in Mass Spectrometry published by John Wiley & Sons Ltd. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.
PY - 2021/10/15
Y1 - 2021/10/15
N2 - Rationale: Identification of peptides and proteins is a challenging task in mass spectrometry–based proteomics. Knowledge of the number of sulfur atoms can improve the identification of peptides and proteins. Methods: In this article, we propose a method for predicting the number of S-atoms based on the aggregated isotope distribution. The Mahalanobis distance is used as a dissimilarity measure to compare mass- and intensity-based features from the observed and theoretical isotope distributions. Results: The relative abundance of the second and the third aggregated isotopic variants (as compared to the monoisotopic one) and the mass difference between the second and third aggregated isotopic variants are the most important features for predicting the number of S-atoms. Conclusions: The mass and intensity accuracies of the observed aggregated isotopic variants are insufficient to accurately predict the number of S-atoms. However, predicting a limited set of candidate numbers of S-atoms for a peptide, rather than a single number, yields a reasonably high prediction accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85114424673&partnerID=8YFLogxK
U2 - https://doi.org/10.1002/rcm.9162
DO - https://doi.org/10.1002/rcm.9162
M3 - Article
C2 - 34240492
SN - 0951-4198
VL - 35
JO - RCM. Rapid Communications in Mass Spectrometry
JF - RCM. Rapid Communications in Mass Spectrometry
IS - 19
M1 - e9162
ER -