
309

Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins
Ismael Rodriguez-Palomo, Bharath Nair, Yun Chiang, Joannes Dekker, Benjamin Dartigues, Meaghan Mackie, Miranda Evans, Ruairidh Macleod, Jesper V. Olsen, Matthew J. Collins
2024
<p style="text-align: justify;">Palaeoproteomics is a rapidly evolving discipline, and practitioners are constantly developing novel strategies for the analysis and interpretation of complex, degraded protein mixtures. The community has also established standards of good practice to interrogate our data. However, a systematic exploration of how these strategies and standards affect the identification of peptides, post-translational modifications (PTMs) and proteins, as well as their significance (through the False Discovery Rate) and correctness, is lacking. We systematically investigated the performance of a wide range of sequencing tools and search engines in a controlled system: the experimental degradation of a single purified protein, bovine beta-lactoglobulin (BLG), heated at 95 °C and pH 7 for 0, 4 and 128 days. We targeted BLG because it is one of the most robust and ubiquitous proteins in the archaeological record. We tested two reference databases (a targeted dairy protein database and the whole bovine proteome) and three digestion options (tryptic, semi-tryptic and non-specific searches) to evaluate how search space affects the identification of peptides. We also explored alternative strategies, including open search, which allows for the global identification of PTMs based on a wide precursor mass tolerance, and de novo sequencing to boost sequence coverage. We analysed the samples using Mascot, MaxQuant, Metamorpheus, pFind, Fragpipe and DeNovoGUI (pepNovo+, DirecTag, Novor), benchmarked these tools, and discuss the optimal strategy for the characterisation of ancient proteins. We also studied the physicochemical properties of BLG that correlate with bias in identification coverage.</p>
https://doi.org/10.5281/zenodo.13785293
https://doi.org/10.5281/zenodo.13785336
Palaeoproteomics; beta-lactoglobulin; False Discovery Rate; benchmarking; de novo; open search
None
Genomics and Transcriptomics, Probability and statistics
Timothy Cleland, clelandtp@si.edu, Roman Fisher, roman.fischer@ndm.ox.ac.uk, Adam Dowle, adam.dowle@york.ac.uk, William Stafford Noble, william-noble@uw.edu, wnoble@uw.edu, Kyowon Jeong, kyowon.jeong@uni-tuebingen.de, Adam Dowle [adam.dowle@york.ac.uk] suggested: chloe.baldreki@york.ac.uk; jessica.hendy@york.ac.uk, Samantha Preslee suggested: Camilla Speller - cspeller@mail.ubc.ca, Samantha Preslee suggested: Kristine K. Richter - krichter@palaeome.org, Samantha Preslee suggested: Carli Peters - peters@shh.mpg.de, Carli Peters suggested: Shevan Wilkin, email: shevan.wilkin@iem.uzh.ch, L Smith suggested: JOSHUA J COON <jcoon@chem.wisc.edu>, Shevan Wilkin suggested: Few comments, overall pretty good paper. Would be interested to see the comparison of Mascot and MaxQuant recovery in more detail, as those seem to be the most commonly used in ancient protein papers., Shevan Wilkin suggested: Figure 3 is referred to as Venn diagrams in the text, but these are in Figure 4. Also, in the text Figure 5 is referred to as Figure 4.
2024-03-12 15:17:08
Raquel Assis
Anonymous, Shevan Wilkin