ROCS: a reproducibility index and confidence score for interaction proteomics
BACKGROUND: Affinity-Purification Mass-Spectrometry (AP-MS) provides a powerful means of identifying protein complexes and interactions. Several important challenges exist in interpreting the results of AP-MS experiments. First, the reproducibility of AP-MS experimental replicates can be low, due both to technical variability and to the dynamic nature of protein interactions in the cell. Second, the identification of true protein-protein interactions in AP-MS experiments is subject to inaccuracy due to high false negative and false positive rates. Several experimental approaches can be used to mitigate these drawbacks, including the use of replicated and control experiments and relative quantification to sensitively distinguish true interacting proteins from false ones.
RESULTS: To address the issues of reproducibility and accuracy of protein-protein interactions, we introduce a two-step method, called ROCS, which uses Indicator Proteins to select reproducible AP-MS experiments and Confidence Scores to select specific protein-protein interactions. The Indicator Proteins account for measures of protein identification as well as protein reproducibility, effectively allowing removal of outlier experiments that contribute noise and affect downstream inferences. The filtered set of experiments is then used in the Protein-Protein Interaction (PPI) scoring step. Prey proteins are scored by computing a Confidence Score, which accounts for the probability of occurrence of prey proteins in the bait experiments relative to the control experiment. The significance cutoff parameter is estimated by simultaneously controlling false positives and false negatives against metrics of false discovery rate and biological coherence, respectively. In summary, the ROCS method relies on automatic, objective criteria for parameter estimation and on error-controlled procedures. We illustrate the performance of our method by applying it to five previously published AP-MS experiments, each containing well-characterized protein interactions, allowing for systematic benchmarking of ROCS. We show that our method may be used on its own to accurately identify specific, biologically relevant protein-protein interactions, or in combination with other AP-MS scoring methods to significantly improve inferences.
CONCLUSIONS: Our method addresses important issues encountered in AP-MS datasets, making ROCS a very promising tool for this purpose, either on its own or, especially, in conjunction with other methods. We anticipate that our methodology may be used more generally in proteomics studies and databases where experimental reproducibility issues arise. The method is implemented in the R language and is freely available as an R package called "ROCS" from the CRAN repository http://cran.r-project.org/.
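The two-step idea described in the abstract (first select reproducible experiments, then score prey proteins against the control) can be illustrated with a toy sketch. This is not the ROCS algorithm or the API of the R package: the mean Jaccard similarity used for replicate filtering, the frequency-difference score, both thresholds, and all function names and protein labels are assumptions made purely for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of protein identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def filter_replicates(replicates, min_mean_similarity=0.4):
    """Step 1 (toy stand-in): keep replicates whose mean similarity
    to all other replicates reaches a hypothetical threshold."""
    kept = {}
    for name, proteins in replicates.items():
        others = [p for n, p in replicates.items() if n != name]
        if not others:
            kept[name] = proteins
            continue
        mean_sim = sum(jaccard(proteins, o) for o in others) / len(others)
        if mean_sim >= min_mean_similarity:
            kept[name] = proteins
    return kept


def occurrence_frequency(protein, experiments):
    """Fraction of experiments in which the protein was identified."""
    if not experiments:
        return 0.0
    return sum(protein in e for e in experiments) / len(experiments)


def score_preys(bait_replicates, control_replicates, cutoff=0.5):
    """Step 2 (toy stand-in): score each prey by its occurrence frequency
    in bait experiments minus its frequency in control experiments,
    and keep preys whose score reaches a hypothetical cutoff."""
    baits = list(bait_replicates.values())
    controls = list(control_replicates.values())
    candidates = set().union(*baits) if baits else set()
    scores = {
        p: occurrence_frequency(p, baits) - occurrence_frequency(p, controls)
        for p in candidates
    }
    return {p: s for p, s in scores.items() if s >= cutoff}


# Made-up example data: rep4 shares no proteins with the other replicates,
# and K also appears in every control experiment.
bait = {
    "rep1": {"A", "B", "C", "K"},
    "rep2": {"A", "B", "C"},
    "rep3": {"A", "B", "K"},
    "rep4": {"X", "Y", "Z"},
}
control = {"ctl1": {"K", "Q"}, "ctl2": {"K"}}

kept = filter_replicates(bait)
hits = score_preys(kept, control)
print(sorted(kept), sorted(hits))
```

In this made-up data the outlier replicate is dropped before scoring, and the protein that also occurs in the controls is not retained. In the published method, by contrast, the cutoff is estimated from false discovery rate and biological coherence criteria rather than fixed by hand.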
128
Dazard, Jean-Eudes J.
bc9374ee-6dcb-400b-895c-b9c4c2fdbee3
Saha, Sudipto
77b0a09f-c013-4418-b3b5-659cca46fe9d
Ewing, Rob M.
022c5b04-da20-4e55-8088-44d0dc9935ae
8 June 2012
Dazard, Jean-Eudes J., Saha, Sudipto and Ewing, Rob M.
(2012)
ROCS: a reproducibility index and confidence score for interaction proteomics.
BMC Bioinformatics, 13 (1), 128.
(doi:10.1186/1471-2105-13-128).
(PMID:22682516)
Text
1471-2105-13-128.pdf
- Version of Record
Available under License Other.
More information
Published date: 8 June 2012
Organisations:
Faculty of Natural and Environmental Sciences, Centre for Biological Sciences
Identifiers
Local EPrints ID: 345205
URI: http://eprints.soton.ac.uk/id/eprint/345205
ISSN: 1471-2105
PURE UUID: 2e85918a-c1c3-4e51-973d-0503d80dbd54
Catalogue record
Date deposited: 13 Nov 2012 12:21
Last modified: 15 Mar 2024 03:44
Contributors
Author:
Jean-Eudes J. Dazard
Author:
Sudipto Saha