A statistical framework to discover true associations from multiprotein complex pull-down proteomics data sets

Academic Article


  • Experimental processes to collect and process proteomics data are increasingly complex, and the computational methods to assess the quality and significance of these data remain unsophisticated. These challenges have led to many biological oversights and computational misconceptions. We developed an empirical Bayes model to analyze multiprotein complex (MPC) proteomics data derived from peptide mass spectrometry detections of purified protein complex pull-down experiments. Using our model and two yeast proteomics data sets, we estimated that there should be an average of about 20 true associations per MPC, almost 10 times as high as was previously estimated. For data sets generated to mimic a real proteome, our model achieved on average 80% sensitivity in detecting true associations, as compared with the 3% sensitivity in previous work, while maintaining a comparable false discovery rate of 0.3%. Cross-examination of our results with protein complexes confirmed by various experimental techniques demonstrates that many true associations that cannot be identified by the previous approach are identified by our method. © 2006 Wiley-Liss, Inc.
  • Authors

    Digital Object Identifier (doi)

    Author List

  • Shen C; Li L; Chen JY
  • Start Page

  • 436
  • End Page

  • 443
  • Volume

  • 64
  • Issue

  • 2