Correcting for the study bias associated with protein–protein interaction measurements reveals differences between protein degree distributions from different cancer types

Schaefer, Martin H; Serrano, Luis; Andrade-Navarro, Miguel A. 2015. Frontiers in Genetics, 6, 260.


Protein-protein interaction (PPI) networks are subject to multiple types of bias, some of which are rooted in technical limitations of the experimental techniques. Another source of bias is the differing frequency with which proteins have been studied for interaction partners. It is generally believed that proteins with a large number of interaction partners tend to be essential, evolutionarily conserved and involved in disease. It has been repeatedly reported that proteins driving tumor formation have a higher number of PPI partners. However, it has been noted before that the degree distribution of PPI networks is biased towards disease proteins, which tend to have been studied more often than non-disease proteins. It is unclear to what extent this study bias affects the observation that cancer proteins tend to have more PPI partners. Here, we show that the degree of a protein is a function of the number of times it has been screened for interaction partners. We present a randomization-based method that controls for this bias to decide whether a group of proteins is associated with significantly more PPI partners than the proteomic background. We apply our method to cancer proteins and observe, in contrast to previous studies, no conclusive evidence that cancer proteins have a significantly higher degree than equally often studied non-cancer proteins. Comparing proteins from different tumor types, a more complex picture emerges in which proteins of certain cancer classes (e.g. hematological cancers) have significantly more interaction partners while others (solid tumors) are associated with a smaller degree. We discuss the biological implications of these findings. Our work shows that accounting for biases in the PPI network is possible and increases the value of PPI data.
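
The abstract does not spell out the randomization procedure, so what follows is only a minimal sketch of how a study-bias-controlled comparison of this kind could be set up; it is not the authors' implementation. It assumes two hypothetical inputs, `degree` (number of PPI partners per protein) and `times_studied` (number of times each protein was screened for interaction partners), and compares the mean degree of a protein group with that of randomly drawn background proteins matched on study count. The function name, the nearest-count matching rule, and the choice of the mean as test statistic are all assumptions made for illustration.

```python
import random
from statistics import mean


def study_matched_pvalue(degree, times_studied, group, n_iter=999, seed=0):
    """Empirical one-sided p-value that `group` has a higher mean degree
    than equally often studied background proteins (illustrative sketch)."""
    rng = random.Random(seed)
    group = sorted(set(group))  # fixed order keeps runs reproducible
    in_group = set(group)

    # Index background (non-group) proteins by how often they were studied.
    by_count = {}
    for protein in degree:
        if protein not in in_group:
            by_count.setdefault(times_studied[protein], []).append(protein)
    if not by_count:
        raise ValueError("no background proteins to sample from")

    # For each group protein, build the pool of background proteins whose
    # study count is closest to its own (an exact match when one exists).
    pools = []
    for protein in group:
        target = times_studied[protein]
        nearest = min(by_count, key=lambda c: (abs(c - target), c))
        pools.append(by_count[nearest])

    observed = mean(degree[p] for p in group)

    hits = 0
    for _ in range(n_iter):
        # Pools are sampled independently, so the same background protein
        # may be drawn more than once within a single randomization.
        random_mean = mean(degree[rng.choice(pool)] for pool in pools)
        if random_mean >= observed:
            hits += 1

    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)
```

Under this sketch, a group whose members are heavily studied but no better connected than equally studied background proteins yields a large p-value, even though a naive comparison against the whole proteome would make the group look highly connected. A small p-value indicates a degree excess that study frequency alone does not explain.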