TY - JOUR
T1 - PAND
T2 - A distribution to identify functional linkage from networks with preferential attachment property
AU - Li, Hua
AU - Tong, Pan
AU - Gallegos, Juan
AU - Dimmer, Emily
AU - Cai, Guoshuai
AU - Molldrem, Jeffrey J.
AU - Liang, Shoudan
N1 - Funding Information:
We thank Yuan Ji and Zhifeng Shao for thoughtful discussion. We also thank Hualei Kong for polishing figures. This research was funded by a training fellowship from the Keck Center for Quantitative Biomedical Sciences of the Gulf Coast Consortia, on the Computational Cancer Biology Training Program from the Cancer Prevention & Research Institute of Texas (CPRIT No. RP101489). This research was also funded by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.
Publisher Copyright:
© 2015 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2015/7/9
Y1 - 2015/7/9
N2 - Technology advances have immensely accelerated large-scale mapping of biological networks, which necessitates the development of accurate and powerful network-based algorithms to make functional inferences. A prevailing approach is to leverage functions of neighboring nodes to predict unknown molecular function. However, existing neighbor-based algorithms have ignored the scale-free property hidden in many biological networks. By assuming that neighbor sharing is constrained by the preferential attachment property, we developed a Preferential Attachment based common Neighbor Distribution (PAND) to calculate the probability of the neighbor-sharing event between any two nodes in scale-free networks, which nearly perfectly matched the observed probability in simulations. By applying PAND to a human protein-protein interaction (PPI) network, we showed that smaller probabilities represented closer functional linkages between proteins. With the PAND-derived linkages, we were able to build new networks where the links are more functionally reliable than those of the human PPI network. We then applied simple annotation schemes to a PAND-derived network to make reliable functional predictions for proteins. We also developed an R package called PANDA (PAND-derived functional Associations) to implement the methods proposed in this study. In conclusion, PAND is a useful distribution to calculate the probability of the neighbor-sharing events in scale-free networks. With PAND, we are able to extract reliable functional linkages from real biological networks and build new networks that are better bases for further functional inference.
AB - Technology advances have immensely accelerated large-scale mapping of biological networks, which necessitates the development of accurate and powerful network-based algorithms to make functional inferences. A prevailing approach is to leverage functions of neighboring nodes to predict unknown molecular function. However, existing neighbor-based algorithms have ignored the scale-free property hidden in many biological networks. By assuming that neighbor sharing is constrained by the preferential attachment property, we developed a Preferential Attachment based common Neighbor Distribution (PAND) to calculate the probability of the neighbor-sharing event between any two nodes in scale-free networks, which nearly perfectly matched the observed probability in simulations. By applying PAND to a human protein-protein interaction (PPI) network, we showed that smaller probabilities represented closer functional linkages between proteins. With the PAND-derived linkages, we were able to build new networks where the links are more functionally reliable than those of the human PPI network. We then applied simple annotation schemes to a PAND-derived network to make reliable functional predictions for proteins. We also developed an R package called PANDA (PAND-derived functional Associations) to implement the methods proposed in this study. In conclusion, PAND is a useful distribution to calculate the probability of the neighbor-sharing events in scale-free networks. With PAND, we are able to extract reliable functional linkages from real biological networks and build new networks that are better bases for further functional inference.
UR - http://www.scopus.com/inward/record.url?scp=84941354445&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84941354445&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0127968
DO - 10.1371/journal.pone.0127968
M3 - Article
C2 - 26158709
AN - SCOPUS:84941354445
SN - 1932-6203
VL - 10
JO - PLOS ONE
JF - PLOS ONE
IS - 7
M1 - e0127968
ER -