TY - GEN
T1 - A Conditional Autoregressive Model for Detecting Natural Selection in Protein-Coding DNA Sequences
AU - Fan, Yu
AU - Wu, Rui
AU - Chen, Ming Hui
AU - Kuo, Lynn
AU - Lewis, Paul O.
N1 - Copyright 2013 Elsevier B.V., All rights reserved.
PY - 2013
Y1 - 2013
N2 - Phylogenetics, the study of evolutionary relationships among groups of organisms, plays an important role in modern biological research, with applications such as genomic comparison, detecting orthology and paralogy, estimating divergence times, reconstructing ancient proteins, identifying mutations likely to be associated with disease, determining the identity of new pathogens, and finding the residues that are important to natural selection. Given an alignment of protein-coding DNA sequences, most methods for detecting natural selection rely on estimating the codon-specific nonsynonymous/synonymous rate ratios (dN/dS). Here, we describe an approach to modeling variation in dN/dS using a conditional autoregressive (CAR) model. The CAR model relaxes the assumption, made in most contemporary phylogenetic models, that sites in molecular sequences evolve independently. By incorporating the information stored in a Protein Data Bank (PDB) file, the CAR model estimates dN/dS based on the three-dimensional structure of the protein. We implement the model in a fully Bayesian framework, treating all model parameters as random variables, and use NVIDIA's parallel computing architecture (CUDA) to accelerate the calculation. Our results from analyzing an empirical abalone sperm lysin data set are in accordance with previous findings.
UR - http://www.scopus.com/inward/record.url?scp=84886080775&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84886080775&partnerID=8YFLogxK
U2 - 10.1007/978-1-4614-7846-1_17
DO - 10.1007/978-1-4614-7846-1_17
M3 - Conference contribution
AN - SCOPUS:84886080775
SN - 9781461478454
T3 - Springer Proceedings in Mathematics and Statistics
SP - 203
EP - 212
BT - Topics in Applied Statistics - 2012 Symposium of the International Chinese Statistical Association
T2 - 21st Symposium of the International Chinese Statistical Association, ICSA 2012
Y2 - 23 June 2012 through 26 June 2012
ER -