TY - JOUR
T1 - Selection of minimum subsets of single nucleotide polymorphisms to capture haplotype block diversity
AU - Avi-Itzhak, Hadar I.
AU - Su, Xiaoping
AU - De La Vega, Francisco M.
N1 - Copyright:
This record is sourced from MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
PY - 2003
Y1 - 2003
N2 - We present a simple numerical algorithm to select the minimal subset of SNPs required to capture the diversity of haplotype blocks or other genetic loci. This algorithm can be used to quickly select the minimum SNP subset with no loss of haplotype information. In addition, the method can be used in a more aggressive mode to further reduce the original SNP set, with minimal loss of information. We demonstrate the algorithm performance with data from over 11,000 SNPs with average spacing of 6 to 11 Kb, across all the genes of chromosomes 6, 21, and 22, genotyped on DNA samples of 45 unrelated African-Americans and 45 Caucasians from the Coriell Human Diversity Collection. With no loss of information, we reduced the number of SNPs required to capture the haplotype block diversity by 25% for the African-American and 36% for the Caucasian populations. With a maximum loss of 10% of haplotype distribution information, the SNP reduction was 38% and 49% respectively for the two populations. All computations were performed in less than 1 minute for the entire dataset used.
UR - http://www.scopus.com/inward/record.url?scp=0042128704&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0042128704&partnerID=8YFLogxK
M3 - Article
C2 - 12603050
AN - SCOPUS:0042128704
SN - 2335-6936
SP - 466
EP - 477
JO - Pacific Symposium on Biocomputing
JF - Pacific Symposium on Biocomputing
ER -