Selection of minimum subsets of single nucleotide polymorphisms to capture haplotype block diversity.

Hadar I. Avi-Itzhak, Xiaoping Su, Francisco M. De La Vega

Research output: Contribution to journal › Article › peer-review

49 Scopus citations

Abstract

We present a simple numerical algorithm to select the minimal subset of SNPs required to capture the diversity of haplotype blocks or other genetic loci. This algorithm can be used to quickly select the minimum SNP subset with no loss of haplotype information. In addition, the method can be used in a more aggressive mode to further reduce the original SNP set, with minimal loss of information. We demonstrate the algorithm performance with data from over 11,000 SNPs with average spacing of 6 to 11 Kb, across all the genes of chromosomes 6, 21, and 22, genotyped on DNA samples of 45 unrelated African-Americans and 45 Caucasians from the Coriell Human Diversity Collection. With no loss of information, we reduced the number of SNPs required to capture the haplotype block diversity by 25% for the African-American and 36% for the Caucasian populations. With a maximum loss of 10% of haplotype distribution information, the SNP reduction was 38% and 49% respectively for the two populations. All computations were performed in less than 1 minute for the entire dataset used.
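
The abstract does not spell out the selection procedure, so the following is only an illustrative sketch of the lossless case, not the authors' algorithm. It assumes haplotypes in a block are given as equal-length strings of alleles and uses an exhaustive search (practical only for small blocks) to find the fewest SNP positions that still tell every distinct haplotype apart. The function name `minimal_snp_subset` and the toy block are hypothetical.

```python
from itertools import combinations


def minimal_snp_subset(haplotypes):
    """Return the smallest tuple of SNP column indices that still
    distinguishes every distinct haplotype in the block.

    Assumption: exhaustive search over subsets of increasing size,
    swapped in for illustration; it is not the published method.
    """
    distinct = sorted(set(haplotypes))
    n_snps = len(distinct[0]) if distinct else 0
    for size in range(n_snps + 1):
        for cols in combinations(range(n_snps), size):
            # Project each haplotype onto the candidate SNP columns.
            projected = {tuple(h[c] for c in cols) for h in distinct}
            # Lossless: no two distinct haplotypes collapse together.
            if len(projected) == len(distinct):
                return cols
    return tuple(range(n_snps))


# Toy block (made up): SNPs 0 and 1 are redundant, as are SNPs 2 and 3.
block = ["0000", "0011", "1100", "1111"]
subset = minimal_snp_subset(block)
print(subset)
```

On this toy block, two of the four SNPs suffice to separate all four haplotypes. A lossy variant in the spirit of the "more aggressive mode" would relax the equality check so that a bounded share of the haplotype distribution may collapse.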

Original language: English (US)
Pages (from-to): 466-477
Number of pages: 12
Journal: Pacific Symposium on Biocomputing
State: Published - 2003
Externally published: Yes

ASJC Scopus subject areas

  • General Medicine
