TY - JOUR
T1 - Making a haplotype catalog with estimated frequencies based on SNP homozygotes
AU - Yamaguchi-Kabata, Yumi
AU - Tsunoda, Tatsuhiko
AU - Takahashi, Atsushi
AU - Hosono, Naoya
AU - Kubo, Michiaki
AU - Nakamura, Yusuke
AU - Kamatani, Naoyuki
N1 - Copyright 2011 Elsevier B.V., All rights reserved.
PY - 2010/8
Y1 - 2010/8
AB - Understanding the structure and frequencies of haplotypes is important for associating genetic polymorphisms with a given trait and for inferring the genetic genealogy of alleles in a population. Single nucleotide polymorphism (SNP) haplotypes can be determined without ambiguity when an individual does not have more than one heterozygous site in a given genomic region. Using genome-wide SNP genotypes for 3397 individuals from the Japanese population, we detected SNP homozygotes in the genomic regions of 1955 genes, determined haplotypes, and examined the efficiency of haplotype frequency estimation based on the proportion of SNP homozygotes in the sample. The estimated haplotype frequencies were very similar to the frequencies obtained by two statistical methods, PHASE and SNPHAP. We applied this approach to the genomic regions of 11 351 genes, and the results suggested that the sum of the frequencies of unobserved haplotypes is negligible for an analysis of a 100 kb genomic region with 20 SNPs. Determination of haplotypes from homozygotes using genotype data from thousands of individuals, without a long computation time, appears to be useful for detecting real haplotypes including some low-frequency haplotypes. In addition, the unambiguously determined haplotypes with their estimated frequencies can be used as a catalog of haplotypes for the population, which is useful for the design of genome-wide association studies.
UR - http://www.scopus.com/inward/record.url?scp=77957569456&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77957569456&partnerID=8YFLogxK
U2 - 10.1038/jhg.2010.56
DO - 10.1038/jhg.2010.56
M3 - Article
C2 - 20485442
AN - SCOPUS:77957569456
SN - 1434-5161
VL - 55
SP - 500
EP - 506
JO - Journal of Human Genetics
JF - Journal of Human Genetics
IS - 8
ER -