PlatinumCNV: A Bayesian Gaussian mixture model for genotyping copy number polymorphisms using SNP array signal intensity data

Natsuhiko Kumasaka, Hironori Fujisawa, Naoya Hosono, Yukinori Okada, Atsushi Takahashi, Yusuke Nakamura, Michiaki Kubo, Naoyuki Kamatani

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

We present a statistical model for allele-specific patterns of copy number polymorphisms (CNPs) in commercial single nucleotide polymorphism (SNP) array data. The model is based on the observation that fluorescent signal intensities tend to cluster into clouds of similar allele-specific copy number (ASCN) genotypes at each SNP locus. Because instrumental errors blur these clusters, our model allows cluster memberships to overlap, according to a Bayesian Gaussian mixture model (GMM). This approach is flexible, allowing for both absolute scale differences and X/Y scale imbalances of fluorescent signal intensities. The resulting model is also robust toward unobserved ASCN genotypes, which can be problematic for ordinary GMMs. We illustrated the utility of the model by applying it to commercial SNP array intensity data obtained from the Illumina HumanHap 610K platform. We retrieved more than 4,000 allele-specific CNPs, though 99% of them showed rather simple allele-specific CNP patterns with only a single aneuploid haplotype among the normal haplotypes. Genotyping accuracy was assessed by two approaches: quantitative PCR and replicated subjects. Both approaches yielded mean genotyping error rates of 1%. We also carried out a preliminary genome-wide association study of three hematological traits. The results suggest that the model could form the foundation for new, more effective statistical methods for mapping both disease genes and quantitative trait loci with genome-wide CNPs. The methods described in this work are implemented in a software package, PlatinumCNV, available on the Internet.
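The idea of overlapping cluster memberships can be illustrated with a minimal sketch. This is not the PlatinumCNV model itself: it assumes three hypothetical genotype clusters (AA, AB, BB) in (X, Y) intensity space with made-up centres and diagonal covariances, and it uses a symmetric Dirichlet prior on the mixing weights as one simple Bayesian device that keeps an unobserved genotype cluster from collapsing to zero weight. All names and numbers below are assumptions chosen for illustration.

```python
import math


def log_gauss_diag(point, mean, var):
    """Log density of a Gaussian with diagonal covariance at `point`."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (p - m) ** 2 / v)
        for p, m, v in zip(point, mean, var)
    )


def dirichlet_mean_weights(counts, alpha=1.0):
    """Posterior mean of mixing weights under a symmetric Dirichlet(alpha) prior.

    Clusters with zero observed members still receive a small positive weight.
    """
    total = sum(counts) + alpha * len(counts)
    return [(n + alpha) / total for n in counts]


def responsibilities(point, weights, means, variances):
    """Soft (overlapping) cluster memberships for one intensity pair."""
    logs = [
        math.log(w) + log_gauss_diag(point, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp normalisation for numerical stability.
    top = max(logs)
    norm = top + math.log(sum(math.exp(l - top) for l in logs))
    return [math.exp(l - norm) for l in logs]


# Hypothetical cluster centres for genotypes AA, AB, BB (illustrative values only).
means = [(1.0, 0.1), (0.6, 0.6), (0.1, 1.0)]
variances = [(0.02, 0.02)] * 3

# Hypothetical hard-assignment counts; the BB genotype is unobserved in this sample.
counts = [40, 10, 0]
weights = dirichlet_mean_weights(counts, alpha=1.0)

# A sample lying between the AA and AB clouds gets a share of both memberships.
resp = responsibilities((0.8, 0.35), weights, means, variances)
print(weights)
print(resp)
```

In this sketch the prior pseudo-count `alpha` plays the stabilising role: a plain maximum-likelihood mixture would assign the unobserved cluster a weight of zero, whereas here it stays available for samples that later fall near it.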

Original language: English
Pages (from-to): 831-844
Number of pages: 14
Journal: Genetic Epidemiology
Volume: 35
Issue number: 8
DOIs: https://doi.org/10.1002/gepi.20633
Publication status: Published - 01-12-2011

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Genetics (clinical)

  • Cite this

    Kumasaka, N., Fujisawa, H., Hosono, N., Okada, Y., Takahashi, A., Nakamura, Y., Kubo, M., & Kamatani, N. (2011). PlatinumCNV: A Bayesian Gaussian mixture model for genotyping copy number polymorphisms using SNP array signal intensity data. Genetic Epidemiology, 35(8), 831-844. https://doi.org/10.1002/gepi.20633