A practical method to detect SNVs and indels from whole genome and exome sequencing data

Daichi Shigemizu, Akihiro Fujimoto, Shintaro Akiyama, Tetsuo Abe, Kaoru Nakano, Keith A. Boroevich, Yujiro Yamamoto, Mayuko Furuta, Michiaki Kubo, Hidewaki Nakagawa, Tatsuhiko Tsunoda

Research output: Contribution to journal (article)

21 Citations (Scopus)

Abstract

The recent development of massively parallel sequencing technology has allowed the creation of comprehensive catalogs of genetic variation. However, due to the relatively high sequencing error rate for short read sequence data, sophisticated analysis methods are required to obtain high-quality variant calls. Here, we developed a probabilistic multinomial method for the detection of single nucleotide variants (SNVs) as well as short insertions and deletions (indels) in whole genome sequencing (WGS) and whole exome sequencing (WES) data for single sample calling. Evaluation with DNA genotyping arrays revealed a concordance rate of 99.98% for WGS calls and 99.99% for WES calls. Sanger sequencing of the discordant calls determined the false positive and false negative rates for the WGS (0.0068% and 0.17%) and WES (0.0036% and 0.0084%) datasets. Furthermore, short indels were identified with high accuracy (WGS: 94.7%, WES: 97.3%). We believe our method can contribute to the greater understanding of human diseases.
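
The abstract names a probabilistic multinomial model but does not spell it out, so the following is only a minimal sketch of what a multinomial genotype likelihood at a single site could look like, not the authors' implementation. The uniform per-base error rate `error`, the restriction to diploid genotypes over A/C/G/T, and the function names `base_probs`, `log_multinomial`, and `call_genotype` are all assumptions made for illustration.

```python
from math import lgamma, log

BASES = "ACGT"

def base_probs(genotype, error):
    """Expected probability of observing each base given a diploid genotype.

    Assumption: each read samples one of the two alleles with probability 0.5,
    and a sequencing error turns it into any of the 3 other bases uniformly.
    """
    probs = []
    for base in BASES:
        p = 0.0
        for allele in genotype:
            p += 0.5 * ((1.0 - error) if base == allele else error / 3.0)
        probs.append(p)
    return probs

def log_multinomial(counts, probs):
    """Log of the multinomial probability of `counts` under `probs`."""
    total = sum(counts)
    out = lgamma(total + 1)
    for k, p in zip(counts, probs):
        out -= lgamma(k + 1)
        if k:
            out += k * log(p)
    return out

def call_genotype(counts, error=0.01):
    """Return the maximum-likelihood genotype and all log-likelihoods.

    `counts` maps a base to the number of reads supporting it at one site.
    `error` must be > 0 so that every base has non-zero probability.
    """
    observed = [counts.get(base, 0) for base in BASES]
    genotypes = [(a, b) for i, a in enumerate(BASES) for b in BASES[i:]]
    scores = {g: log_multinomial(observed, base_probs(g, error)) for g in genotypes}
    best = max(scores, key=scores.get)
    return best, scores

if __name__ == "__main__":
    # Hypothetical pileup: 15 reads with A, 14 with G, 1 with C.
    best, _ = call_genotype({"A": 15, "G": 14, "C": 1})
    print(best)
```

In this sketch a site with roughly balanced support for two bases is scored as heterozygous, while a lone mismatching read is absorbed by the error term. A real caller would also weigh base and mapping qualities and apply a prior over genotypes, none of which is modelled here.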

Original language: English
Article number: 2161
Journal: Scientific Reports
Volume: 3
DOI: https://doi.org/10.1038/srep02161
Publication status: Published - 23-07-2013

Fingerprint

Exome
Nucleotides
Genome
High-Throughput Nucleotide Sequencing
Oligonucleotide Array Sequence Analysis
Technology

All Science Journal Classification (ASJC) codes

  • General

Cite this

Shigemizu, D., Fujimoto, A., Akiyama, S., Abe, T., Nakano, K., Boroevich, K. A., ... Tsunoda, T. (2013). A practical method to detect SNVs and indels from whole genome and exome sequencing data. Scientific Reports, 3, [2161]. https://doi.org/10.1038/srep02161
@article{ea9a23245f954811b8b7a642a38828de,
title = "A practical method to detect SNVs and indels from whole genome and exome sequencing data",
abstract = "The recent development of massively parallel sequencing technology has allowed the creation of comprehensive catalogs of genetic variation. However, due to the relatively high sequencing error rate for short read sequence data, sophisticated analysis methods are required to obtain high-quality variant calls. Here, we developed a probabilistic multinomial method for the detection of single nucleotide variants (SNVs) as well as short insertions and deletions (indels) in whole genome sequencing (WGS) and whole exome sequencing (WES) data for single sample calling. Evaluation with DNA genotyping arrays revealed a concordance rate of 99.98{\%} for WGS calls and 99.99{\%} for WES calls. Sanger sequencing of the discordant calls determined the false positive and false negative rates for the WGS (0.0068{\%} and 0.17{\%}) and WES (0.0036{\%} and 0.0084{\%}) datasets. Furthermore, short indels were identified with high accuracy (WGS: 94.7{\%}, WES: 97.3{\%}). We believe our method can contribute to the greater understanding of human diseases.",
author = "Daichi Shigemizu and Akihiro Fujimoto and Shintaro Akiyama and Tetsuo Abe and Kaoru Nakano and Boroevich, {Keith A.} and Yujiro Yamamoto and Mayuko Furuta and Michiaki Kubo and Hidewaki Nakagawa and Tatsuhiko Tsunoda",
year = "2013",
month = "7",
day = "23",
doi = "10.1038/srep02161",
language = "English",
volume = "3",
journal = "Scientific Reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",

}

TY - JOUR

T1 - A practical method to detect SNVs and indels from whole genome and exome sequencing data

AU - Shigemizu, Daichi

AU - Fujimoto, Akihiro

AU - Akiyama, Shintaro

AU - Abe, Tetsuo

AU - Nakano, Kaoru

AU - Boroevich, Keith A.

AU - Yamamoto, Yujiro

AU - Furuta, Mayuko

AU - Kubo, Michiaki

AU - Nakagawa, Hidewaki

AU - Tsunoda, Tatsuhiko

PY - 2013/7/23

Y1 - 2013/7/23

AB - The recent development of massively parallel sequencing technology has allowed the creation of comprehensive catalogs of genetic variation. However, due to the relatively high sequencing error rate for short read sequence data, sophisticated analysis methods are required to obtain high-quality variant calls. Here, we developed a probabilistic multinomial method for the detection of single nucleotide variants (SNVs) as well as short insertions and deletions (indels) in whole genome sequencing (WGS) and whole exome sequencing (WES) data for single sample calling. Evaluation with DNA genotyping arrays revealed a concordance rate of 99.98% for WGS calls and 99.99% for WES calls. Sanger sequencing of the discordant calls determined the false positive and false negative rates for the WGS (0.0068% and 0.17%) and WES (0.0036% and 0.0084%) datasets. Furthermore, short indels were identified with high accuracy (WGS: 94.7%, WES: 97.3%). We believe our method can contribute to the greater understanding of human diseases.

UR - http://www.scopus.com/inward/record.url?scp=84880319487&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880319487&partnerID=8YFLogxK

U2 - 10.1038/srep02161

DO - 10.1038/srep02161

M3 - Article

C2 - 23831772

AN - SCOPUS:84880319487

VL - 3

JO - Scientific Reports

JF - Scientific Reports

SN - 2045-2322

M1 - 2161

ER -
