A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework

Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium

Research output: Contribution to journal › Article

Abstract

Background: There is considerable evidence that many complex traits have a partially shared genetic basis, termed pleiotropy. It is therefore useful to consider integrating genome-wide association study (GWAS) data across several traits, usually at the summary statistic level. A major practical challenge arises when these GWAS have overlapping subjects. This is particularly an issue when estimating pleiotropy using methods that condition the significance of one trait on the significance of a second, such as the covariate-modulated false discovery rate (cmfdr). Results: We propose a method for correcting for sample overlap at the summary statistic level. We quantify the expected amount of spurious correlation between the summary statistics from two GWAS due to sample overlap, and use this estimated correlation in a simple linear correction that adjusts the joint distribution of test statistics from the two GWAS. The correction is appropriate for GWAS with case-control or quantitative outcomes. Our simulations and data example show that without correcting for sample overlap, the cmfdr is not properly controlled, leading to an excessive number of false discoveries and an excessive false discovery proportion. Our correction for sample overlap is effective in that it restores proper control of the false discovery rate, at very little loss in power. Conclusions: With our proposed correction, it is possible to integrate GWAS summary statistics with overlapping samples in a statistical framework that is dependent on the joint distribution of the two GWAS.
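
The idea in the Results paragraph can be illustrated with a minimal sketch. It is not the authors' code, and the function names `overlap_correlation` and `decorrelate` are hypothetical. It assumes quantitative traits, null SNPs, and the standard approximation that overlap induces a z-score correlation of n_overlap * r / sqrt(n1 * n2), where r is the phenotypic correlation in the shared subjects. The linear correction shown is a symmetric whitening by the inverse square root of the 2x2 correlation matrix, which is one possible choice and may differ in detail from the published method.

```python
import numpy as np


def overlap_correlation(n1, n2, n_overlap, phenotype_corr=1.0):
    """Approximate null correlation between z-scores of two GWAS.

    Assumption (not from the source): quantitative traits, with
    phenotype_corr the trait correlation among the shared subjects.
    """
    if n_overlap > min(n1, n2):
        raise ValueError("n_overlap cannot exceed either sample size")
    return n_overlap * phenotype_corr / np.sqrt(n1 * n2)


def decorrelate(z1, z2, rho):
    """Apply a linear correction that removes correlation rho.

    Multiplies the stacked z-scores by C^(-1/2), where
    C = [[1, rho], [rho, 1]], so that null z-scores end up
    uncorrelated with unit variance.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    corr = np.array([[1.0, rho], [rho, 1.0]])
    # Symmetric inverse square root via the eigendecomposition of C.
    eigvals, eigvecs = np.linalg.eigh(corr)
    inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    z1_adj, z2_adj = inv_sqrt @ np.vstack([z1, z2])
    return z1_adj, z2_adj
```

In use, one would estimate `rho` from the sample sizes (or from the data), pass both z-score vectors through `decorrelate`, and then feed the adjusted statistics to a conditional method such as the cmfdr. Case-control designs need a different expression for the expected correlation, which this sketch does not cover.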

Original language: English
Article number: 494
Journal: BMC Genomics
Volume: 19
Issue number: 1
DOI: https://doi.org/10.1186/s12864-018-4859-7
Publication status: Published - 25-06-2018

Fingerprint

Genome-Wide Association Study

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Genetics

Cite this

Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium (2018). A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework. BMC Genomics, 19(1), [494]. https://doi.org/10.1186/s12864-018-4859-7
Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium. / A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework. In: BMC Genomics. 2018 ; Vol. 19, No. 1.
@article{e5f72d51bb014bff883b0c687f1713ba,
title = "A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework",
abstract = "Background: There is considerable evidence that many complex traits have a partially shared genetic basis, termed pleiotropy. It is therefore useful to consider integrating genome-wide association study (GWAS) data across several traits, usually at the summary statistic level. A major practical challenge arises when these GWAS have overlapping subjects. This is particularly an issue when estimating pleiotropy using methods that condition the significance of one trait on the significance of a second, such as the covariate-modulated false discovery rate (cmfdr). Results: We propose a method for correcting for sample overlap at the summary statistic level. We quantify the expected amount of spurious correlation between the summary statistics from two GWAS due to sample overlap, and use this estimated correlation in a simple linear correction that adjusts the joint distribution of test statistics from the two GWAS. The correction is appropriate for GWAS with case-control or quantitative outcomes. Our simulations and data example show that without correcting for sample overlap, the cmfdr is not properly controlled, leading to an excessive number of false discoveries and an excessive false discovery proportion. Our correction for sample overlap is effective in that it restores proper control of the false discovery rate, at very little loss in power. Conclusions: With our proposed correction, it is possible to integrate GWAS summary statistics with overlapping samples in a statistical framework that is dependent on the joint distribution of the two GWAS.",
author = "{Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium} and Marissa LeBlanc and Verena Zuber and Thompson, {Wesley K.} and Andreassen, {Ole A.} and Arnoldo Frigessi and Andreassen, {Bettina Kulle} and Stephan Ripke and Neale, {Benjamin M.} and Aiden Corvin and Walters, {James T.R.} and Farh, {Kai How} and Phil Lee and Brendan Bulik-Sullivan and Collier, {David A.} and Hailiang Huang and Pers, {Tune H.} and Ingrid Agartz and Esben Agerbo and Margot Albus and Madeline Alexander and Farooq Amin and Bacanu, {Silviu A.} and Martin Begemann and Belliveau, {Richard A.} and Judit Bene and Elizabeth Bevilacqua and Bigdeli, {Tim B.} and Black, {Donald W.} and Richard Bruggeman and Buccola, {Nancy G.} and Buckner, {Randy L.} and Wiepke Cahn and Guiqing Cai and Cairns, {Murray J.} and Dominique Campion and Cantor, {Rita M.} and Carr, {Vaughan J.} and Noa Carrera and Catts, {Stanley V.} and Chambert, {Kimberly D.} and Chan, {Raymond C.K.} and Chen, {Ronald Y.L.} and Chen, {Eric Y.H.} and Wei Cheng and Cheung, {Eric F.C.} and Chong, {Siow Ann} and Cloninger, {C. Robert} and David Cohen and Masashi Ikeda and Nakao Iwata",
year = "2018",
month = "6",
day = "25",
doi = "10.1186/s12864-018-4859-7",
language = "English",
volume = "19",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",
number = "1",
pages = "494",
}

Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium 2018, 'A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework' BMC Genomics, vol. 19, no. 1, 494. https://doi.org/10.1186/s12864-018-4859-7

A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework. / Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium.

In: BMC Genomics, Vol. 19, No. 1, 494, 25.06.2018.


TY - JOUR

T1 - A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework

AU - Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium

AU - LeBlanc, Marissa

AU - Zuber, Verena

AU - Thompson, Wesley K.

AU - Andreassen, Ole A.

AU - Frigessi, Arnoldo

AU - Andreassen, Bettina Kulle

AU - Ripke, Stephan

AU - Neale, Benjamin M.

AU - Corvin, Aiden

AU - Walters, James T.R.

AU - Farh, Kai How

AU - Lee, Phil

AU - Bulik-Sullivan, Brendan

AU - Collier, David A.

AU - Huang, Hailiang

AU - Pers, Tune H.

AU - Agartz, Ingrid

AU - Agerbo, Esben

AU - Albus, Margot

AU - Alexander, Madeline

AU - Amin, Farooq

AU - Bacanu, Silviu A.

AU - Begemann, Martin

AU - Belliveau, Richard A.

AU - Bene, Judit

AU - Bevilacqua, Elizabeth

AU - Bigdeli, Tim B.

AU - Black, Donald W.

AU - Bruggeman, Richard

AU - Buccola, Nancy G.

AU - Buckner, Randy L.

AU - Cahn, Wiepke

AU - Cai, Guiqing

AU - Cairns, Murray J.

AU - Campion, Dominique

AU - Cantor, Rita M.

AU - Carr, Vaughan J.

AU - Carrera, Noa

AU - Catts, Stanley V.

AU - Chambert, Kimberly D.

AU - Chan, Raymond C.K.

AU - Chen, Ronald Y.L.

AU - Chen, Eric Y.H.

AU - Cheng, Wei

AU - Cheung, Eric F.C.

AU - Chong, Siow Ann

AU - Cloninger, C. Robert

AU - Cohen, David

AU - Ikeda, Masashi

AU - Iwata, Nakao

PY - 2018/6/25

Y1 - 2018/6/25

N2 - Background: There is considerable evidence that many complex traits have a partially shared genetic basis, termed pleiotropy. It is therefore useful to consider integrating genome-wide association study (GWAS) data across several traits, usually at the summary statistic level. A major practical challenge arises when these GWAS have overlapping subjects. This is particularly an issue when estimating pleiotropy using methods that condition the significance of one trait on the significance of a second, such as the covariate-modulated false discovery rate (cmfdr). Results: We propose a method for correcting for sample overlap at the summary statistic level. We quantify the expected amount of spurious correlation between the summary statistics from two GWAS due to sample overlap, and use this estimated correlation in a simple linear correction that adjusts the joint distribution of test statistics from the two GWAS. The correction is appropriate for GWAS with case-control or quantitative outcomes. Our simulations and data example show that without correcting for sample overlap, the cmfdr is not properly controlled, leading to an excessive number of false discoveries and an excessive false discovery proportion. Our correction for sample overlap is effective in that it restores proper control of the false discovery rate, at very little loss in power. Conclusions: With our proposed correction, it is possible to integrate GWAS summary statistics with overlapping samples in a statistical framework that is dependent on the joint distribution of the two GWAS.

AB - Background: There is considerable evidence that many complex traits have a partially shared genetic basis, termed pleiotropy. It is therefore useful to consider integrating genome-wide association study (GWAS) data across several traits, usually at the summary statistic level. A major practical challenge arises when these GWAS have overlapping subjects. This is particularly an issue when estimating pleiotropy using methods that condition the significance of one trait on the significance of a second, such as the covariate-modulated false discovery rate (cmfdr). Results: We propose a method for correcting for sample overlap at the summary statistic level. We quantify the expected amount of spurious correlation between the summary statistics from two GWAS due to sample overlap, and use this estimated correlation in a simple linear correction that adjusts the joint distribution of test statistics from the two GWAS. The correction is appropriate for GWAS with case-control or quantitative outcomes. Our simulations and data example show that without correcting for sample overlap, the cmfdr is not properly controlled, leading to an excessive number of false discoveries and an excessive false discovery proportion. Our correction for sample overlap is effective in that it restores proper control of the false discovery rate, at very little loss in power. Conclusions: With our proposed correction, it is possible to integrate GWAS summary statistics with overlapping samples in a statistical framework that is dependent on the joint distribution of the two GWAS.

UR - http://www.scopus.com/inward/record.url?scp=85049066693&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85049066693&partnerID=8YFLogxK

U2 - 10.1186/s12864-018-4859-7

DO - 10.1186/s12864-018-4859-7

M3 - Article

VL - 19

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

IS - 1

M1 - 494

ER -

Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium. A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework. BMC Genomics. 2018 Jun 25;19(1). 494. https://doi.org/10.1186/s12864-018-4859-7