Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies

Daniel J. Schaid, Jason P. Sinnwell, Gregory D. Jenkins, Shannon K. McDonnell, James N. Ingle, Michiaki Kubo, Paul E. Goss, Joseph P. Costantino, D. Lawrence Wickerham, Richard M. Weinshilboum

Research output: Contribution to journal › Article

23 Citations (Scopus)

Abstract

Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses.
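
The abstract's pipeline (SNPs combined into genes, genes into gene-sets, then step-down adjusted P-values) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a quantitative trait with an intercept-only null model, uses a sum of squared standardized SNP score statistics as a stand-in for the paper's gene-level statistic, and uses Westfall-Young step-down maxT over trait permutations as a stand-in for the paper's adjustment strategy. All function and variable names (`snp_score_z`, `gene_stat`, `gene_set_stat`, `step_down_max_t`, `scan_gene_sets`) are hypothetical.

```python
import random
from statistics import fmean


def snp_score_z(y, g):
    """Standardized score statistic for one SNP (intercept-only linear null model)."""
    y_bar, g_bar = fmean(y), fmean(g)
    u = sum((gi - g_bar) * (yi - y_bar) for gi, yi in zip(g, y))  # score
    var_y = sum((yi - y_bar) ** 2 for yi in y) / len(y)           # null residual variance
    v = var_y * sum((gi - g_bar) ** 2 for gi in g)                # null variance of the score
    return u / v ** 0.5 if v > 0 else 0.0


def gene_stat(y, snps):
    """Assumed gene statistic: sum of squared SNP z-scores (never negative)."""
    return sum(snp_score_z(y, g) ** 2 for g in snps)


def gene_set_stat(y, genes, gene_set):
    """Sum of gene statistics, so opposite-direction gene effects cannot cancel."""
    return sum(gene_stat(y, genes[name]) for name in gene_set)


def step_down_max_t(observed, permuted):
    """Westfall-Young step-down maxT adjusted p-values (larger = more extreme)."""
    m = len(observed)
    order = sorted(range(m), key=lambda j: -observed[j])  # most extreme first
    counts = [0] * m
    for row in permuted:
        running_max = float("-inf")
        # Successive maxima, built up from the least extreme hypothesis.
        for rank in range(m - 1, -1, -1):
            running_max = max(running_max, row[order[rank]])
            if running_max >= observed[order[rank]]:
                counts[rank] += 1
    adjusted, previous = [0.0] * m, 0.0
    for rank in range(m):
        previous = max(previous, counts[rank] / len(permuted))  # enforce monotonicity
        adjusted[order[rank]] = previous
    return adjusted


def scan_gene_sets(y, genes, gene_sets, n_perm=200, seed=0):
    """Permute the trait only; genotypes stay fixed, which preserves LD."""
    rng = random.Random(seed)
    names = list(gene_sets)
    observed = [gene_set_stat(y, genes, gene_sets[s]) for s in names]
    permuted = []
    for _ in range(n_perm):
        y_perm = list(y)
        rng.shuffle(y_perm)
        permuted.append([gene_set_stat(y_perm, genes, gene_sets[s]) for s in names])
    return dict(zip(names, step_down_max_t(observed, permuted)))
```

For example, with `y = [0.0, 1.0, 2.0, 3.0]` and `genes = {"A": [[0, 1, 2, 3]], "B": [[3, 2, 1, 0]]}`, each gene has statistic 4.0 and `gene_set_stat(y, genes, ["A", "B"])` is 8.0, even though the two genes act in opposite directions. Because a raw sum grows with the number of genes in a set, a real analysis would likely standardize each gene-set statistic (or work with per-set P-values) before taking maxima; the sketch omits that step.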

Original language: English
Pages (from-to): 3-16
Number of pages: 14
Journal: Genetic Epidemiology
Volume: 36
Issue number: 1
DOIs: https://doi.org/10.1002/gepi.20632
Publication status: Published - 01-01-2012
Externally published: Yes

Fingerprint

Gene Ontology
Genome-Wide Association Study
Genes
Single Nucleotide Polymorphism
Linkage Disequilibrium
Encyclopedias

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Genetics (clinical)

Cite this

Schaid, D. J., Sinnwell, J. P., Jenkins, G. D., McDonnell, S. K., Ingle, J. N., Kubo, M., ... Weinshilboum, R. M. (2012). Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies. Genetic Epidemiology, 36(1), 3-16. https://doi.org/10.1002/gepi.20632
Schaid, Daniel J. ; Sinnwell, Jason P. ; Jenkins, Gregory D. ; McDonnell, Shannon K. ; Ingle, James N. ; Kubo, Michiaki ; Goss, Paul E. ; Costantino, Joseph P. ; Wickerham, D. Lawrence ; Weinshilboum, Richard M. / Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies. In: Genetic Epidemiology. 2012 ; Vol. 36, No. 1. pp. 3-16.
@article{73f661d309e3493eaff1e9882f6ca906,
title = "Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies",
author = "Schaid, {Daniel J.} and Sinnwell, {Jason P.} and Jenkins, {Gregory D.} and McDonnell, {Shannon K.} and Ingle, {James N.} and Kubo, Michiaki and Goss, {Paul E.} and Costantino, {Joseph P.} and Wickerham, {D. Lawrence} and Weinshilboum, {Richard M.}",
year = "2012",
month = "1",
day = "1",
doi = "10.1002/gepi.20632",
language = "English",
volume = "36",
pages = "3--16",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "1",

}

Schaid, DJ, Sinnwell, JP, Jenkins, GD, McDonnell, SK, Ingle, JN, Kubo, M, Goss, PE, Costantino, JP, Wickerham, DL & Weinshilboum, RM 2012, 'Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies', Genetic Epidemiology, vol. 36, no. 1, pp. 3-16. https://doi.org/10.1002/gepi.20632

Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies. / Schaid, Daniel J.; Sinnwell, Jason P.; Jenkins, Gregory D.; McDonnell, Shannon K.; Ingle, James N.; Kubo, Michiaki; Goss, Paul E.; Costantino, Joseph P.; Wickerham, D. Lawrence; Weinshilboum, Richard M.

In: Genetic Epidemiology, Vol. 36, No. 1, 01.01.2012, p. 3-16.

TY - JOUR

T1 - Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies

AU - Schaid, Daniel J.

AU - Sinnwell, Jason P.

AU - Jenkins, Gregory D.

AU - McDonnell, Shannon K.

AU - Ingle, James N.

AU - Kubo, Michiaki

AU - Goss, Paul E.

AU - Costantino, Joseph P.

AU - Wickerham, D. Lawrence

AU - Weinshilboum, Richard M.

PY - 2012/1/1

Y1 - 2012/1/1

UR - http://www.scopus.com/inward/record.url?scp=84859100107&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84859100107&partnerID=8YFLogxK

U2 - 10.1002/gepi.20632

DO - 10.1002/gepi.20632

M3 - Article

VL - 36

SP - 3

EP - 16

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 1

ER -