Comprehensive functional annotation of susceptibility variants identifies genetic heterogeneity between lung adenocarcinoma and squamous cell carcinoma

Research output: Contribution to journal › Journal article › Research › peer-review

  • Na Qin
  • Yuancheng Li
  • Cheng Wang
  • Meng Zhu
  • Juncheng Dai
  • Tongtong Hong
  • Demetrius Albanes
  • Stephen Lam
  • Adonina Tardon
  • Chu Chen
  • Gary Goodman
  • Maria Teresa Landi
  • Mattias Johansson
  • Angela Risch
  • H. Erich Wichmann
  • Heike Bickeboller
  • Gadi Rennert
  • Susanne Arnold
  • Paul Brennan
  • John K. Field
  • Sanjay Shete
  • Loic Le Marchand
  • Olle Melander
  • Hans Brunnstrom
  • Geoffrey Liu
  • Rayjean J. Hung
  • Angeline Andrew
  • Lambertus A. Kiemeney
  • Shan Zienolddiny
  • Kjell Grankvist
  • Mikael Johansson
  • Neil Caporaso
  • Penella Woll
  • Philip Lazarus
  • Matthew B. Schabath
  • Melinda C. Aldrich
  • Victoria L. Stevens
  • Guangfu Jin
  • David C. Christiani
  • Zhibin Hu
  • Christopher I. Amos
  • Hongxia Ma
  • Hongbing Shen

Although genome-wide association studies have identified more than eighty genetic variants associated with non-small cell lung cancer (NSCLC) risk, the biological mechanisms of these variants remain largely unknown. By integrating large-scale genotype data from 15 581 lung adenocarcinoma (AD) cases, 8350 squamous cell carcinoma (SqCC) cases, and 27 355 controls with multiple transcriptome and epigenomic databases, we conducted histology-specific meta-analyses and functional annotations of both reported and novel susceptibility variants. We identified 3064 credible risk variants for NSCLC, which were overrepresented in enhancer-like and promoter-like histone modification peaks as well as in DNase I hypersensitive sites. Transcription factor enrichment analysis revealed that USF1 was AD-specific while CREB1 was SqCC-specific. Functional annotation and gene-based analysis implicated 894 target genes, including 274 specific to AD and 123 specific to SqCC, which were overrepresented in somatic driver genes (ER = 1.95, P = 0.005). Pathway enrichment analysis and Gene-Set Enrichment Analysis revealed that AD genes were primarily involved in immune-related pathways, while SqCC genes were related to homologous recombination deficiency. Our results illustrate the molecular basis of both well-studied and new susceptibility loci of NSCLC, providing not only novel insights into the genetic heterogeneity between AD and SqCC but also a set of plausible gene targets for post-GWAS functional experiments.
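
The overrepresentation of target genes among somatic driver genes can be illustrated with a minimal sketch. It assumes that "ER" denotes an enrichment ratio (observed overlap divided by the overlap expected by chance) and uses a one-sided hypergeometric test, a common choice for gene-set overlap. The abstract does not state which test was actually used, and the function names and toy numbers below are hypothetical, not taken from the study.

```python
from math import comb


def enrichment_ratio(overlap, n_targets, n_drivers, n_background):
    """Observed overlap divided by the overlap expected by chance.

    Assumption: this is one plausible reading of "ER"; the abstract
    does not define it.
    """
    expected = n_targets * n_drivers / n_background
    return overlap / expected


def hypergeom_upper_tail(overlap, n_targets, n_drivers, n_background):
    """P(X >= overlap) when n_targets genes are drawn without replacement
    from n_background genes, of which n_drivers are driver genes."""
    total = comb(n_background, n_targets)
    max_k = min(n_targets, n_drivers)
    tail = sum(
        comb(n_drivers, k) * comb(n_background - n_drivers, n_targets - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Hypothetical toy numbers, NOT the study's data:
# 100 background genes, 10 drivers, 20 target genes, 4 of them drivers.
er = enrichment_ratio(4, 20, 10, 100)
p = hypergeom_upper_tail(4, 20, 10, 100)
print(f"ER = {er:.2f}, one-sided P = {p:.3g}")
```

Exact integer binomial coefficients keep the sketch dependency-free; for genome-scale gene counts a statistics library would be the more practical choice.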

Original language: English
Journal: Frontiers of Medicine
Volume: 15
Issue number: 2
Pages (from-to): 275-291
ISSN: 2095-0217
Publication status: Published - 2021

Bibliographical note

Funding Information:
This study was supported by the Key International (Regional) Cooperative Research Project (No. 81820108028), the National Natural Science Foundation of China (Nos. 81521004, 81922061, 81973123, and 81803306), the Science Foundation for Distinguished Young Scholars of Jiangsu (No. BK20160046), and the Priority Academic Program for the Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine). CARET is funded by the National Cancer Institute, National Institutes of Health of USA through grants U01-CA063673, UM1-CA167462, and U01-CA167462.

Publisher Copyright:
© 2020, Higher Education Press.

    Research areas

  • function annotation, genetic heterogeneity, genome-wide association study, homologous recombination repair deficiency, immune, lung cancer

ID: 301346773