Redefining replication in multi-ancestry genome-wide association studies

  • Smith SP,
  • Shahamatdar S,
  • Cheng W,
  • Zhang S,
  • Paik J,
  • Graff M,
  • Haiman C,
  • Matise T,
  • North KE,
  • Peters U,
  • Kenny E,
  • Gignoux C,
  • Wojcik G,
  • ,
  • Ramachandran S

bioRxiv

Abstract

Since 2005, genome-wide association (GWA) datasets have been largely biased toward sampling European-ancestry individuals, and recent studies have shown that GWA results estimated from European-ancestry individuals apply heterogeneously in non-European-ancestry individuals. Here, we argue that enrichment analyses, which aggregate SNP-level association statistics at multiple genomic scales (to genes and pathways), have been overlooked and can generate biologically interpretable hypotheses regarding the genetic basis of complex trait architecture. We illustrate the insights generated by enrichment analyses while studying 25 continuous traits assayed in 566,786 individuals from seven self-identified human ancestries in the UK Biobank and the Biobank Japan, as well as 44,348 admixed individuals from the PAGE consortium, including cohorts of African-American, Hispanic and Latin American, Native Hawaiian, and American Indian/Alaska Native individuals. By testing for statistical associations at multiple genomic scales, enrichment analyses also illustrate the importance of reconciling contrasting results from association tests, heritability estimation, and prediction models in order to make personalized medicine a reality for all.