Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes [1]. Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel [2]) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries.
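As a rough sanity check on the coverage figure in the height abstract above, the segment count and mean segment size can be multiplied out and compared with the genome length. The sketch below is illustrative only: the genome length of about 3.1 Gb is an assumption that does not come from the abstract, and the function name `covered_fraction` is hypothetical.

```python
# Figures quoted in the abstract.
N_SEGMENTS = 7_209        # non-overlapping genomic segments
MEAN_SEGMENT_KB = 90      # mean segment size, "around 90 kb"

# Assumption (not stated in the abstract): approximate human genome length.
GENOME_LENGTH_GB = 3.1


def covered_fraction(n_segments: int, mean_kb: float, genome_gb: float) -> float:
    """Return the fraction of the genome spanned by the segments."""
    covered_bp = n_segments * mean_kb * 1_000
    return covered_bp / (genome_gb * 1e9)


covered_mb = N_SEGMENTS * MEAN_SEGMENT_KB / 1_000
fraction = covered_fraction(N_SEGMENTS, MEAN_SEGMENT_KB, GENOME_LENGTH_GB)
print(f"{covered_mb:.0f} Mb covered, {fraction:.1%} of the genome")
```

Under that assumed genome length, the result lands close to the "about 21%" quoted in the abstract; a slightly different genome length shifts it by a fraction of a percentage point.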
Venous thromboembolism (VTE) is a significant contributor to morbidity and mortality. To advance our understanding of the biology contributing to VTE, we conducted a genome-wide association study (GWAS) of VTE and a transcriptome-wide association study (TWAS) based on imputed gene expression from whole blood and liver. We meta-analyzed GWAS data from 18 studies for 30 234 VTE cases and 172 122 controls and assessed the association between 12 923 718 genetic variants and VTE. We generated variant prediction scores of gene expression from whole blood and liver tissue and assessed them for association with VTE. Mendelian randomization analyses were conducted for traits genetically associated with novel VTE loci. We identified 34 independent genetic signals for VTE risk from GWAS meta-analysis, of which 14 are newly reported associations. This included 11 newly associated genetic loci (C1orf198, PLEK, OSMR-AS1, NUGGC/SCARA5, GRK5, MPHOSPH9, ARID4A, PLCG2, SMG6, EIF5A, and STX10) of which 6 replicated, and 3 new independent signals in 3 known genes. Further, TWAS identified 5 additional genetic loci with imputed gene expression levels differing between cases and controls in whole blood (SH2B3, SPSB1, RP11-747H7.3, RP4-737E23.2) and in liver (ERAP1). At some GWAS loci, we found suggestive evidence that the VTE association signal for novel and previously known regions colocalized with expression quantitative trait locus signals. Mendelian randomization analyses suggested that blood traits may contribute to the underlying risk of VTE. To conclude, we identified 16 novel susceptibility loci for VTE; for some loci, the association signals are likely mediated through gene expression of nearby genes.
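The counts in the VTE abstract above can be tallied to see how the headline numbers fit together. The sketch below shows one reading that is consistent with the stated figures (14 new signals as 11 new loci plus 3 new signals in known genes, and 16 novel loci as 11 GWAS loci plus 5 TWAS loci); the abstract does not spell out this breakdown, so it is an assumption, and all variable names are hypothetical.

```python
# Counts quoted in the abstract.
cases = 30_234
controls = 172_122
gwas_signals_total = 34
gwas_new_loci = 11
gwas_new_signals_in_known_genes = 3
twas_additional_loci = 5

# Assumed breakdown (not stated explicitly in the abstract).
new_gwas_signals = gwas_new_loci + gwas_new_signals_in_known_genes
previously_reported_signals = gwas_signals_total - new_gwas_signals
novel_loci = gwas_new_loci + twas_additional_loci

# Share of the meta-analysis sample made up of VTE cases.
case_fraction = cases / (cases + controls)

print(f"new GWAS signals: {new_gwas_signals}")
print(f"previously reported GWAS signals: {previously_reported_signals}")
print(f"novel loci (GWAS + TWAS): {novel_loci}")
print(f"case fraction: {case_fraction:.1%}")
```

Under this reading, the tallies reproduce the 14 newly reported associations and the 16 novel susceptibility loci given in the abstract.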