Background: DNA methylation is a key epigenetic modification in human development and disease, yet there is limited understanding of its highly coordinated regulation. Here, we identify 818 genes that affect DNA methylation patterns in blood using large-scale population genomics data. Results: By employing genetic instruments as causal anchors, we establish directed associations between gene expression and distant DNA methylation levels, while ensuring specificity of the associations by correcting for linkage disequilibrium and pleiotropy among neighboring genes. The identified genes are enriched for transcription factors, many of which consistently increased or decreased DNA methylation levels at multiple CpG sites. In addition, we show that a substantial number of transcription factors affected DNA methylation at their experimentally determined binding sites. We also observe genes encoding proteins with heterogeneous functions that have widespread effects on DNA methylation, e.g., NFKBIE, CDCA7(L), and NLRC5, and for several examples, we suggest plausible mechanisms underlying their effect on DNA methylation. Conclusion: We report hundreds of genes that affect DNA methylation and provide key insights into the principles underlying epigenetic regulation.
Insights into individual differences in gene expression and its heritability (h²) can help in understanding pathways from DNA to phenotype. We estimated the heritability of gene expression of 52,844 genes measured in whole blood in the largest twin RNA-Seq sample to date (1497 individuals including 459 monozygotic twin pairs and 150 dizygotic twin pairs) from classical twin modeling and identity-by-state-based approaches. For each gene we estimated h²_total, composed of cis-heritability (h²_cis, the variance explained by single nucleotide polymorphisms in the cis-window of the gene) and trans-heritability (h²_res, the residual variance explained by all other genome-wide variants). Mean h²_total was 0.26, which was significantly higher than heritability estimates found earlier in a microarray-based study using largely overlapping (>60%) RNA samples (mean h² = 0.14, p = 6.15 × 10⁻²⁵⁸). Mean h²_cis was 0.06 and strongly correlated with the beta of the top cis expression quantitative trait loci (eQTL, rho = 0.76, p < 10⁻³⁰⁸) and with estimates from earlier RNA-Seq-based studies. Mean h²_res was 0.20 and correlated with the beta of the corresponding trans-eQTL (rho = 0.04, p < 1.89 × 10⁻³) and was significantly higher for genes involved in cytokine-cytokine interactions (p = 4.22 × 10⁻¹⁵), many other immune system pathways, and genes identified in genome-wide association studies (GWAS) for various traits including behavioral disorders and cancer. This study provides a thorough characterization of cis- and trans-h² estimates of gene expression, which is of value for the interpretation of GWAS and gene expression studies.