USF1 (upstream stimulatory factor 1) is a transcription factor associated with familial combined hyperlipidemia and coronary artery disease in humans. However, whether USF1 is beneficial or detrimental to cardiometabolic health has not been addressed. By inactivating USF1 in mice, we demonstrate protection against diet-induced dyslipidemia, obesity, insulin resistance, hepatic steatosis, and atherosclerosis. The favorable plasma lipid profile, including increased high-density lipoprotein cholesterol and decreased triglycerides, was coupled with increased energy expenditure due to activation of brown adipose tissue (BAT). Usf1 inactivation directs triglycerides from the circulation to BAT for combustion via a lipoprotein lipase-dependent mechanism, thus enhancing plasma triglyceride clearance. Mice lacking Usf1 displayed increased BAT-facilitated, diet-induced thermogenesis with up-regulation of mitochondrial respiratory chain complexes, as well as increased BAT activity even at thermoneutrality and after BAT sympathectomy. A direct effect of USF1 on BAT activation was demonstrated by an amplified adrenergic response in brown adipocytes after Usf1 silencing, and by augmented norepinephrine-induced thermogenesis in mice lacking Usf1. In humans, individuals carrying SNP (single-nucleotide polymorphism) alleles that reduced USF1 mRNA expression also displayed a beneficial cardiometabolic profile, featuring improved insulin sensitivity, a favorable lipid profile, and reduced atherosclerosis. Our findings identify a new molecular link between lipid metabolism and energy expenditure, and point to the potential of USF1 as a therapeutic target for cardiometabolic disease.
Migraine is a common episodic neurological disorder, typically presenting with recurrent attacks of severe headache and autonomic dysfunction. Apart from rare monogenic subtypes, no genetic or molecular markers for migraine have been convincingly established. We identified the minor allele of rs1835740 on chromosome 8q22.1 to be associated with migraine (P = 5.38 × 10⁻⁹, odds ratio = 1.23, 95% CI 1.150-1.324) in a genome-wide association study of 2,731 migraine cases ascertained from three European headache clinics and 10,747 population-matched controls. The association was replicated in 3,202 cases and 40,062 controls for an overall meta-analysis P value of 1.69 × 10⁻¹¹ (odds ratio = 1.18, 95% CI 1.127-1.244). rs1835740 is located between MTDH (astrocyte elevated gene 1, also known as AEG-1) and PGCP (encoding plasma glutamate carboxypeptidase). In an expression quantitative trait study in lymphoblastoid cell lines, transcript levels of the MTDH were found to have a significant correlation to rs1835740 (P = 3.96 × 10⁻⁵, permuted threshold for genome-wide significance 7.7 × 10⁻⁵). To our knowledge, our data establish rs1835740 as the first genetic risk factor for migraine.
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits(1), but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait(2,3). The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes.
Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation). Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.
Inouye, M.; Silander, K.; Hamalainen, E.; Salomaa, V.; Harald, K.; Jousilahti, P.; ... ; Peltonen, L. 2010
While recent scans for genetic variation associated with human disease have been immensely successful in uncovering large numbers of loci, far fewer studies have focused on the underlying pathways of disease pathogenesis. Many loci which are associated with disease and complex phenotypes map to non-coding, regulatory regions of the genome, indicating that modulation of gene transcription plays a key role. Thus, this study generated genome-wide profiles of both genetic and transcriptional variation from the total blood extracts of over 500 randomly-selected, unrelated individuals. Using measurements of blood lipids, key players in the progression of atherosclerosis, three levels of biological information are integrated in order to investigate the interactions between circulating leukocytes and proximal lipid compounds. Pair-wise correlations between gene expression and lipid concentration indicate a prominent role for basophil granulocytes and mast cells, cell types central to powerful allergic and inflammatory responses. Network analysis of gene co-expression showed that the top associations function as part of a single, previously unknown gene module, the Lipid Leukocyte (LL) module. This module replicated in T cells from an independent cohort while also displaying potential tissue specificity. Further, genetic variation driving LL module expression included the single nucleotide polymorphism (SNP) most strongly associated with serum immunoglobulin E (IgE) levels, a key antibody in allergy. Structural Equation Modeling (SEM) indicated that LL module is at least partially reactive to blood lipid levels.
Taken together, this study uncovers a gene network linking blood lipids and circulating cell types and offers insight into the hypothesis that the inflammatory response plays a prominent role in metabolism and the potential control of atherogenesis.
To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24-28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4 × 10⁻⁴). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1, identified as a novel obesity gene and also replicated in a second GWA set of ~2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity.
Smoking is a common risk factor for many diseases(1). We conducted genome-wide association meta-analyses for the number of cigarettes smoked per day (CPD) in smokers (n = 31,266) and smoking initiation (n = 46,481) using samples from the ENGAGE Consortium. In a second stage, we tested selected SNPs with in silico replication in the Tobacco and Genetics (TAG) and Glaxo Smith Kline (Ox-GSK) consortia cohorts (n = 45,691 smokers) and assessed some of those in a third sample of European ancestry (n = 9,040). Variants in three genomic regions were associated with CPD (P < 5 × 10⁻⁸), including previously identified SNPs at 15q25 represented by rs1051730[A] (effect size = 0.80 CPD, P = 2.4 × 10⁻⁶⁹), and SNPs at 19q13 and 8p11, represented by rs4105144[C] (effect size = 0.39 CPD, P = 2.2 × 10⁻¹²) and rs6474412[T] (effect size = 0.29 CPD, P = 1.4 × 10⁻⁸), respectively. Among the genes at the two newly associated loci are genes encoding nicotine-metabolizing enzymes (CYP2A6 and CYP2B6) and nicotinic acetylcholine receptor subunits (CHRNB3 and CHRNA6), all of which have been highlighted in previous studies of smoking and nicotine dependence(2-4). Nominal associations with lung cancer were observed at both 8p11 (rs6474412[T], odds ratio (OR) = 1.09, P = 0.04) and 19q13 (rs4105144[C], OR = 1.12, P = 0.0006).
Chambers, J.C.; Zhang, W.H.; Lord, G.M.; Harst, P. van der; Lawlor, D.A.; Sehmi, J.S.; ... ; Kooner, J.S. 2010
Using genome-wide association, we identify common variants at 2p12-p13, 6q26, 17q23 and 19q13 associated with serum creatinine, a marker of kidney function (P = 10⁻¹⁰ to 10⁻¹⁵). Of these, rs10206899 (near NAT8, 2p12-p13) and rs4805834 (near SLC7A9, 19q13) were also associated with chronic kidney disease (P = 5.0 × 10⁻⁵ and P = 3.6 × 10⁻⁴, respectively). Our findings provide insight into metabolic, solute and drug-transport pathways underlying susceptibility to chronic kidney disease.
We performed a second-generation genome-wide association study of 4,533 individuals with celiac disease (cases) and 10,750 control subjects. We genotyped 113 selected SNPs with P-GWAS < 10⁻⁴ and 18 SNPs from 14 known loci in a further 4,918 cases and 5,684 controls. Variants from 13 new regions reached genome-wide significance (P-combined < 5 × 10⁻⁸); most contain genes with immune functions (BACH2, CCR4, CD80, CIITA-SOCS1-CLEC16A, ICOSLG and ZMIZ1), with ETS1, RUNX3, THEMIS and TNFRSF14 having key roles in thymic T-cell selection. There was evidence to suggest associations for a further 13 regions. In an expression quantitative trait meta-analysis of 1,469 whole blood samples, 20 of 38 (52.6%) tested loci had celiac risk variants correlated (P < 0.0028, FDR 5%) with cis gene expression.