We conducted genome-wide association studies (GWAS) of relative intake from the macronutrients fat, protein, carbohydrates, and sugar in over 235,000 individuals of European ancestries. We identified 21 unique, approximately independent lead SNPs. Fourteen lead SNPs are uniquely associated with one macronutrient at genome-wide significance (P < 5 × 10⁻⁸), while five of the 21 lead SNPs reach suggestive significance (P < 1 × 10⁻⁵) for at least one other macronutrient. While the phenotypes are genetically correlated, each phenotype carries a partially unique genetic architecture. Relative protein intake exhibits the strongest relationships with poor health, including positive genetic associations with obesity, type 2 diabetes, and heart disease (r_g ≈ 0.15-0.5). In contrast, relative carbohydrate and sugar intake have negative genetic correlations with waist circumference, waist-hip ratio, and neighborhood deprivation (|r_g| ≈ 0.1-0.3) and positive genetic correlations with physical activity (r_g ≈ 0.1 and 0.2). Relative fat intake has no consistent pattern of genetic correlations with poor health but has a negative genetic correlation with educational attainment (r_g ≈ -0.1). Although our analyses do not allow us to draw causal conclusions, we find no evidence of negative health consequences associated with relative carbohydrate, sugar, or fat intake. However, our results are consistent with the hypothesis that relative protein intake plays a role in the etiology of metabolic dysfunction.
Motivation: The BioTIME database contains raw data on species identities and abundances in ecological assemblages through time. These data enable users to calculate temporal trends in biodiversity within and amongst assemblages using a broad range of metrics. BioTIME is being developed as a community-led open-source database of biodiversity time series. Our goal is to accelerate and facilitate quantitative analysis of temporal patterns of biodiversity in the Anthropocene. Main types of variables included: The database contains 8,777,413 species abundance records, from assemblages consistently sampled for a minimum of 2 years, which need not necessarily be consecutive. In addition, the database contains metadata relating to sampling methodology and contextual information about each record. Spatial location and grain: BioTIME is a global database of 547,161 unique sampling locations spanning the marine, freshwater and terrestrial realms. Grain size varies across datasets from 0.0000000158 km² (158 cm²) to 100 km² (1,000,000,000,000 cm²). Time period and grain: BioTIME records span from 1874 to 2016. The minimal temporal grain across all datasets in BioTIME is a year. Major taxa and level of measurement: BioTIME includes data from 44,440 species across the plant and animal kingdoms, ranging from plants, plankton and terrestrial invertebrates to small and large vertebrates.
Cichutek, K.; Epstein, J.; Griffiths, E.; Hindawi, S.; Jivapaisarnpong, T.; Klein, H.; ... ; WHO Expert Comm Biological 2017
Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
L-2-Hydroxyglutaric aciduria (L2HGA) is a rare, neurometabolic disorder with an autosomal recessive mode of inheritance. Affected individuals only have neurological manifestations, including psychomotor retardation, cerebellar ataxia, and more variably macrocephaly, or epilepsy. The diagnosis of L2HGA can be made based on magnetic resonance imaging (MRI), biochemical analysis, and mutational analysis of L2HGDH. About 200 patients with elevated concentrations of 2-hydroxyglutarate (2HG) in the urine were referred for chiral determination of 2HG and L2HGDH mutational analysis. All patients with increased L2HG (n = 106; 83 families) were included. Clinical information on 61 patients was obtained via questionnaires. In 82 families the mutations were detected by direct sequence analysis and/or multiplex ligation-dependent probe amplification (MLPA), including one case where MLPA was essential to detect the second allele. In another case RT-PCR followed by deep intronic sequencing was needed to detect the mutation. Thirty-five novel mutations as well as 35 reported mutations and 14 nondisease-related variants are reviewed and included in a novel Leiden Open source Variation Database (LOVD) for L2HGDH variants (http://www.LOVD.nl/L2HGDH). Every user can access the database and submit variants/patients. Furthermore, we report on the phenotype, including neurological manifestations and urinary levels of L2HG, and we evaluate the phenotype-genotype relationship. Hum Mutat 31:380-390, 2010. © 2010 Wiley-Liss, Inc.