Objectives: We present an illustrative application of methods that account for covariates in receiver operating characteristic (ROC) curve analysis, using individual patient data on D-dimer testing for excluding pulmonary embolism. Study Design and Setting: Bayesian nonparametric covariate-specific ROC curves were constructed to examine the performance/positivity thresholds in covariate subgroups. Standard ROC curves were constructed. Three scenarios were outlined based on comparison between subgroups and the standard ROC curve conclusion: (1) identical distribution/identical performance, (2) different distribution/identical performance, and (3) different distribution/different performance. Scenarios were illustrated using clinical covariates. Covariate-adjusted ROC curves were also constructed. Results: Age groups had prominent differences in D-dimer concentration, paired with differences in performance (Scenario 3). Different positivity thresholds were required to achieve the same level of sensitivity. D-dimer had identical performance, but different distributions for YEARS algorithm items (Scenario 2), and similar distributions for sex (Scenario 1). For the latter covariates, comparable positivity thresholds achieved the same sensitivity. All covariate-adjusted models had AUCs comparable to the standard approach. Conclusion: Subgroup differences in performance and distribution of results can indicate that the conventional ROC curve is not a fair representation of test performance. Estimating conditional ROC curves can improve the ability to select thresholds with greater applicability.
Background: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. Results: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N=1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3-5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. Conclusions: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk.
A major challenge of genome-wide association studies (GWASs) is to translate phenotypic associations into biological insights. Here, we integrate a large GWAS on blood lipids involving 1.6 million individuals from five ancestries with a wide array of functional genomic datasets to discover regulatory mechanisms underlying lipid associations. We first prioritize lipid-associated genes with expression quantitative trait locus (eQTL) colocalizations and then add chromatin interaction data to narrow the search for functional genes. Polygenic enrichment analysis across 697 annotations from a host of tissues and cell types confirms the central role of the liver in lipid levels and highlights the selective enrichment of adipose-specific chromatin marks in high-density lipoprotein cholesterol and triglycerides. Overlapping transcription factor (TF) binding sites with lipid-associated loci identifies TFs relevant in lipid biology. In addition, we present an integrative framework to prioritize causal variants at GWAS loci, producing a comprehensive list of candidate causal genes and variants with multiple layers of functional evidence. We highlight two of the prioritized genes, CREBRF and RRBP1, which show convergent evidence across functional datasets supporting their roles in lipid biology.
Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes(1). Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel(2)) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries.
Increased blood lipid levels are heritable risk factors of cardiovascular disease with varied prevalence worldwide owing to different dietary patterns and medication use(1). Despite advances in prevention and treatment, in particular through reducing low-density lipoprotein cholesterol levels(2), heart disease remains the leading cause of death worldwide(3). Genome-wide association studies (GWAS) of blood lipid levels have led to important biological and clinical insights, as well as new drug targets, for cardiovascular disease. However, most previous GWAS(4-23) have been conducted in European ancestry populations and may have missed genetic variants that contribute to lipid-level variation in other ancestry groups. These include differences in allele frequencies, effect sizes and linkage-disequilibrium patterns(24). Here we conduct a multi-ancestry, genome-wide genetic discovery meta-analysis of lipid levels in approximately 1.65 million individuals, including 350,000 of non-European ancestries. We quantify the gain in studying non-European ancestries and provide evidence to support the expansion of recruitment of additional ancestries, even with relatively small sample sizes. We find that increasing diversity rather than studying additional individuals of European ancestry results in substantial improvements in fine-mapping functional variants and portability of polygenic prediction (evaluated in approximately 295,000 individuals from 7 ancestry groupings). Modest gains in the number of discovered loci and ancestry-specific variants were also achieved.
As GWAS expand emphasis beyond the identification of genes and fundamental biology towards the use of genetic variants for preventive and precision medicine(25), we anticipate that increased diversity of participants will lead to more accurate and equitable(26) application of polygenic scores in clinical practice.
A broad-based interlaboratory study of glycosylation profiles of a reference and modified IgG antibody involving 103 reports from 76 laboratories. Glycosylation is a topic of intense current interest in the development of biopharmaceuticals because it is related to drug safety and efficacy. This work describes results of an interlaboratory study on the glycosylation of the Primary Sample (PS) of NISTmAb, a monoclonal antibody reference material. Seventy-six laboratories from industry, university, research, government, and hospital sectors in Europe, North America, Asia, and Australia submitted a total of 103 reports on glycan distributions. The principal objective of this study was to report and compare results for the full range of analytical methods presently used in the glycosylation analysis of mAbs. Therefore, participation was unrestricted, with laboratories choosing their own measurement techniques. Protein glycosylation was determined in various ways, including at the level of intact mAb, protein fragments, glycopeptides, or released glycans, using a wide variety of methods for derivatization, separation, identification, and quantification. Consequently, the diversity of results was enormous, with the number of glycan compositions identified by each laboratory ranging from 4 to 48. In total, 116 glycan compositions were reported, of which 57 compositions could be assigned consensus abundance values. These consensus medians provide community-derived values for NISTmAb PS. Agreement with the consensus medians did not depend on the specific method or laboratory type. The study provides a view of the current state-of-the-art for biologic glycosylation measurement and suggests a clear need for harmonization of glycosylation analysis methods.
Objective. There is a need to develop and validate biomarkers for treatment response and survival in tubo-ovarian high-grade serous carcinoma (HGSC). The chemotherapy response score (CRS) stratifies patients into complete/near-complete (CRS3), partial (CRS2), and no/minimal (CRS1) response after neoadjuvant chemotherapy (NACT). Our aim was to review current evidence to determine whether the CRS is prognostic in women with tubo-ovarian HGSC treated with NACT. Methods. We established an international collaboration to conduct a systematic review and meta-analysis, pooling individual patient data from 16 sites in 11 countries. Patients had stage IIIC/IV HGSC, 3-4 NACT cycles and >6 months of follow-up. Random effects models were used to derive combined odds ratios in the pooled population to investigate associations between CRS and progression-free and overall survival (PFS and OS). Results. 877 patients were included from published and unpublished studies. Median PFS and OS were 15 months (IQR 5-65) and 28 months (IQR 7-92) respectively. CRS3 was seen in 249 patients (28%). The pooled hazard ratios (HR) for PFS and OS for CRS3 versus CRS1/CRS2 were 0.55 (95% CI, 0.45-0.66; P < 0.001) and 0.65 (95% CI, 0.50-0.85; P = 0.002) respectively; no heterogeneity was identified (PFS: Q = 6.42, P = 0.698, I² = 0.0%; OS: Q = 6.89, P = 0.648, I² = 0.0%). CRS was significantly associated with PFS and OS in multivariate models adjusting for age and stage. Of 306 patients with known germline BRCA1/2 status, those with BRCA1/2 mutations (n = 80) were more likely to achieve CRS3 (P = 0.027). Conclusions. CRS3 was significantly associated with improved PFS and OS compared to CRS1/2.
This validation of CRS in a real-world setting demonstrates it to be a robust and reproducible biomarker with potential to be incorporated into therapeutic decision-making and clinical trial design. © 2019 The Authors. Published by Elsevier Inc.
Academic papers play a leading role in disseminating expertise. Naturally, analysing paper citation patterns is an efficient and essential means of investigating the knowledge structure of science and technology. For decades, it has been observed that citations of scientific literature follow a heterogeneous and fat-tailed distribution, and many studies suggest a power-law distribution or one of its siblings. However, many studies are limited to small-scale approaches and are thus hard to generalize. To tackle this issue, we investigate 21 years of citation evolution through a systematic analysis of the entire citation history of 42,423,644 scientific publications published from 1996 to 2016 and indexed in SCOPUS. We tested six candidate distributions for the papers at three distinct levels of the SJR (Scimago Journal & Country Rank) classification scheme. First, we observe that the raw number of annual citation acquisitions tends to follow the log-normal distribution for all disciplines, except in the first year after publication. We also observe a large disparity in the yearly acquired citation number among journals, which suggests that it is essential to remove the citation surplus inherited from the prestige of the journals. Our simple method, which separates the citation preference of an individual article from the citations inherited from its journal, reveals an unexpected regularity in the normalized annual citation acquisitions across all fields of science. Specifically, the probability distributions of annual citation acquisitions follow a power law with an exponential cut-off, with exponents around 2.3, regardless of publication and citation year.
Our result implies that the intensity of attention to a scientific article also follows a fat-tailed distribution, namely a power law with an exponential cut-off.
Withdrawing publications is itself an inherent part of scholarly communication. The problem is how retracted publications are managed in bibliographic databases and how researchers continue to cite them. The increase in retractions bears on the reliability and reproducibility of scientific research results. In this study, we review the main features of retracted publications in Korea and examine how they are managed in two bibliographic databases frequently used in Korea (Web of Science and Korea Citation Index). Finally, we compare the times cited before and after retraction to understand how researchers cite retracted publications.
Scientometric indicators are now widely applied in research evaluation. In keeping with this trend, the Korea Institute of Science and Technology Information (KISTI) developed the ‘Insightful Integrated Indicators-Metrics (i*Metrics)’ to calculate scientometric indicators for the analysis of journal papers. In this study, we investigate research trends and characteristics by country in the field of gender studies, using journal papers published from 1999 to 2016 and selected indicators from the KISTI i*Metrics system. The analysis shows that global gender studies research was led by the United States and European countries such as the United Kingdom, Sweden, and Germany. In addition, the interdisciplinary character of gender studies was confirmed through indicator combination analysis.