Measurement error in a covariate or the outcome of regression models is common, but is often ignored, even though measurement error can lead to substantial bias in the estimated covariate-outcome association. While several texts on measurement error correction methods are available, these methods remain seldom applied. To improve the use of measurement error correction methodology, we developed mecor, an R package that implements measurement error correction methods for regression models with a continuous outcome. Measurement error correction requires information about the measurement error model and its parameters. This information can be obtained from four types of studies used to estimate the parameters of the measurement error model: an internal validation study, a replicates study, a calibration study and an external validation study. In the package mecor, regression calibration methods and a maximum likelihood method are implemented to correct for measurement error in a continuous covariate in regression analyses. Additionally, method-of-moments approaches are implemented to correct for measurement error in the continuous outcome in regression analyses. Variance estimation of the corrected estimators is provided in closed form and using the bootstrap. (c) 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
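As a rough illustration of the kind of correction the package automates, the sketch below performs regression calibration by hand in R on simulated data with an internal validation subset. The variable names (y, x_star, x, z) and the simulation are hypothetical and do not represent the mecor interface; note also that the naive standard errors of the corrected fit are not valid, which is why closed-form or bootstrap variance estimation is needed.

    # Hypothetical data: y (outcome), x_star (error-prone covariate),
    # x (reference measurement, observed only in a validation subset), z (covariate)
    set.seed(1)
    n      <- 1000
    z      <- rnorm(n)
    x      <- rnorm(n) + 0.5 * z              # true covariate
    y      <- 2 + 1 * x + 0.5 * z + rnorm(n)  # outcome model, true slope = 1
    x_star <- x + rnorm(n, sd = 0.8)          # classical measurement error
    x[-(1:200)] <- NA                         # reference available in validation subset only
    d <- data.frame(y, x, x_star, z)

    # Naive analysis ignoring measurement error (attenuated towards zero)
    naive <- lm(y ~ x_star + z, data = d)

    # Regression calibration: model E[X | X*, Z] in the validation subset,
    # predict the expected true covariate for everyone, then refit the outcome model
    cal       <- lm(x ~ x_star + z, data = d, subset = !is.na(x))
    d$x_hat   <- predict(cal, newdata = d)
    corrected <- lm(y ~ x_hat + z, data = d)

    coef(naive)["x_star"]     # biased estimate
    coef(corrected)["x_hat"]  # closer to the true coefficient of 1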
While mass spectrometry still dominates proteomics research, alternative and potentially disruptive, next-generation technologies are receiving increased investment and attention. Most of these technologies aim at the sequencing of single peptide or protein molecules, typically labeling or otherwise distinguishing a subset of the proteinogenic amino acids. This note considers some theoretical aspects of these future technologies from a bottom-up proteomics viewpoint, including the ability to uniquely identify human proteins as a function of which and how many amino acids can be read, enzymatic efficiency, and the maximum read length. This is done through simulations under ideal and non-ideal conditions to set benchmarks for what may be achievable with future single-molecule sequencing technology. The simulations reveal, among other observations, that the best choice of reading N amino acids performs similarly to the average choice of N+1 amino acids, and that the discrimination power of the amino acids scales with their frequency in the proteome. The simulations are agnostic with respect to the next-generation proteomics platform, and the results and conclusions should therefore be applicable to any single-molecule partial peptide sequencing technology.
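A toy version of such a simulation can be written in a few lines of R. The peptide list and the choice of readable amino acids below are made up for illustration, but the logic (mask the unreadable residues and check whether the remaining pattern is unique) is the core of the benchmark described.

    # Toy partial-sequencing simulation: reduce each peptide to a pattern that
    # keeps only the readable amino acids and masks the rest; a peptide is
    # identifiable only if its pattern maps back to a single sequence.
    peptides <- c("SAMPLEK", "SIMPLEK", "ELGQSGVDTYLQTK",
                  "LVNELTEFAK", "AEFVEVTK", "DLGEEHFK")  # hypothetical tryptic peptides
    readable <- c("K", "R", "C")                         # e.g. labelled amino acids

    mask_peptide <- function(pep, readable) {
      aa <- strsplit(pep, "")[[1]]
      paste(ifelse(aa %in% readable, aa, "x"), collapse = "")
    }

    patterns  <- vapply(peptides, mask_peptide, character(1), readable = readable)
    unique_id <- table(patterns)[patterns] == 1          # TRUE if the pattern is unambiguous
    # SAMPLEK/SIMPLEK and AEFVEVTK/DLGEEHFK collapse to the same patterns here
    data.frame(peptide = peptides, pattern = patterns, unique = as.vector(unique_id))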
This dissertation presents methodological advances in the field of frailty models. Time-to-event data are very common in biomedical applications. Multivariate data, such as recurrent event histories or clustered failures, typically require special modeling techniques. Frailty models extend proportional hazards models by employing random effects. In this book, the first chapter provides an introduction to the field of frailty models. The second chapter studies the small-sample properties of such models in more depth; in particular, it addresses the question of identifiability of frailty models. The third chapter describes a score test that may be used to test whether a recurrent event process is associated with a terminal event. The fourth chapter studies the problem of event-dependent selection in the context of recurrent events data. This problem is relevant when the data are collected from registries. The fifth chapter introduces a novel R package and presents an overview of available software for estimating frailty models.
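For readers unfamiliar with the model class, a minimal shared (gamma) frailty fit is sketched below using the survival package and its bundled rats data. This is a generic illustration of the model class, not the package introduced in the fifth chapter.

    # Generic gamma frailty fit with the survival package: litters of rats act
    # as clusters, and the frailty term captures shared, unobserved cluster-level risk.
    library(survival)

    fit <- coxph(Surv(time, status) ~ rx + frailty(litter, distribution = "gamma"),
                 data = rats)
    summary(fit)  # treatment effect plus the estimated frailty variance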
Guler, A.T.; Waaijer, C.J.F.; Mohammed, Y.; Palmblad, M. 2016
Quantitative analysis of the scientific literature is a frequent task in bibliometrics. Several large online resources collect and disseminate bibliographic information, paving the way for broad analyses and statistics. Europe PubMed Central (PMC), with its Web Services, is one of these resources, providing a rich platform to retrieve information and metadata on scientific publications. However, a complete bibliometric analysis that involves gathering information and deriving statistics on an author, topic, or country is laborious when consuming Web Services on the command line or using low-level automation. In contrast, scientific workflow managers can integrate different types of software tools to automate multi-step processes. The Taverna workflow engine is a popular open-source scientific workflow manager, giving easy access to available Web Services. In this tutorial, we demonstrate how to design scientific workflows for bibliometric analyses in Taverna by integrating Europe PubMed Central Web Services and statistical analysis tools. To our knowledge, this is also the first time scientific workflow managers have been used to perform bibliometric analyses using these Web Services. (C) 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
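The snippet below sketches, in plain R rather than in Taverna, the kind of request such a workflow step issues: it queries the Europe PMC RESTful search service and inspects the hit count. The endpoint, parameter names and JSON fields follow the public API as we understand it, and the author query is only an example; adjust if the service has changed.

    # Query the Europe PMC RESTful search service and count matching records.
    library(httr)
    library(jsonlite)

    resp <- GET("https://www.ebi.ac.uk/europepmc/webservices/rest/search",
                query = list(query    = 'AUTH:"Palmblad M" AND SRC:MED',  # example query
                             format   = "json",
                             pageSize = 100))
    hits <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))

    hits$hitCount                       # total number of matching records
    head(hits$resultList$result$title)  # titles of the first page of results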
Scientific workflows organize the assembly of specialized software into an overall data flow and are particularly well suited for multi-step analyses using different types of software tools. They are also favorable in terms of reusability, as previously designed workflows can be made publicly available through the myExperiment community and then used in other workflows. We here illustrate how scientific workflows, and the Taverna workbench in particular, can be used in bibliometrics. We discuss the specific capabilities of Taverna that make this software a powerful tool in this field, such as automated data import via Web services, data extraction from XML by XPaths, and statistical analysis and visualization with R. The support of the latter is particularly relevant, as it allows integration of a number of recently developed R packages specifically for bibliometrics. Examples are used to illustrate the possibilities of Taverna in the fields of bibliometrics and scientometrics.
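To give a flavor of the XPath extraction step, the fragment below performs the same kind of extraction in plain R with the xml2 package on a mock XML response. The element names mimic, but are not guaranteed to match exactly, the structure returned by Europe PMC; inside Taverna this step would be a dedicated XPath service instead.

    # Extract publication years from an XML response with an XPath expression.
    library(xml2)

    xml_snippet <- '<responseWrapper>
      <resultList>
        <result><title>Paper one</title><pubYear>2014</pubYear></result>
        <result><title>Paper two</title><pubYear>2016</pubYear></result>
      </resultList>
    </responseWrapper>'

    doc   <- read_xml(xml_snippet)
    years <- xml_text(xml_find_all(doc, "//result/pubYear"))
    table(years)  # publications per year, ready for plotting or further statistics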