Objectives: To systematically evaluate the performance of COVID-19 prognostic models and scores for mortality risk in older populations across three health-care settings: hospitals, primary care, and nursing homes. Study Design and Setting: This retrospective external validation study included 14,092 older individuals (≥70 years of age) with a clinical or polymerase chain reaction-confirmed COVID-19 diagnosis from March 2020 to December 2020. The six validation cohorts include three hospital-based cohorts (CliniCo, COVID-OLD, COVID-PREDICT), two primary care-based cohorts (Julius General Practitioners Network/Academisch network huisartsgeneeskunde/Network of Academic general Practitioners, PHARMO), and one nursing home cohort (YSIS) in the Netherlands. Based on a living systematic review of COVID-19 prediction models, using the Prediction model Risk Of Bias ASsessment Tool (PROBAST) for quality and risk of bias assessment and considering predictor availability in the validation cohorts, we selected six prognostic models predicting mortality risk in adults with COVID-19 infection (GAL-COVID-19 mortality, 4C Mortality Score, National Early Warning Score 2-extended model, Xie model, Wang clinical model, and CURB65 score). All six prognostic models were validated in the hospital cohorts, and the GAL-COVID-19 mortality model was validated in all three health-care settings. The primary outcome was in-hospital mortality for hospitals and 28-day mortality for primary care and nursing home settings. Model performance was evaluated in each validation cohort separately in terms of discrimination, calibration, and decision curves. For models showing miscalibration, an intercept update was performed, followed by re-evaluation of predictive performance. Results: All six prognostic models performed poorly and showed miscalibration in the older population cohorts. In the hospital settings, calibration-in-the-large ranged from 1.45 to 7.46, calibration slopes from 0.24 to 0.81, and C-statistics from 0.55 to 0.71, with the 4C Mortality Score performing as the most discriminative and best calibrated model. Performance across health-care settings was similar for the GAL-COVID-19 model, with calibration-in-the-large in the range of −2.35 to −0.15 indicating overestimation, calibration slopes of 0.24–0.81 indicating overfitting, and C-statistics of 0.55–0.71. Conclusion: Our results show that most prognostic models for predicting mortality risk performed poorly in the older population with COVID-19 in each health-care setting: hospital, primary care, and nursing homes. Insights into factors influencing predictive model performance in the older population are needed for pandemic preparedness and reliable prognostication of health-related outcomes in this demographic.
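As context for the abstract above, a minimal sketch in R (hypothetical data frame and column names, not the study's code) of how the reported validation metrics, calibration-in-the-large, calibration slope, and the C-statistic, can be computed for a binary-outcome model, together with a simple intercept update:

```r
## Minimal sketch (hypothetical data/column names): external validation of a
## logistic prediction model, plus a simple intercept update.
library(pROC)  # for the C-statistic (AUC)

# 'val' is assumed to hold the validation cohort: observed outcome 'died' (0/1)
# and the model's predicted probability 'p_hat' for each patient.
lp <- qlogis(val$p_hat)                       # linear predictor (log-odds)

# Calibration-in-the-large: intercept of a logistic model with the linear
# predictor as an offset (0 = perfect overall calibration).
citl  <- coef(glm(died ~ offset(lp), family = binomial, data = val))[1]

# Calibration slope: coefficient of the linear predictor (1 = ideal).
slope <- coef(glm(died ~ lp, family = binomial, data = val))[2]

# Discrimination: C-statistic.
cstat <- roc(val$died, val$p_hat)$auc

# Intercept update for a miscalibrated model: re-estimate the intercept while
# keeping the original coefficients fixed, then recompute updated risks.
val$p_updated <- plogis(lp + citl)
```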
Roi-Teeuw, H.M. la; Luijken, K.; Blom, M.T.; Gussekloo, J.; Mooijaart, S.P.; Polinder-Bos, H.A.; ... ; Dries, C.J. van den 2024
Background: During the COVID-19 pandemic, older patients in primary care were triaged based on their frailty or assumed vulnerability to poor outcomes, while evidence on the prognostic value of vulnerability measures in COVID-19 patients in primary care was lacking. Still, knowledge on the role of vulnerability is pivotal in understanding the resilience of older people during acute illness, and hence important for future pandemic preparedness. Therefore, we assessed the predictive value of different routine care-based vulnerability measures in addition to age and sex for 28-day mortality in an older primary care population of patients with COVID-19. Methods: From primary care medical records in three routinely collected Dutch primary care databases, we included all patients aged 70 years or older with a COVID-19 diagnosis registration in 2020 and 2021. All-cause mortality was predicted using logistic regression based on age and sex only (basic model), separately adding each of six vulnerability measures: renal function, cognitive impairment, number of chronic drugs, Charlson Comorbidity Index, Chronic Comorbidity Score, and a Frailty Index. Predictive performance of the basic model and the six vulnerability models was compared in terms of the area under the receiver operating characteristic curve (AUC), the index of prediction accuracy, and the distribution of predicted risks. Results: Of the 4,065 included patients, 9% died within 28 days after COVID-19 diagnosis. Predicted mortality risk ranged between 7% and 26% for the basic model including age and sex, widening to 4–41% with the addition of the comorbidity-based vulnerability measures (Charlson Comorbidity Index, Chronic Comorbidity Score), which more closely reflect impaired organ functioning. Similarly, the AUC of the basic model increased slightly from 0.69 (95% CI 0.66–0.72) to 0.74 (95% CI 0.71–0.76) with the addition of either of these comorbidity scores. Addition of a Frailty Index, renal function, the number of chronic drugs, or cognitive impairment yielded no substantial change in predictions. Conclusion: In our dataset of older COVID-19 patients in primary care, the 28-day mortality fraction was substantial at 9%. Six different vulnerability measures had little incremental predictive value in addition to age and sex in predicting short-term mortality.
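For readers who want to reproduce the modeling approach described above, a rough R sketch (variable names hypothetical; the abstract's index of prediction accuracy is omitted): fit the basic age-sex model, add one vulnerability measure, and compare AUCs and the spread of predicted risks. In-sample AUCs are optimistic; the study itself will have used appropriate validation.

```r
## Illustrative sketch (hypothetical variable names): a basic age-sex model
## versus a model adding one vulnerability measure.
library(pROC)

basic <- glm(died28 ~ age + sex, family = binomial, data = cohort)
ext   <- glm(died28 ~ age + sex + charlson, family = binomial, data = cohort)

auc_basic <- roc(cohort$died28, predict(basic, type = "response"))$auc
auc_ext   <- roc(cohort$died28, predict(ext,   type = "response"))$auc

# The spread of predicted risks shows whether the added measure separates
# low- from high-risk patients beyond age and sex alone.
range(predict(basic, type = "response"))
range(predict(ext,   type = "response"))
```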
Geersing, G.J.; Takada, T.; Klok, F.A.; Büller, H.R.; Courtney, D.M.; Freund, Y.; ... ; Es, N. van 2024
Background: In patients clinically suspected of having pulmonary embolism (PE), physicians often rely on intuitive estimation ("gestalt") of PE presence. Although shown to be predictive, gestalt is criticized for its assumed variation across physicians and lack of standardization. Objectives: To assess the diagnostic accuracy of gestalt in the diagnosis of PE and gain insight into its possible variation. Methods: We performed an individual patient data meta-analysis including patients suspected of having PE. The primary outcome was the diagnostic accuracy of gestalt for the diagnosis of PE, quantified as the risk ratio (RR) between gestalt and PE based on two-stage random-effects log-binomial meta-analysis regression, as well as gestalt's sensitivity and specificity. The variability of these measures was explored across different health care settings, publication period, PE prevalence, patient subgroups (sex, heart failure, chronic lung disease, and items of the Wells score other than gestalt), and age. Results: We analyzed 20,770 patients suspected of having PE from 16 original studies. The prevalence of PE in patients with and without a positive gestalt was 28.8% vs 9.1%, respectively. The overall RR was 3.02 (95% CI, 2.35-3.87), and the overall sensitivity and specificity were 74% (95% CI, 68%-79%) and 61% (95% CI, 53%-68%), respectively. Although variation was observed across individual studies (I² = 90.63%), the diagnostic accuracy was consistent across all subgroups and health care settings. Conclusion: A positive gestalt was associated with a 3-fold increased risk of PE in suspected patients. Although variation was observed across studies, the RR of gestalt was similar across prespecified subgroups and health care settings, exemplifying its diagnostic value for all patients suspected of having PE.
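A worked toy example in R, with invented counts chosen to mirror the reported prevalences (28.8% and 9.1%); the crude estimates below differ slightly from the pooled random-effects estimates in the abstract:

```r
## Toy 2x2 table (invented counts): sensitivity, specificity, and risk ratio
## of a positive gestalt.
tab <- matrix(c(288, 712,    # gestalt positive: PE yes / PE no
                 91, 909),   # gestalt negative: PE yes / PE no
              nrow = 2, byrow = TRUE,
              dimnames = list(gestalt = c("positive", "negative"),
                              PE = c("yes", "no")))

sens <- tab["positive", "yes"] / sum(tab[, "yes"])   # 288/379  ~ 0.76
spec <- tab["negative", "no"]  / sum(tab[, "no"])    # 909/1621 ~ 0.56
rr   <- (tab["positive", "yes"] / sum(tab["positive", ])) /
        (tab["negative", "yes"] / sum(tab["negative", ]))  # 0.288/0.091 ~ 3.2
```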
Aims: Risk stratification is used for decisions regarding need for imaging in patients with clinically suspected acute pulmonary embolism (PE). The aim was to develop a clinical prediction model that provides an individualized, accurate probability estimate for the presence of acute PE in patients with suspected disease based on readily available clinical items and D-dimer concentrations. Methods and results: An individual patient data meta-analysis was performed based on sixteen cross-sectional or prospective studies with data from 28,305 adult patients with clinically suspected PE from various clinical settings, including primary care, emergency care, hospitalized and nursing home patients. A multilevel logistic regression model was built and validated including ten a priori defined objective candidate predictors to predict objectively confirmed PE at baseline or venous thromboembolism (VTE) during follow-up of 30 to 90 days. Multiple imputation was used for missing data. Backward elimination was performed with a P-value <0.10. Discrimination (c-statistic with 95% confidence intervals [CI] and prediction intervals [PI]) and calibration (observed:expected [O:E] ratio and calibration plot) were evaluated based on internal-external cross-validation. The accuracy of the model was subsequently compared with algorithms based on the Wells score and D-dimer testing. The final model included age (in years), sex, previous VTE, recent surgery or immobilization, haemoptysis, cancer, clinical signs of deep vein thrombosis, inpatient status, D-dimer (in µg/L), and an interaction term between age and D-dimer. The pooled c-statistic was 0.87 (95% CI, 0.85–0.89; 95% PI, 0.77–0.93) and overall calibration was very good (pooled O:E ratio, 0.99; 95% CI, 0.87–1.14; 95% PI, 0.55–1.79). The model slightly overestimated VTE probability in the lower range of estimated probabilities. Discrimination of the current model in the validation data sets was better than that of the Wells score combined with a D-dimer threshold based on age (c-statistic 0.73; 95% CI, 0.70–0.75) or structured clinical pretest probability (c-statistic 0.79; 95% CI, 0.76–0.81). Conclusion: The present model provides an absolute, individualized probability of PE presence in a broad population of patients with suspected PE, with very good discrimination and calibration. Its clinical utility needs to be evaluated in a prospective management or impact study.
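To illustrate the structure of the final model, a rough single-level sketch in R (hypothetical data frame and variable names); the published model was a multilevel model validated by internal-external cross-validation, which this sketch does not reproduce:

```r
## Rough sketch (hypothetical data frame 'ipd'): a logistic model with the
## ten predictors listed above and an age-by-D-dimer interaction.
fit <- glm(vte ~ age + sex + prev_vte + surgery_immob + haemoptysis +
                 cancer + dvt_signs + inpatient + ddimer + age:ddimer,
           family = binomial, data = ipd)

# Individualized predicted probability of PE/VTE for a new patient:
predict(fit, newdata = new_patient, type = "response")
```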
Lohmann, A.; Groenwold, R.H.H.; Smeden, M. van 2023
Logistic regression is one of the most commonly used approaches to develop clinical risk prediction models. Developers of such models often rely on approaches that aim to minimize the risk of overfitting and improve the predictive performance of the logistic model, such as likelihood penalization and variance decomposition techniques. We present an extensive simulation study that compares the out-of-sample predictive performance of risk prediction models derived using the elastic net, with lasso and ridge as special cases, and variance decomposition techniques, namely incomplete principal component regression and incomplete partial least squares regression. We varied the expected events per variable, event fraction, number of candidate predictors, presence of noise predictors, and presence of sparse predictors in a full-factorial design. Predictive performance was compared on measures of discrimination, calibration, and prediction error. Simulation metamodels were derived to explain the performance differences within model derivation approaches. Our results indicate that, on average, prediction models developed using penalization and variance decomposition approaches outperform models developed using ordinary maximum likelihood estimation, with penalization approaches being consistently superior to the variance decomposition approaches. Differences in performance were most pronounced in the calibration of the model. Performance differences regarding prediction error and concordance statistic were often small between approaches. The use of likelihood penalization and variance decomposition techniques is illustrated in the context of peripheral arterial disease.
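A minimal glmnet sketch of the penalization approaches compared in the abstract, assuming a numeric predictor matrix x, a 0/1 outcome y, and new data x_new; the variance decomposition methods are not shown:

```r
## Elastic net for a binary outcome, with ridge (alpha = 0) and lasso
## (alpha = 1) as the special cases mentioned above.
library(glmnet)

cv_enet  <- cv.glmnet(x, y, family = "binomial", alpha = 0.5)  # elastic net
cv_ridge <- cv.glmnet(x, y, family = "binomial", alpha = 0)    # ridge
cv_lasso <- cv.glmnet(x, y, family = "binomial", alpha = 1)    # lasso

# Risk predictions for new patients at the cross-validated penalty:
p_hat <- predict(cv_enet, newx = x_new, s = "lambda.min", type = "response")
```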
Trinks-Roerdink, E.M.; Geersing, G.J.; Hemels, M.E.W.; Gelder, I.C. van; Klok, F.A.; Smeden, M. van; ... ; Doorn, S. van 2023
Objective: Patients with cancer are at increased bleeding risk, and anticoagulants increase this risk even more. Yet, validated bleeding risk models for prediction of bleeding risk in patients with cancer are lacking. The aim of this study was to predict bleeding risk in anticoagulated patients with cancer. Methods: We performed a study using the routine healthcare database of the Julius General Practitioners' Network. Five bleeding risk models were selected for external validation. Patients with a new cancer episode during anticoagulant treatment or those initiating anticoagulation during active cancer were included. The outcome was the composite of major bleeding and clinically relevant non-major (CRNM) bleeding. Next, we internally validated an updated bleeding risk model accounting for the competing risk of death. Results: The validation cohort consisted of 1,304 patients with cancer, mean age 74.0±10.9 years, 52.2% male. In total, 215 (16.5%) patients developed a first major or CRNM bleeding during a mean follow-up of 1.5 years (incidence rate 11.0 per 100 person-years; 95% CI 9.6 to 12.5). The c-statistics of all selected bleeding risk models were low, around 0.56. Internal validation of an updated model accounting for death as a competing risk showed a slightly improved c-statistic of 0.61 (95% CI 0.54 to 0.70). On updating, only age and a history of bleeding appeared to contribute to the prediction of bleeding risk. Conclusions: Existing bleeding risk models cannot accurately differentiate bleeding risk between patients. Future studies may use our updated model as a starting point for further development of bleeding risk models in patients with cancer.
Calster, B. van; Steyerberg, E.W.; Wynants, L.; Smeden, M. van 2023
Background: Clinical prediction models should be validated before implementation in clinical practice. But is favorable performance at internal validation or one external validation sufficient to claim that a prediction model works well in the intended clinical context? Main body: We argue to the contrary because (1) patient populations vary, (2) measurement procedures vary, and (3) populations and measurements change over time. Hence, we have to expect heterogeneity in model performance between locations and settings, and across time. It follows that prediction models are never truly validated. This does not imply that validation is not important. Rather, the current focus on developing new models should shift to a focus on more extensive, well-conducted, and well-reported validation studies of promising models. Conclusion: Principled validation strategies are needed to understand and quantify heterogeneity, monitor performance over time, and update prediction models when appropriate. Such strategies will help to ensure that prediction models stay up-to-date and safe to support clinical decision-making.
McLernon, D.J.; Giardiello, D.; Calster, B. van; Wynants, L.; Geloven, N. van; Smeden, M. van; ... ; STRATOS Initiative 2022
Risk prediction models need thorough validation to assess their performance. Validation of models for survival outcomes poses challenges due to the censoring of observations and the varying time horizon at which predictions can be made. This article describes measures to evaluate predictions and the potential improvement in decision making from survival models based on Cox proportional hazards regression. As a motivating case study, the authors consider the prediction of the composite outcome of recurrence or death (the "event") in patients with breast cancer after surgery. They developed a simple Cox regression model with three predictors, as in the Nottingham Prognostic Index, in 2,982 women (1,275 events over 5 years of follow-up) and externally validated this model in 686 women (285 events over 5 years). Improvement in performance was assessed after the addition of progesterone receptor as a prognostic biomarker. The model predictions can be evaluated across the full range of observed follow-up times or for the event occurring by the end of a fixed time horizon of interest. The authors first discuss recommended statistical measures that evaluate model performance in terms of discrimination, calibration, or overall performance. Further, they evaluate the potential clinical utility of the model to support clinical decision making according to a net benefit measure. They provide SAS and R code to illustrate internal and external validation. The authors recommend the proposed set of performance measures for transparent reporting of the validity of predictions from survival models.
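A schematic R example of the kind of workflow the article covers (variable names and the 5-year horizon are assumptions, not the authors' published code, which accompanies the article):

```r
## Fit a 3-predictor Cox model in development data 'dev' and evaluate
## discrimination and fixed-horizon calibration in external data 'ext'.
library(survival)

fit <- coxph(Surv(time, event) ~ nodes + size + grade, data = dev)

# Discrimination over the observed follow-up: Harrell's C in external data.
concordance(fit, newdata = ext)

# Calibration at a fixed 5-year horizon: compare mean predicted risk with
# the observed Kaplan-Meier event probability in the external cohort.
p5_pred <- 1 - summary(survfit(fit, newdata = ext), times = 5)$surv
km      <- summary(survfit(Surv(time, event) ~ 1, data = ext), times = 5)
c(mean_predicted = mean(p5_pred), observed = 1 - km$surv)
```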
Geloven, N. van; Giardiello, D.; Bonneville, E.F.; Teece, L.; Ramspek, C.L.; Smeden, M. van; ... ; STRATOS Initiative 2022
A common view in epidemiology is that automated confounder selection methods, such as backward elimination, should be avoided, as they can lead to biased effect estimates and underestimation of their variance. Nevertheless, backward elimination remains regularly applied. We investigated if and under which conditions causal effect estimation in observational studies can improve by using backward elimination on a prespecified set of potential confounders. An expression was derived that quantifies how variable omission relates to the bias and variance of effect estimators. Additionally, 3,960 scenarios were defined and investigated by simulations comparing the bias and mean squared error (MSE) of the conditional log odds ratio, log(cOR), and the marginal log risk ratio, log(mRR), between full models including all prespecified covariates and backward elimination of these covariates. Applying backward elimination resulted in a mean bias of 0.03 for log(cOR) and 0.02 for log(mRR), compared to 0.56 and 0.52, respectively, for a model without any covariate adjustment, and no bias for the full model. In less than 3% of the scenarios considered, the MSE of the log(cOR) or log(mRR) was slightly lower (at most 3%) when backward elimination was used compared to the full model. When an initial set of potential confounders can be specified based on background knowledge, there is minimal added value of backward elimination. We advise not to use it, and otherwise to provide ample arguments supporting its use.
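A toy R sketch of the comparison made in the abstract, using simulated data; note that step() eliminates by AIC rather than the P-value threshold used in the paper, so this only mimics the idea:

```r
## Compare the exposure log odds ratio from a full model adjusting for all
## prespecified covariates with one obtained after backward elimination.
set.seed(1)
n  <- 2000
c1 <- rnorm(n); c2 <- rnorm(n); c3 <- rnorm(n)
x  <- rbinom(n, 1, plogis(0.5 * c1 + 0.3 * c2))            # exposure
y  <- rbinom(n, 1, plogis(-1 + 0.5 * x + 0.7 * c1 + 0.2 * c3))

full <- glm(y ~ x + c1 + c2 + c3, family = binomial)
bwe  <- step(full, scope = list(lower = ~ x),              # keep exposure in
             direction = "backward", trace = 0)

c(full = coef(full)["x"], backward = coef(bwe)["x"])       # conditional log(OR)
```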
Heinze, G.; Smeden, M. van; Wynants, L.; Steyerberg, E.; Calster, B. van 2022
While the opportunities of ML and AI in healthcare are promising, the growth of complex data-driven prediction models requires careful quality and applicability assessment before they are applied and disseminated in daily practice. This scoping review aimed to identify actionable guidance for those closely involved in AI-based prediction model (AIPM) development, evaluation, and implementation, including software engineers, data scientists, and healthcare professionals, and to identify potential gaps in this guidance. We performed a scoping review of the relevant literature providing guidance or quality criteria regarding the development, evaluation, and implementation of AIPMs using a comprehensive multi-stage screening strategy. PubMed, Web of Science, and the ACM Digital Library were searched, and AI experts were consulted. Topics were extracted from the identified literature and summarized across the six phases at the core of this review: (1) data preparation, (2) AIPM development, (3) AIPM validation, (4) software development, (5) AIPM impact assessment, and (6) AIPM implementation into daily healthcare practice. From 2,683 unique hits, 72 relevant guidance documents were identified. Substantial guidance was found for data preparation, AIPM development, and AIPM validation (phases 1-3), while the later phases (software development, impact assessment, and implementation) have clearly received less attention in the scientific literature. The six phases of the AIPM development, evaluation, and implementation cycle provide a framework for the responsible introduction of AI-based prediction models in healthcare. Additional domain- and technology-specific research may be necessary, and more practical experience with implementing AIPMs is needed to support further guidance.
Boone, S.C.; Smeden, M. van; Rosendaal, F.R.; Cessie, S. le; Groenwold, R.H.H.; Jukema, J.W.; ... ; Mutsert, R. de 2022
Visceral adipose tissue (VAT) is a strong prognostic factor for cardiovascular disease and a potential target for cardiovascular risk stratification. Because VAT is difficult to measure in clinical practice, we estimated prediction models with predictors routinely measured in general practice and VAT as outcome using ridge regression in 2,501 middle-aged participants from the Netherlands Epidemiology of Obesity study, 2008-2012. Adding waist circumference and other anthropometric measurements on top of the routinely measured variables improved the optimism-adjusted R² from 0.50 to 0.58, with a decrease in the root-mean-square error (RMSE) from 45.6 to 41.5 cm² and with overall good calibration. Further addition of predominantly lipoprotein-related metabolites from the Nightingale platform did not improve the optimism-corrected R² and RMSE. The models were externally validated in 370 participants from the Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS, 2006-2009) and 1,901 participants from the Multi-Ethnic Study of Atherosclerosis (MESA, 2000-2007). Performance was comparable to the development setting in PIVUS (R² = 0.63, RMSE = 42.4 cm², calibration slope = 0.94) but lower in MESA (R² = 0.44, RMSE = 60.7 cm², calibration slope = 0.75). Our findings indicate that the estimation of VAT with routine clinical measurements can be substantially improved by incorporating waist circumference, but not by metabolite measurements.
Geloven, N. van; Giardiello, D.; Bonneville, E.F.; Teece, L.; Ramspek, C.L.; Smeden, M. van; ... ; Steyerberg, E. 2022
Thorough validation is pivotal for any prediction model before it can be advocated for use in medical practice. For time-to-event outcomes such as breast cancer recurrence, death from other causes is a competing risk. Model performance measures must account for such competing events. In this article, we present a comprehensive yet accessible overview of performance measures for this competing event setting, including the calculation and interpretation of statistical measures for calibration, discrimination, overall prediction error, and clinical usefulness by decision curve analysis. All methods are illustrated for patients with breast cancer, with publicly available data and R code.
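A hedged R sketch of one of the measures covered above (hypothetical validation data 'val' with status 0 = censored, 1 = recurrence, 2 = death): the observed cumulative incidence of the event of interest from an Aalen-Johansen estimator, compared with the mean predicted risk; the article's own R code covers the full set of measures:

```r
## Observed versus predicted 5-year recurrence risk, accounting for the
## competing risk of death via the Aalen-Johansen estimator.
library(survival)

aj <- survfit(Surv(time, factor(status, 0:2,
                                c("censor", "recurrence", "death"))) ~ 1,
              data = val)
obs5 <- summary(aj, times = 5)$pstate[, "recurrence"]  # observed 5-year risk

# 'val$pred5' is assumed to hold the model's predicted 5-year recurrence
# risk; an O/E ratio close to 1 indicates good calibration-in-the-large.
obs5 / mean(val$pred5)
```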
Ramspek, C.L.; Teece, L.; Snell, K.I.E.; Evans, M.; Riley, R.D.; Smeden, M. van; ... ; Diepen, M. van 2021
Background: External validation of prognostic models is necessary to assess the accuracy and generalizability of the model in new patients. If models are validated in a setting in which competing events occur, these competing risks should be accounted for when comparing predicted risks to observed outcomes. Methods: We discuss existing measures of calibration and discrimination that incorporate competing events for time-to-event models. These methods are illustrated using a clinical-data example concerning the prediction of kidney failure in a population with advanced chronic kidney disease (CKD), using the guideline-recommended Kidney Failure Risk Equation (KFRE). The KFRE was developed using Cox regression in a diverse population of CKD patients and has been proposed for use in patients with advanced CKD, in whom death is a frequent competing event. Results: When validating the 5-year KFRE with methods that account for competing events, it becomes apparent that the 5-year KFRE considerably overestimates the real-world risk of kidney failure. The absolute overestimation was 10 percentage points on average and 29 percentage points in older high-risk patients. Conclusions: It is crucial that competing events are accounted for during external validation to provide a more reliable assessment of the performance of a model in clinical settings in which competing risks occur.
Background: How diagnostic strategies for suspected pulmonary embolism (PE) perform in relevant patient subgroups defined by sex, age, cancer, and previous venous thromboembolism (VTE) is unknown. Purpose: To evaluate the safety and efficiency of the Wells and revised Geneva scores combined with fixed and adapted D-dimer thresholds, as well as the YEARS algorithm, for ruling out acute PE in these subgroups. Data Sources: MEDLINE from 1 January 1995 until 1 January 2021. Study Selection: 16 studies assessing at least 1 diagnostic strategy. Data Extraction: Individual-patient data from 20,553 patients. Data Synthesis: Safety was defined as the diagnostic failure rate (the predicted 3-month VTE incidence after exclusion of PE without imaging at baseline). Efficiency was defined as the proportion of individuals classified by the strategy as "PE considered excluded" without imaging tests. Across all strategies, efficiency was highest in patients younger than 40 years (47% to 68%) and lowest in patients aged 80 years or older (6.0% to 23%) or patients with cancer (9.6% to 26%). However, efficiency improved considerably in these subgroups when pretest probability-dependent D-dimer thresholds were applied. Predicted failure rates were highest for strategies with adapted D-dimer thresholds, with failure rates varying between 2% and 4% in the predefined patient subgroups. Limitations: Between-study differences in scoring predictor items and D-dimer assays, as well as the presence of differential verification bias, in particular for classifying fatal events and subsegmental PE cases, all of which may have led to an overestimation of the predicted failure rates of adapted D-dimer thresholds. Conclusion: Overall, all strategies showed acceptable safety, with pretest probability-dependent D-dimer thresholds having not only the highest efficiency but also the highest predicted failure rate. From an efficiency perspective, this individual-patient data meta-analysis supports application of adapted D-dimer thresholds. Primary Funding Source: Dutch Research Council. (PROSPERO: CRD42018089366)
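A toy illustration in R (invented counts) of the safety and efficiency definitions used above:

```r
## "Efficiency" and "failure rate" of a rule-out strategy, as defined in the
## abstract, from hypothetical counts.
n_total     <- 10000
n_ruled_out <- 4000   # PE considered excluded without imaging
n_vte_3mo   <- 80     # VTE within 3 months among those ruled out

efficiency   <- n_ruled_out / n_total    # 0.40 -> 40% spared imaging
failure_rate <- n_vte_3mo / n_ruled_out  # 0.02 -> 2% diagnostic failures
```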
Nab, L.; Smeden, M. van; Mutsert, R. de; Rosendaal, F.R.; Groenwold, R.H.H. 2021
Statistical correction for measurement error in epidemiologic studies is possible, provided that information about the measurement error model and its parameters is available. Such information is commonly obtained from a randomly sampled internal validation sample. It is, however, unknown whether randomly sampling the internal validation sample is the optimal sampling strategy. We conducted a simulation study to investigate various internal validation sampling strategies in conjunction with regression calibration. Our simulation study showed that for an internal validation sample of 40% of the main study's sample size, stratified random and extremes sampling had a small efficiency gain over random sampling (10% and 12% decrease on average over all scenarios, respectively). The efficiency gain was more pronounced in smaller validation samples of 10% of the main study's sample size (i.e., a 31% and 36% decrease on average over all scenarios, for stratified random and extremes sampling, respectively). To mitigate the bias due to measurement error in epidemiologic studies, small efficiency gains can be achieved with internal validation sampling strategies other than random sampling, but only when the measurement error is nondifferential. For regression calibration, the gain in efficiency is, however, at the cost of a higher percentage bias and lower coverage.
Nab, L.; Smeden, M. van; Keogh, R.H.; Groenwold, R.H.H. 2021
Measurement error in a covariate or the outcome of regression models is common but often ignored, even though measurement error can lead to substantial bias in the estimated covariate-outcome association. While several texts on measurement error correction methods are available, these methods remain seldom applied. To improve the use of measurement error correction methodology, we developed mecor, an R package that implements measurement error correction methods for regression models with a continuous outcome. Measurement error correction requires information about the measurement error model and its parameters. This information can be obtained from four types of studies used to estimate the parameters of the measurement error model: an internal validation study, a replicates study, a calibration study, and an external validation study. In the package mecor, regression calibration methods and a maximum likelihood method are implemented to correct for measurement error in a continuous covariate in regression analyses. Additionally, method of moments approaches are implemented to correct for measurement error in the continuous outcome in regression analyses. Variance estimation of the corrected estimators is provided in closed form and using the bootstrap.
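A minimal sketch of what regression calibration for a mismeasured covariate does, written out by hand in base R with hypothetical variable names; mecor automates this, including closed-form and bootstrap variance estimation:

```r
## Regression calibration by hand: 'xstar' is the error-prone covariate,
## 'x' the reference measurement available only in an internal validation
## subset, 'z' an error-free covariate, 'y' the continuous outcome.
cal <- lm(x ~ xstar + z, data = dat, subset = !is.na(x))  # calibration model
dat$x_cal <- predict(cal, newdata = dat)                  # E[X | X*, Z]

naive     <- lm(y ~ xstar + z, data = dat)  # biased covariate-outcome estimate
corrected <- lm(y ~ x_cal + z, data = dat)  # regression-calibrated estimate

# Standard errors for 'corrected' should come from the bootstrap, since lm's
# default variance estimates ignore the two-stage estimation.
```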
Background: Guidance reports for observational comparative effectiveness and drug safety research recommend implementing a new-user design whenever possible, since it reduces the risk of selection bias in exposure effect estimation compared to a prevalent-user design. The uptake of this guidance has not been studied extensively. Methods: We reviewed 89 observational effectiveness and safety cohort studies published in six pharmacoepidemiological journals in 2018 and 2019. We developed an extraction tool to assess how frequently new-user and prevalent-user designs were reported to be implemented. For studies that implemented a new-user design in both treatment arms, we extracted information about the extent to which the moment of meeting eligibility criteria, treatment initiation, and start of follow-up were reported to be aligned. Results: Of the 89 studies included, 40% reported implementing a new-user design for both the study exposure arm and the comparator arm, while 13% reported implementing a prevalent-user design in both arms. The moment of meeting eligibility criteria, treatment initiation, and start of follow-up were reported to be aligned in both treatment arms in 53% of studies that reported implementing a new-user design. We provide examples of studies that minimized the risk of introducing bias due to an unclear definition of the time origin in unexposed participants, immortal time, or a time lag. Conclusions: Almost half of the included studies reported implementing a new-user design. Implications of misalignment of the study design origin were difficult to assess because this would require explicit reporting of the target estimand in the original studies. We recommend that the choice of a particular study time origin be explicitly motivated to enable assessment of the validity of the study.