Objectives: To systematically evaluate the performance of COVID-19 prognostic models and scores for mortality risk in older populations across three health-care settings: hospitals, primary care, and nursing homes. Study Design and Setting: This retrospective external validation study included 14,092 older individuals aged >=70 years with a clinical or polymerase chain reaction-confirmed COVID-19 diagnosis from March 2020 to December 2020. The six validation cohorts include three hospital-based (CliniCo, COVID-OLD, COVID-PREDICT), two primary care-based (Julius General Practitioners Network/Academisch network huisartsgeneeskunde/Network of Academic general Practitioners, PHARMO), and one nursing home cohort (YSIS) in the Netherlands. Based on a living systematic review of COVID-19 prediction models using Prediction model Risk Of Bias ASsessment Tool for quality and risk of bias assessment and considering predictor availability in validation cohorts, we selected six prognostic models predicting mortality risk in adults with COVID-19 infection (GAL-COVID-19 mortality, 4C Mortality Score, National Early Warning Score 2-extended model, Xie model, Wang clinical model, and CURB65 score). All six prognostic models were validated in the hospital cohorts and the GAL-COVID-19 mortality model was validated in all three healthcare settings. The primary outcome was in-hospital mortality for hospitals and 28-day mortality for primary care and nursing home settings. Model performance was evaluated in each validation cohort separately in terms of discrimination, calibration, and decision curves. An intercept update was performed in models indicating miscalibration followed by predictive performance re-evaluation.
Main Outcome Measure: In-hospital mortality for hospitals and 28-day mortality for primary care and nursing home settings. Results: All six prognostic models performed poorly and showed miscalibration in the older population cohorts. In the hospital settings, model performance ranged from calibration-in-the-large 1.45 to 7.46, calibration slopes 0.24–0.81, and C-statistic 0.55–0.71 with 4C Mortality Score performing as the most discriminative and well-calibrated model. Performance across health-care settings was similar for the GAL-COVID-19 model, with a calibration-in-the-large in the range of 2.35 to 0.15 indicating overestimation, calibration slopes of 0.24–0.81 indicating signs of overfitting, and C-statistic of 0.55–0.71. Conclusion: Our results show that most prognostic models for predicting mortality risk performed poorly in the older population with COVID-19, in each health-care setting: hospital, primary care, and nursing home settings. Insights into factors influencing predictive model performance in the older population are needed for pandemic preparedness and reliable prognostication of health-related outcomes in this demographic.
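The validation measures named in this abstract (C-statistic, calibration-in-the-large, and an intercept update) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the ten outcomes and predicted risks are invented toy data, and the study's own software and estimation details are not described in the abstract. Calibration-in-the-large is computed here as the intercept of a logistic model that uses the original linear predictor as an offset, solved by bisection.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    # Assumes 0 < p < 1; predicted risks of exactly 0 or 1 are not handled.
    return math.log(p / (1.0 - p))

def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as half a concordant pair."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for n in nonevents:
            score += 1.0 if e > n else 0.5 if e == n else 0.0
    return score / (len(events) * len(nonevents))

def calibration_in_the_large(y, p, lo=-20.0, hi=20.0, tol=1e-10):
    """Intercept a that solves sum(y) == sum(expit(a + logit(p))).
    Negative values mean the model overestimates risk on average."""
    lp = [logit(pi) for pi in p]
    target = sum(y)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(expit(mid + l) for l in lp) < target:
            lo = mid  # predicted total too low: raise the intercept
        else:
            hi = mid
    return (lo + hi) / 2.0

def update_intercept(p, a):
    """Shift every prediction by a on the logit scale."""
    return [expit(a + logit(pi)) for pi in p]

# Hypothetical toy data: 3 deaths among 10 patients, mean predicted risk 0.53.
y = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
p = [0.3, 0.5, 0.7, 0.4, 0.9, 0.8, 0.2, 0.6, 0.5, 0.4]

c_stat = c_statistic(y, p)
citl = calibration_in_the_large(y, p)
p_updated = update_intercept(p, citl)
mean_updated = sum(p_updated) / len(p_updated)
```

In this toy example the mean predicted risk exceeds the observed event rate, so the estimated intercept is negative and the update pulls the average prediction down to the observed rate. An intercept update shifts all predictions by the same amount on the logit scale, so it leaves their ranking, and therefore the C-statistic, unchanged.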
Geersing, G.J.; Takada, T.; Klok, F.A.; Büller, H.R.; Courtney, D.M.; Freund, Y.; ... ; Es, N. van 2024
Background: In patients clinically suspected of having pulmonary embolism (PE), physicians often rely on intuitive estimation (“gestalt”) of PE presence. Although shown to be predictive, gestalt is criticized for its assumed variation across physicians and lack of standardization. Objectives: To assess the diagnostic accuracy of gestalt in the diagnosis of PE and gain insight into its possible variation. Methods: We performed an individual patient data meta-analysis including patients suspected of having PE. The primary outcome was diagnostic accuracy of gestalt for the diagnosis of PE, quantified as risk ratio (RR) between gestalt and PE based on 2-stage random-effect log-binomial meta-analysis regression as well as gestalt's sensitivity and specificity. The variability of these measures was explored across different health care settings, publication period, PE prevalence, patient subgroups (sex, heart failure, chronic lung disease, and items of the Wells score other than gestalt), and age. Results: We analyzed 20 770 patients suspected of having PE from 16 original studies. The prevalence of PE in patients with and without a positive gestalt was 28.8% vs 9.1%, respectively. The overall RR was 3.02 (95% CI, 2.35-3.87), and the overall sensitivity and specificity were 74% (95% CI, 68%-79%) and 61% (95% CI, 53%-68%), respectively. Although variation was observed across individual studies (I², 90.63%), the diagnostic accuracy was consistent across all subgroups and health care settings. Conclusion: A positive gestalt was associated with a 3-fold increased risk of PE in suspected patients.
Although variation was observed across studies, the RR of gestalt was similar across prespecified subgroups and health care settings, exemplifying its diagnostic value for all patients suspected of having PE.
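As a rough check on how the reported figures relate, a crude risk ratio can be computed directly from the two reported prevalences (28.8% and 9.1%). This is only a sketch with a hypothetical function name. It is not the study's estimate, which comes from a 2-stage random-effects log-binomial meta-analysis.

```python
def crude_risk_ratio(risk_exposed: float, risk_unexposed: float) -> float:
    """Ratio of two risks, here PE prevalence with vs without a positive gestalt."""
    return risk_exposed / risk_unexposed

# Prevalences as reported in the abstract.
rr_crude = crude_risk_ratio(0.288, 0.091)
print(f"crude RR = {rr_crude:.2f}")  # crude RR = 3.16
```

The crude value is close to the pooled RR of 3.02 but not identical to it, presumably because the pooled estimate combines study-specific ratios rather than collapsing all patients into one table.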
Aims: Risk stratification is used for decisions regarding need for imaging in patients with clinically suspected acute pulmonary embolism (PE). The aim was to develop a clinical prediction model that provides an individualized, accurate probability estimate for the presence of acute PE in patients with suspected disease based on readily available clinical items and D-dimer concentrations. Methods and results: An individual patient data meta-analysis was performed based on sixteen cross-sectional or prospective studies with data from 28 305 adult patients with clinically suspected PE from various clinical settings, including primary care, emergency care, hospitalized and nursing home patients. A multilevel logistic regression model was built and validated including ten a priori defined objective candidate predictors to predict objectively confirmed PE at baseline or venous thromboembolism (VTE) during follow-up of 30 to 90 days. Multiple imputation was used for missing data. Backward elimination was performed with a P-value <0.10. Discrimination (c-statistic with 95% confidence intervals [CI] and prediction intervals [PI]) and calibration (outcome:expected [O:E] ratio and calibration plot) were evaluated based on internal-external cross-validation. The accuracy of the model was subsequently compared with algorithms based on the Wells score and D-dimer testing. The final model included age (in years), sex, previous VTE, recent surgery or immobilization, haemoptysis, cancer, clinical signs of deep vein thrombosis, inpatient status, D-dimer (in µg/L), and an interaction term between age and D-dimer.
The pooled c-statistic was 0.87 (95% CI, 0.85–0.89; 95% PI, 0.77–0.93) and overall calibration was very good (pooled O:E ratio, 0.99; 95% CI, 0.87–1.14; 95% PI, 0.55–1.79). The model slightly overestimated VTE probability in the lower range of estimated probabilities. Discrimination of the current model in the validation data sets was better than that of the Wells score combined with a D-dimer threshold based on age (c-statistic 0.73; 95% CI, 0.70–0.75) or structured clinical pretest probability (c-statistic 0.79; 95% CI, 0.76–0.81). Conclusion: The present model provides an absolute, individualized probability of PE presence in a broad population of patients with suspected PE, with very good discrimination and calibration. Its clinical utility needs to be evaluated in a prospective management or impact study.
While the opportunities of ML and AI in healthcare are promising, the growth of complex data-driven prediction models requires careful quality and applicability assessment before they are applied and disseminated in daily practice. This scoping review aimed to identify actionable guidance for those closely involved in AI-based prediction model (AIPM) development, evaluation and implementation including software engineers, data scientists, and healthcare professionals and to identify potential gaps in this guidance. We performed a scoping review of the relevant literature providing guidance or quality criteria regarding the development, evaluation, and implementation of AIPMs using a comprehensive multi-stage screening strategy. PubMed, Web of Science, and the ACM Digital Library were searched, and AI experts were consulted. Topics were extracted from the identified literature and summarized across the six phases at the core of this review: (1) data preparation, (2) AIPM development, (3) AIPM validation, (4) software development, (5) AIPM impact assessment, and (6) AIPM implementation into daily healthcare practice. From 2683 unique hits, 72 relevant guidance documents were identified. Substantial guidance was found for data preparation, AIPM development and AIPM validation (phases 1-3), while later phases clearly have received less attention (software development, impact assessment and implementation) in the scientific literature. The six phases of the AIPM development, evaluation and implementation cycle provide a framework for responsible introduction of AI-based prediction models in healthcare. Additional domain and technology specific research may be necessary and more practical experience with implementing AIPMs is needed to support further guidance.
Background: How diagnostic strategies for suspected pulmonary embolism (PE) perform in relevant patient subgroups defined by sex, age, cancer, and previous venous thromboembolism (VTE) is unknown. Purpose: To evaluate the safety and efficiency of the Wells and revised Geneva scores combined with fixed and adapted D-dimer thresholds, as well as the YEARS algorithm, for ruling out acute PE in these subgroups. Data Sources: MEDLINE from 1 January 1995 until 1 January 2021. Study Selection: 16 studies assessing at least 1 diagnostic strategy. Data Extraction: Individual-patient data from 20553 patients. Data Synthesis: Safety was defined as the diagnostic failure rate (the predicted 3-month VTE incidence after exclusion of PE without imaging at baseline). Efficiency was defined as the proportion of individuals classified by the strategy as "PE considered excluded" without imaging tests. Across all strategies, efficiency was highest in patients younger than 40 years (47% to 68%) and lowest in patients aged 80 years or older (6.0% to 23%) or patients with cancer (9.6% to 26%). However, efficiency improved considerably in these subgroups when pretest probability-dependent D-dimer thresholds were applied. Predicted failure rates were highest for strategies with adapted D-dimer thresholds, with failure rates varying between 2% and 4% in the predefined patient subgroups. Limitations: Between-study differences in scoring predictor items and D-dimer assays, as well as the presence of differential verification bias, in particular for classifying fatal events and subsegmental PE cases, all of which may have led to an overestimation of the predicted failure rates of adapted D-dimer thresholds.
Conclusion: Overall, all strategies showed acceptable safety, with pretest probability-dependent D-dimer thresholds having not only the highest efficiency but also the highest predicted failure rate. From an efficiency perspective, this individual-patient data meta-analysis supports application of adapted D-dimer thresholds. Primary Funding Source: Dutch Research Council. (PROSPERO: CRD42018089366)
Slieker, R.C.; Heijden, A.A.W.A. van der; Siddiqui, M.K.; Langendoen-Gort, M.; Nijpels, G.; Herings, R.; ... ; Beulens, J.W.J. 2021
OBJECTIVES To identify and assess the quality and accuracy of prognostic models for nephropathy and to validate these models in external cohorts of people with type 2 diabetes. DESIGN Systematic review and external validation. DATA SOURCES PubMed and Embase. ELIGIBILITY CRITERIA Studies describing the development of a model to predict the risk of nephropathy, applicable to people with type 2 diabetes. METHODS Screening, data extraction, and risk of bias assessment were done in duplicate. Eligible models were externally validated in the Hoorn Diabetes Care System (DCS) cohort (n=11 450) for the same outcomes for which they were developed. Risks of nephropathy were calculated and compared with observed risk over 2, 5, and 10 years of follow-up. Model performance was assessed based on intercept adjusted calibration and discrimination (Harrell's C statistic). RESULTS 41 studies included in the systematic review reported 64 models, 46 of which were developed in a population with diabetes and 18 in the general population including diabetes as a predictor. The predicted outcomes included albuminuria, diabetic kidney disease, chronic kidney disease (general population), and end stage renal disease. The reported apparent discrimination of the 46 models varied considerably across the different predicted outcomes, from 0.60 (95% confidence interval 0.56 to 0.64) to 0.99 (not available) for the models developed in a diabetes population and from 0.59 (not available) to 0.96 (0.95 to 0.97) for the models developed in the general population. Calibration was reported in 31 of the 41 studies, and the models were generally well calibrated.
21 of the 64 retrieved models were externally validated in the Hoorn DCS cohort for predicting risk of albuminuria, diabetic kidney disease, and chronic kidney disease, with considerable variation in performance across prediction horizons and models. For all three outcomes, however, at least two models had C statistics >0.8, indicating excellent discrimination. In a secondary external validation in GoDARTS (Genetics of Diabetes Audit and Research in Tayside Scotland), models developed for diabetic kidney disease outperformed those for chronic kidney disease. Models were generally well calibrated across all three prediction horizons. CONCLUSIONS This study identified multiple prediction models to predict albuminuria, diabetic kidney disease, chronic kidney disease, and end stage renal disease. In the external validation, discrimination and calibration for albuminuria, diabetic kidney disease, and chronic kidney disease varied considerably across prediction horizons and models. For each outcome, however, specific models showed good discrimination and calibration across the three prediction horizons, with clinically accessible predictors, making them applicable in a clinical setting. SYSTEMATIC REVIEW REGISTRATION PROSPERO CRD42020192831.
Goede, J. de; Mark-Reeuwijk, K.G. van der; Braun, K.P.; Cessie, S. le; Durston, S.; Engels, R.C.M.E.; ... ; Oosterlaan, J. 2021
Young people, whose brains are still developing, might be more vulnerable to the effects of alcohol consumption on brain function and development. A committee of experts of the Health Council of the Netherlands evaluated the state of scientific knowledge regarding the question whether alcohol negatively influences brain development in young people. A systematic literature search for prospective studies was performed in PubMed and PsycINFO, for longitudinal studies of adolescents or young adults ranging between 12 and 24 years of age at baseline, investigating the relation between alcohol use and outcome measures of brain structure and activity, cognitive functioning, educational achievement, or alcohol use disorder (AUD), with measures at baseline and follow-up of the outcome of interest. Data were extracted from original articles and study quality was assessed using the Newcastle-Ottawa Scale. A total of 77 studies were included, 31 of which were of sufficient quality in relation to the study objectives. There were indications that the gray matter of the brain develops abnormally in young people who drink alcohol. In addition, the more often young people drink or the younger they start, the higher the risk of developing AUD later in life. The evidence on white matter volume or quality, brain activity, cognitive function, and educational achievement is still limited or unclear. The committee found indications that alcohol consumption can have a negative effect on brain development in adolescents and young adults and entails a risk of later AUD. The committee therefore considers it a wise choice for adolescents and young adults not to drink alcohol.
Aims The aim of this study was to develop, validate, and illustrate an updated prediction model (SCORE2) to estimate 10-year fatal and non-fatal cardiovascular disease (CVD) risk in individuals without previous CVD or diabetes aged 40-69 years in Europe. Methods and results We derived risk prediction models using individual-participant data from 45 cohorts in 13 countries (677 684 individuals, 30 121 CVD events). We used sex-specific and competing risk-adjusted models, including age, smoking status, systolic blood pressure, and total- and HDL-cholesterol. We defined four risk regions in Europe according to country-specific CVD mortality, recalibrating models to each region using expected incidences and risk factor distributions. Region-specific incidence was estimated using CVD mortality and incidence data on 10 776 466 individuals. For external validation, we analysed data from 25 additional cohorts in 15 European countries (1 133 181 individuals, 43 492 CVD events). After applying the derived risk prediction models to external validation cohorts, C-indices ranged from 0.67 (0.65-0.68) to 0.81 (0.76-0.86). Predicted CVD risk varied several-fold across European regions. For example, the estimated 10-year CVD risk for a 50-year-old smoker, with a systolic blood pressure of 140 mmHg, total cholesterol of 5.5 mmol/L, and HDL-cholesterol of 1.3 mmol/L, ranged from 5.9% for men in low-risk countries to 14.0% for men in very high-risk countries, and from 4.2% for women in low-risk countries to 13.7% for women in very high-risk countries. Conclusion SCORE2, a new algorithm derived, calibrated, and validated to predict 10-year risk of first-onset CVD in European populations, enhances the identification of individuals at higher risk of developing CVD across Europe.
Beulens, J.W.J.; Yauw, J.S.; Elders, P.J.M.; Feenstra, T.; Herings, R.; Slieker, R.C.; ... ; Heijden, A.A. van der 2021
Aims/hypothesis Approximately 25% of people with type 2 diabetes experience a foot ulcer and their risk of amputation is 10-20 times higher than that of people without type 2 diabetes. Prognostic models can aid in targeted monitoring but an overview of their performance is lacking. This study aimed to systematically review prognostic models for the risk of foot ulcer or amputation and quantify their predictive performance in an independent cohort. Methods A systematic review identified studies developing prognostic models for foot ulcer or amputation over a minimum of 1 year of follow-up applicable to people with type 2 diabetes. After data extraction and risk of bias assessment (both in duplicate), selected models were externally validated in a prospective cohort with a 5 year follow-up in terms of discrimination (C statistics) and calibration (calibration plots). Results We identified 21 studies with 34 models predicting polyneuropathy, foot ulcer or amputation. Eleven models were validated in 7624 participants, of whom 485 developed an ulcer and 70 underwent amputation. The models for foot ulcer showed C statistics (95% CI) ranging from 0.54 (0.54, 0.54) to 0.81 (0.75, 0.86) and models for amputation showed C statistics (95% CI) ranging from 0.63 (0.55, 0.71) to 0.86 (0.78, 0.94). Most models underestimated the ulcer or amputation risk in the highest risk quintiles. Three models performed well to predict a combined endpoint of amputation and foot ulcer (C statistics >0.75). Conclusions/interpretation Thirty-four prognostic models for the risk of foot ulcer or amputation were identified. Although the performance of the models varied considerably, three models performed well to predict foot ulcer or amputation and may be applicable to clinical practice.
Wynants, L.; Calster, B. van; Bonten, M.M.J.; Collins, G.S.; Debray, T.P.A.; Vos, M. de; ... ; Smeden, M. van 2020
OBJECTIVE To review and critically appraise published and preprint reports of prediction models for diagnosing coronavirus disease 2019 (covid-19) in patients with suspected infection, for prognosis of patients with covid-19, and for detecting people in the general population at risk of being admitted to hospital for covid-19 pneumonia. DESIGN Rapid systematic review and critical appraisal. DATA SOURCES PubMed and Embase through Ovid, Arxiv, medRxiv, and bioRxiv up to 24 March 2020. STUDY SELECTION Studies that developed or validated a multivariable covid-19 related prediction model. DATA EXTRACTION At least two authors independently extracted data using the CHARMS (critical appraisal and data extraction for systematic reviews of prediction modelling studies) checklist; risk of bias was assessed using PROBAST (prediction model risk of bias assessment tool). RESULTS 2696 titles were screened, and 27 studies describing 31 prediction models were included. Three models were identified for predicting hospital admission from pneumonia and other events (as proxy outcomes for covid-19 pneumonia) in the general population; 18 diagnostic models for detecting covid-19 infection (13 were machine learning based on computed tomography scans); and 10 prognostic models for predicting mortality risk, progression to severe disease, or length of hospital stay. Only one study used patient data from outside of China. The most reported predictors of presence of covid-19 in patients with suspected disease included age, body temperature, and signs and symptoms. The most reported predictors of severe prognosis in patients with covid-19 included age, sex, features derived from computed tomography scans, C reactive protein, lactic dehydrogenase, and lymphocyte count.
C index estimates ranged from 0.73 to 0.81 in prediction models for the general population (reported for all three models), from 0.81 to more than 0.99 in diagnostic models (reported for 13 of the 18 models), and from 0.85 to 0.98 in prognostic models (reported for six of the 10 models). All studies were rated at high risk of bias, mostly because of non-representative selection of control patients, exclusion of patients who had not experienced the event of interest by the end of the study, and high risk of model overfitting. Reporting quality varied substantially between studies. Most reports did not include a description of the study population or intended use of the models, and calibration of predictions was rarely assessed. CONCLUSION Prediction models for covid-19 are quickly entering the academic literature to support medical decision making at a time when they are urgently needed. This review indicates that proposed models are poorly reported, at high risk of bias, and their reported performance is probably optimistic. Immediate sharing of well documented individual participant data from covid-19 studies is needed for collaborative efforts to develop more rigorous prediction models and validate existing ones. The predictors identified in included studies could be considered as candidate predictors for new models. Methodological guidance should be followed because unreliable predictions could cause more harm than benefit in guiding clinical decisions. Finally, studies should adhere to the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting guideline.
Aims/hypothesis The aims of this study were to identify all published prognostic models predicting retinopathy risk applicable to people with type 2 diabetes, to assess their quality and accuracy, and to validate their predictive accuracy in a head-to-head comparison using an independent type 2 diabetes cohort. Methods A systematic search was performed in PubMed and Embase in December 2019. Studies that met the following criteria were included: (1) the model was applicable in type 2 diabetes; (2) the outcome was retinopathy; and (3) follow-up was more than 1 year. Screening, data extraction (using the checklist for critical appraisal and data extraction for systematic reviews of prediction modelling studies [CHARMS]) and risk of bias assessment (by prediction model risk of bias assessment tool [PROBAST]) were performed independently by two reviewers. Selected models were externally validated in the large Hoorn Diabetes Care System (DCS) cohort in the Netherlands. Retinopathy risk was calculated using baseline data and compared with retinopathy incidence over 5 years. Calibration after intercept adjustment and discrimination (Harrell's C statistic) were assessed. Results Twelve studies were included in the systematic review, reporting on 16 models. Outcomes ranged from referable retinopathy to blindness. Discrimination was reported in seven studies with C statistics ranging from 0.55 (95% CI 0.54, 0.56) to 0.84 (95% CI 0.78, 0.88). Five studies reported on calibration. Eight models could be compared head-to-head in the DCS cohort (N = 10,715). Most of the models underestimated retinopathy risk. Validating the models against different severities of retinopathy, C statistics ranged from 0.51 (95% CI 0.49, 0.53) to 0.89 (95% CI 0.88, 0.91).
Conclusions/interpretation Several prognostic models can accurately predict retinopathy risk in a population-based type 2 diabetes cohort. Most of the models include easy-to-measure predictors enhancing their applicability. Tailoring retinopathy screening frequency based on accurate risk predictions may increase the efficiency and cost-effectiveness of diabetic retinopathy care. Registration PROSPERO registration ID CRD42018089122
Background: Preeclampsia is a female-specific risk factor for the development of future cardiovascular disease. Whether early preventive cardiovascular disease risk screenings combined with risk-based lifestyle interventions in women with previous preeclampsia are beneficial and cost-effective is unknown. Methods: A micro-simulation model was developed to assess the life-long impact of preventive cardiovascular screening strategies initiated after women experienced preeclampsia during pregnancy. Screening was started at the age of 30 or 40 years and repeated every five years. Data (initial and follow-up) from women with a history of preeclampsia were used to calculate 10-year cardiovascular disease risk estimates according to Framingham Risk Score. An absolute risk threshold of 2% was evaluated for treatment selection, i.e. lifestyle interventions (e.g. increasing physical activity). Screening benefits were assessed in terms of costs and quality-adjusted-life-years, and incremental cost-effectiveness ratios compared with no screening. Results: Expected health outcomes for no screening are 27.35 quality-adjusted-life-years and increase to 27.43 quality-adjusted-life-years (screening at 30 years with 2% threshold). The expected costs for no screening are €9426 and around €13,881 for screening at 30 years (for a 2% threshold). Preventive screening at 40 years with a 2% threshold has the most favourable incremental cost-effectiveness ratio, i.e. €34,996/quality-adjusted-life-year, compared with other screening scenarios and no screening. Conclusions: Early cardiovascular disease risk screening followed by risk-based lifestyle interventions may lead to small long-term health benefits in women with a history of preeclampsia.
However, the cost-effectiveness of a lifelong cardiovascular prevention programme starting early after preeclampsia with risk-based lifestyle advice alone is relatively unfavourable. A combination of risk-based lifestyle advice plus medical therapy may be more beneficial.
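The incremental cost-effectiveness ratio (ICER) mentioned in this abstract is the extra cost divided by the extra quality-adjusted life-years relative to a comparator. The sketch below applies that standard definition to the rounded figures reported for screening at 30 years versus no screening. The function name is hypothetical, and because the inputs are rounded (the cost is given as "around" €13,881 and the QALY gain is only 0.08), the result is a rough illustration rather than a figure reported by the authors.

```python
def icer(cost_new: float, cost_ref: float, qaly_new: float, qaly_ref: float) -> float:
    """Incremental cost per additional quality-adjusted life-year."""
    return (cost_new - cost_ref) / (qaly_new - qaly_ref)

# Rounded values from the abstract: screening at 30 years vs no screening.
icer_screen_30 = icer(cost_new=13881, cost_ref=9426, qaly_new=27.43, qaly_ref=27.35)
print(f"ICER (screening at 30 vs none): about {icer_screen_30:,.0f} euro per QALY")
```

With these inputs the ratio is 4455 / 0.08, roughly €55,700 per quality-adjusted life-year, which is less favourable than the €34,996 reported for screening at 40 years. The abstract does not give the costs and outcomes for the 40-year scenario, so that figure cannot be recomputed here.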
Najafabadi, A.H.Z.; Ramspek, C.L.; Dekker, F.W.; Heus, P.; Hooft, L.; Moons, K.G.M.; ... ; Diepen, M. van 2020
Objectives To assess the difference in completeness of reporting and methodological conduct of published prediction models before and after publication of the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement. Methods In the seven general medicine journals with the highest impact factor, we compared the completeness of the reporting and the quality of the methodology of prediction model studies published between 2012 and 2014 (pre-TRIPOD) with studies published between 2016 and 2017 (post-TRIPOD). For articles published in the post-TRIPOD period, we examined whether there was improved reporting for articles (1) citing the TRIPOD statement, and (2) published in journals that published the TRIPOD statement. Results A total of 70 articles was included (pre-TRIPOD: 32, post-TRIPOD: 38). No improvement was seen for the overall percentage of reported items after the publication of the TRIPOD statement (pre-TRIPOD 74%, post-TRIPOD 76%, 95% CI of absolute difference: -4% to 7%). For the individual TRIPOD items, an improvement was seen for 16 (44%) items, while 3 (8%) items showed no improvement and 17 (47%) items showed a deterioration. Post-TRIPOD, there was no improved reporting for articles citing the TRIPOD statement, nor for articles published in journals that published the TRIPOD statement. The methodological quality improved in the post-TRIPOD period. More models were externally validated in the same article (absolute difference 8%, post-TRIPOD: 39%), used measures of calibration (21%, post-TRIPOD: 87%) and discrimination (9%, post-TRIPOD: 100%), and used multiple imputation for handling missing data (12%, post-TRIPOD: 50%).
Conclusions Since the publication of the TRIPOD statement, some reporting and methodological aspects have improved. Prediction models are still often poorly developed and validated and many aspects remain poorly reported, hindering optimal clinical application of these models. Long-term effects of the TRIPOD statement publication should be evaluated in future studies.
Background To help adapt cardiovascular disease risk prediction approaches to low-income and middle-income countries, WHO has convened an effort to develop, evaluate, and illustrate revised risk models. Here, we report the derivation, validation, and illustration of the revised WHO cardiovascular disease risk prediction charts that have been adapted to the circumstances of 21 global regions. Methods In this model revision initiative, we derived 10-year risk prediction models for fatal and non-fatal cardiovascular disease (ie, myocardial infarction and stroke) using individual participant data from the Emerging Risk Factors Collaboration. Models included information on age, smoking status, systolic blood pressure, history of diabetes, and total cholesterol. For derivation, we included participants aged 40-80 years without a known baseline history of cardiovascular disease, who were followed up until the first myocardial infarction, fatal coronary heart disease, or stroke event. We recalibrated models using age-specific and sex-specific incidences and risk factor values available from 21 global regions. For external validation, we analysed individual participant data from studies distinct from those used in model derivation. We illustrated models by analysing data on a further 123 743 individuals from surveys in 79 countries collected with the WHO STEPwise Approach to Surveillance. Findings Our risk model derivation involved 376 177 individuals from 85 cohorts, and 19 333 incident cardiovascular events recorded during 10 years of follow-up. The derived risk prediction models discriminated well in external validation cohorts (19 cohorts, 1 096 061 individuals, 25 950 cardiovascular disease events), with Harrell's C indices ranging from 0.685 (95% CI 0.629-0.741) to 0.833 (0.783-0.882). For a given risk factor profile, we found substantial variation across global regions in the estimated 10-year predicted risk. For example, estimated cardiovascular disease risk for a 60-year-old male smoker without diabetes and with systolic blood pressure of 140 mm Hg and total cholesterol of 5 mmol/L ranged from 11% in Andean Latin America to 30% in central Asia. When applied to data from 79 countries (mostly low-income and middle-income countries), the proportion of individuals aged 40-64 years estimated to be at greater than 20% risk ranged from less than 1% in Uganda to more than 16% in Egypt. Interpretation We have derived, calibrated, and validated new WHO risk prediction models to estimate cardiovascular disease risk in 21 Global Burden of Disease regions. The widespread use of these models could enhance the accuracy, practicability, and sustainability of efforts to reduce the burden of cardiovascular disease worldwide. Copyright (C) 2019 The Author(s). Published by Elsevier Ltd.
Introduction Combined with patient history and physical examination, a negative D-dimer can safely rule out pulmonary embolism (PE). However, the D-dimer test is frequently false positive, leading to many (with hindsight) 'unneeded' referrals to secondary care. Recently, the novel YEARS algorithm, incorporating flexible D-dimer thresholds depending on pretest risk, was developed and validated, showing its ability to safely exclude PE in the hospital environment. Importantly, this was accompanied by 14% fewer computed tomographic pulmonary angiography examinations than with the standard, fixed D-dimer threshold. Although promising, this algorithm has not yet been validated in primary care. Methods and analysis The PECAN (Diagnosing Pulmonary Embolism in the context of Common Alternative diagNoses in primary care) study is a prospective diagnostic study performed in Dutch primary care. Included patients with suspected acute PE will be managed by their general practitioner according to the YEARS diagnostic algorithm and followed up in primary care for 3 months to establish the final diagnosis. To study the impact of the use of the YEARS algorithm, the primary endpoints are the safety and efficiency of the YEARS algorithm in primary care. Safety is defined as the proportion of false-negative test results in those not referred. Efficiency denotes the proportion of patients classified in this non-referred category. Additionally, we quantify whether C reactive protein measurement has added diagnostic value to the YEARS algorithm, using multivariable logistic and polytomous regression modelling. Furthermore, we will investigate which factors contribute to the subjective YEARS item 'PE most likely diagnosis'.
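The two primary endpoints defined in this protocol are simple proportions, and a minimal sketch may help make the denominators explicit. The function name, argument names, and the counts in the usage note are hypothetical illustrations and are not taken from the PECAN study.

```python
def safety_and_efficiency(n_total, n_not_referred, n_missed_pe):
    """Return (false_negative_proportion, efficiency) for a rule-out strategy.

    n_total        -- all patients with suspected PE who were assessed
    n_not_referred -- patients in whom PE was considered excluded (not referred)
    n_missed_pe    -- non-referred patients later found to have PE (false negatives)

    All names are illustrative assumptions, not identifiers from the study.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 < n_not_referred <= n_total:
        raise ValueError("n_not_referred must be in (0, n_total]")
    if not 0 <= n_missed_pe <= n_not_referred:
        raise ValueError("n_missed_pe must be in [0, n_not_referred]")

    # Safety: false negatives among those NOT referred (lower is safer).
    false_negative_proportion = n_missed_pe / n_not_referred
    # Efficiency: share of all assessed patients who end up not referred.
    efficiency = n_not_referred / n_total
    return false_negative_proportion, efficiency
```

With made-up counts, `safety_and_efficiency(1000, 400, 4)` would give a false-negative proportion of 4/400 and an efficiency of 400/1000. Note that safety is conditioned on the non-referred group only, whereas efficiency uses the whole cohort as its denominator.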
Jenniskens, K.; Naaktgeboren, C.A.; Reitsma, J.B.; Hooft, L.; Moons, K.G.M.; Smeden, M. van 2019
Objectives: The objective of this study was to assess the impact of ignoring uncertainty by forcing dichotomous classification (presence or absence) of the target disease on estimates of diagnostic accuracy of an index test. Study Design and Setting: We evaluated the bias in estimated index test accuracy when forcing an expert panel to make a dichotomous target disease classification for each individual. Data for various scenarios with expert panels were simulated by varying the number and accuracy of "component reference tests" available to the expert panel, index test sensitivity and specificity, and target disease prevalence. Results: Index test accuracy estimates are likely to be biased when there is uncertainty surrounding the presence or absence of the target disease. Direction and amount of bias depend on the number and accuracy of component reference tests, target disease prevalence, and the true values of index test sensitivity and specificity. Conclusion: In this simulation, forcing expert panels to make a dichotomous decision on target disease classification in the presence of uncertainty leads to biased estimates of index test accuracy. Empirical studies are needed to demonstrate whether this bias can be reduced by assigning a probability of target disease presence for each individual, or using advanced statistical methods to account for uncertainty in target disease classification. (C) 2019 Elsevier Inc. All rights reserved.
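The mechanism described in this abstract can be illustrated with a small closed-form calculation. This is not the authors' simulation: it is a simplified sketch that collapses the expert panel into a single imperfect dichotomous reference with assumed sensitivity and specificity, and it assumes that index test errors and reference errors are conditionally independent given true disease status. All function and parameter names are hypothetical.

```python
def apparent_accuracy(prevalence, index_se, index_sp, ref_se, ref_sp):
    """Expected index test accuracy when judged against an imperfect reference.

    Assumption (not from the source): given true disease status, the index
    test and the dichotomised reference err independently of each other.
    Returns (apparent_sensitivity, apparent_specificity).
    """
    p = prevalence
    # Probability that the reference classifies an individual as diseased.
    ref_pos = p * ref_se + (1 - p) * (1 - ref_sp)
    ref_neg = 1 - ref_pos
    if ref_pos == 0 or ref_neg == 0:
        raise ValueError("reference classifies everyone into one category")

    # Index positive AND reference positive: true cases plus shared false alarms.
    both_pos = p * ref_se * index_se + (1 - p) * (1 - ref_sp) * (1 - index_sp)
    # Index negative AND reference negative: true non-cases plus shared misses.
    both_neg = (1 - p) * ref_sp * index_sp + p * (1 - ref_se) * (1 - index_se)

    return both_pos / ref_pos, both_neg / ref_neg


if __name__ == "__main__":
    # Hypothetical scenario: an accurate index test judged by a weaker reference.
    se, sp = apparent_accuracy(0.2, index_se=0.9, index_sp=0.9,
                               ref_se=0.8, ref_sp=0.9)
    print(f"apparent sensitivity={se:.3f}, apparent specificity={sp:.3f}")
```

Under these assumptions, a perfect reference (`ref_se = ref_sp = 1`) returns the true index test values, while any misclassification in the reference pulls the apparent estimates away from them, by an amount that depends on prevalence.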
Jong, V.M.T. de; Eijkemans, M.J.C.; Calster, B. van; Timmerman, D.; Moons, K.G.M.; Steyerberg, E.W.; Smeden, M. van 2019