Purpose: The effects of clockwise torque rotation on proximal femoral fracture fixation have been the subject of ongoing debate: fixated right-sided trochanteric fractures seem more rotationally stable than left-sided fractures in the biomechanical setting, but this theoretical advantage has not been demonstrated in the clinical setting to date. The purpose of this study was to identify a difference in early reoperation rate between patients undergoing surgery for left- versus right-sided proximal femur fractures using cephalomedullary nailing (CMN). Materials and methods: The American College of Surgeons National Surgical Quality Improvement Program was queried from 2016-2019 to identify patients aged 50 years and older undergoing CMN for a proximal femoral fracture. The primary outcome was any unplanned reoperation within 30 days following surgery. The difference was calculated using a Chi-square test, and observed power was calculated using post-hoc power analysis. Results: Of 20,122 patients undergoing CMN for proximal femoral fracture management, 1.8% (n=371) underwent an unplanned reoperation within 30 days after surgery. Reoperation was required in 208 (2.0%) left-sided and 163 (1.7%) right-sided fractures (p=0.052; risk ratio [RR] 1.22, 95% confidence interval [CI] 1.00-1.50; odds ratio [OR] 1.23, 95% CI 1.00-1.51; power 49.2% at α=0.05). Conclusion: This study shows a higher risk of reoperation for left-sided compared to right-sided proximal femur fractures after CMN in a large sample. Although the results may be underpowered and are not statistically significant, this finding might substantiate the hypothesis that clockwise rotation during implant insertion and (post-operative) weightbearing may lead to higher reoperation rates.
Level of evidence: Therapeutic level II.
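The risk ratio and odds ratio reported above can be illustrated with a minimal Python sketch. The abstract does not report the number of left- and right-sided fractures, so the counts below are hypothetical round numbers chosen only to show the arithmetic. The function name and the choice of log-scale (Katz and Woolf) confidence intervals are assumptions for illustration, not necessarily the method the authors used.

```python
import math


def risk_and_odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Return (RR, RR CI, OR, OR CI) for group A versus group B.

    Confidence intervals are computed on the log scale (Katz method for
    the risk ratio, Woolf method for the odds ratio).
    """
    non_a = total_a - events_a
    non_b = total_b - events_b

    # Risk ratio: ratio of the event proportions in the two groups.
    rr = (events_a / total_a) / (events_b / total_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    rr_ci = (
        math.exp(math.log(rr) - z * se_log_rr),
        math.exp(math.log(rr) + z * se_log_rr),
    )

    # Odds ratio: cross-product of the 2x2 table.
    odds_ratio = (events_a * non_b) / (events_b * non_a)
    se_log_or = math.sqrt(1 / events_a + 1 / non_a + 1 / events_b + 1 / non_b)
    or_ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )
    return rr, rr_ci, odds_ratio, or_ci


# Hypothetical counts (NOT from the study): 20 events in 1000 patients
# in group A versus 10 events in 1000 patients in group B.
rr, rr_ci, odds_ratio, or_ci = risk_and_odds_ratio(20, 1000, 10, 1000)
print(f"RR {rr:.2f} (95% CI {rr_ci[0]:.2f}-{rr_ci[1]:.2f})")
print(f"OR {odds_ratio:.2f} (95% CI {or_ci[0]:.2f}-{or_ci[1]:.2f})")
```

When the event is rare, as it is here, the odds ratio stays close to the risk ratio, which fits the near-identical RR and OR reported in the abstract.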
Background: It is well documented that routinely collected patient sociodemographic characteristics (such as race and insurance type) and geography-based social determinants of health (SDoH) measures (for example, the Area Deprivation Index) are associated with health disparities, including symptom severity at presentation. However, the association of patient-level SDoH factors (such as housing status) with musculoskeletal health disparities is not as well documented. Such insight might help with the development of more-targeted interventions to help address health disparities in orthopaedic surgery. Questions/purposes: (1) What percentage of patients presenting for new patient visits in an orthopaedic surgery clinic who were unemployed but seeking work reported transportation issues that could limit their ability to attend a medical appointment or acquire medications, reported trouble paying for medications, and/or had no current housing? (2) Accounting for traditional sociodemographic factors and patient-level SDoH measures, what factors are associated with poorer patient-reported outcome physical health scores at presentation? (3) Accounting for traditional sociodemographic factors and patient-level SDoH measures, what factors are associated with poorer patient-reported outcome mental health scores at presentation? Methods: New patient encounters at one Level 1 trauma center clinic from March 2018 to December 2020 were identified. Included patients had to meet two criteria: they had completed the Patient-Reported Outcomes Measurement Information System (PROMIS) Global-10 at their new orthopaedic surgery clinic encounter as part of routine clinical care, and they had visited their primary care physician and completed a series of specific SDoH questions.
The SDoH questionnaire was developed in our institution to improve data that drive interventions to address health disparities as part of our accountable care organization work. Over the study period, the SDoH questionnaire was only distributed at primary care provider visits. The SDoH questions focused on transportation, housing, employment, and ability to pay for medications. Because we do not have a way to determine how many patients had both primary care provider office visits and new orthopaedic surgery clinic visits over the study period, we were unable to determine how many patients could have been included; however, 9057 patients were evaluated in this cross-sectional study. The mean age was 61 +/- 15 years, and most patients self-reported being of White race (83% [7561 of 9057]). Approximately half the patient sample had commercial insurance (46% [4167 of 9057]). To get a better sense of how this study cohort compared with the overall patient population seen at the participating center during the time in question, we reviewed all new patient clinic encounters (n = 135,223). The demographic information between the full patient sample and our study subgroup appeared similar. Using our study cohort, two multivariable linear regression models were created to determine which traditional metrics (for example, self-reported race or insurance type) and patient-specific SDoH factors (for example, lack of reliable transportation) were associated with worse physical and mental health symptoms (that is, lower PROMIS scores) at new patient encounters. The variance inflation factor was used to assess for multicollinearity. For all analyses, p values < 0.05 designated statistical significance. 
The concept of minimum clinically important difference (MCID) was used to assess clinical importance. Regression coefficients represent the projected change in PROMIS physical or mental health symptom scores (that is, the dependent variable in our regression analyses) accounting for the other included variables. Thus, a regression coefficient for a given variable at or above a known MCID value suggests a clinical difference between those patients with and without the presence of that given characteristic. In this manuscript, regression coefficients at or above 4.2 (or at or below -4.2) for PROMIS Global Physical Health and at or above 5.1 (or at or below -5.1) for PROMIS Global Mental Health were considered clinically relevant. Results: Among the included patients, 8% (685 of 9057) were unemployed but seeking work, 4% (399 of 9057) reported transportation issues that could limit their ability to attend a medical appointment or acquire medications, 4% (328 of 9057) reported trouble paying for medications, and 2% (181 of 9057) had no current housing. Lack of reliable transportation to attend doctor visits or pick up medications (beta = -4.52 [95% CI -5.45 to -3.59]; p < 0.001), trouble paying for medications (beta = -4.55 [95% CI -5.55 to -3.54]; p < 0.001), Medicaid insurance (beta = -5.81 [95% CI -6.41 to -5.20]; p < 0.001), and workers compensation insurance (beta = -5.99 [95% CI -7.65 to -4.34]; p < 0.001) were associated with clinically worse function at presentation. Trouble paying for medications (beta = -6.01 [95% CI -7.10 to -4.92]; p < 0.001), Medicaid insurance (beta = -5.35 [95% CI -6.00 to -4.69]; p < 0.001), and workers compensation (beta = -6.07 [95% CI -7.86 to -4.28]; p < 0.001) were associated with clinically worse mental health at presentation.
Conclusion: Although transportation issues and financial hardship were found to be associated with worse presenting physical function and mental health, Medicaid and workers compensation insurance remained associated with worse presenting physical function and mental health even after controlling for these more detailed, patient-level SDoH factors. Because of that, interventions to decrease health disparities should focus on not only sociodemographic variables (for example, insurance type) but also tangible patient-specific SDoH characteristics. For example, this may include giving patients who demonstrate such a need taxi vouchers or ride-sharing credits to attend clinic visits, initiating financial assistance programs for necessary medications, and/or identifying and connecting certain patient groups with social support services early on in the care cycle.
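The MCID rule described in the methods (a coefficient counts as clinically relevant when its absolute value reaches 4.2 for physical health or 5.1 for mental health) can be sketched in a few lines of Python. The thresholds and coefficients are taken from the abstract; the dictionary layout and function name are illustrative assumptions.

```python
# MCID thresholds stated in the abstract for the PROMIS Global-10 domains.
MCID = {"physical": 4.2, "mental": 5.1}


def is_clinically_relevant(beta, domain):
    """True when the coefficient magnitude is at or above the domain's MCID."""
    return abs(beta) >= MCID[domain]


# Regression coefficients reported in the Results.
physical_betas = {
    "lack of reliable transportation": -4.52,
    "trouble paying for medications": -4.55,
    "Medicaid insurance": -5.81,
    "workers compensation insurance": -5.99,
}
mental_betas = {
    "trouble paying for medications": -6.01,
    "Medicaid insurance": -5.35,
    "workers compensation insurance": -6.07,
}

for domain, betas in (("physical", physical_betas), ("mental", mental_betas)):
    for factor, beta in betas.items():
        flag = is_clinically_relevant(beta, domain)
        print(f"{domain:8s} {factor:34s} beta={beta:6.2f} relevant={flag}")
```

Applied to the mental health threshold, the transportation coefficient reported for physical health (-4.52) would not qualify, which shows why the two domains need separate cutoffs.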
The societal burden of spinal conditions is vast and continues to grow with the increasing prevalence of patients with spinal degenerative disease, spinal metastases, and spinal infections. Recent applications of artificial intelligence in healthcare have shown great promise, and similar extensions in spine surgery may improve decision-making. The purpose of this thesis was to examine the utility of predictive analytics and natural language processing in spine surgery.
Background: Statistical models using machine learning (ML) have the potential for more accurate estimates of the probability of binary events than logistic regression. The present study used existing data sets from large musculoskeletal trauma trials to address the following study questions: (1) Do ML models produce better probability estimates than logistic regression models? (2) Are ML models influenced by different variables than logistic regression models? Methods: We created ML and logistic regression models that estimated the probability of a specific fracture (posterior malleolar involvement in distal spiral tibial shaft and ankle fractures, scaphoid fracture, and distal radial fracture) or adverse event (subsequent surgery [after distal biceps repair or tibial shaft fracture], surgical site infection, and postoperative delirium) using 9 data sets from published musculoskeletal trauma studies. Each data set was split into training (80%) and test (20%) subsets. Fivefold cross-validation of the training set was used to develop the ML models. The best-performing model was then assessed in the independent testing data. Performance was assessed by (1) discrimination (c-statistic), (2) calibration (slope and intercept), and (3) overall performance (Brier score). Results: The mean c-statistic was 0.01 higher for the logistic regression models compared with the best ML models for each data set (range, -0.01 to 0.06). There were fewer variables strongly associated with variation in the ML models, and many were dissimilar from those in the logistic regression models.
Conclusions: The observation that ML models produce probability estimates comparable with logistic regression models for binary events in musculoskeletal trauma suggests that their benefit may be limited in this context.
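The c-statistic used above to compare discrimination can be computed directly from its definition: the proportion of (event, non-event) pairs in which the event case received the higher predicted probability, with ties counted as one half. The sketch below is a plain-Python illustration on toy data, not the study's own code; the function name and the four example patients are hypothetical.

```python
def c_statistic(y_true, y_prob):
    """Concordance statistic (area under the ROC curve) for a binary outcome."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")

    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # event ranked above non-event
            elif p == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(pos) * len(neg))


# Toy example (hypothetical data): two events, two non-events.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
toy_c = c_statistic(outcomes, predicted)
print(toy_c)  # 3 of the 4 event/non-event pairs are ordered correctly
```

This pairwise form is quadratic in the sample size, so it is suited to illustration rather than to large data sets.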
Bongers, M.E.R.; Karhade, A.V.; Setola, E.; Gambarotti, M.; Groot, O.Q.; Erdogan, K.E.; ... ; Palmerini, E. 2020
Background The Skeletal Oncology Research Group (SORG) machine learning algorithm for predicting survival in patients with chondrosarcoma was developed using data from the Surveillance, Epidemiology, and End Results (SEER) registry. This algorithm was externally validated on a dataset of patients from the United States in an earlier study, where it demonstrated generally good performance but overestimated 5-year survival. In addition, this algorithm has not yet been validated in patients outside the United States; doing so would be important because external validation is necessary as algorithm performance may be misleading when applied in different populations. Questions/purposes Does the SORG algorithm retain validity in patients who underwent surgery for primary chondrosarcoma outside the United States, specifically in Italy? Methods A total of 737 patients were treated for chondrosarcoma between January 2000 and October 2014 at the Italian tertiary care center which was used for international validation. We excluded patients whose first surgical procedure was performed elsewhere (n = 25), patients who underwent nonsurgical treatment (n = 27), patients with a chondrosarcoma of the soft tissue or skull (n = 60), and patients with peripheral, periosteal, or mesenchymal chondrosarcoma (n = 161). Thus, 464 patients were ultimately included in this external validation study, as the earlier performed SEER study was used as the training set. Therefore, this study, unlike most of this type, does not have a training and validation set.
Although the earlier study overestimated 5-year survival, we did not modify the algorithm in this report, as this is the first international validation and the prior performance in the single-institution validation study from the United States may have been driven by a small sample or non-generalizable patterns related to its single-center setting. Variables needed for the SORG algorithm were manually collected from electronic medical records. These included sex, age, histologic subtype, tumor grade, tumor size, tumor extension, and tumor location. By inputting these variables into the algorithm, we calculated the predicted probabilities of survival for each patient. The performance of the SORG algorithm was assessed in this study through discrimination (the ability of a model to distinguish between a binary outcome), calibration (the agreement of observed and predicted outcomes), overall performance (the accuracy of predictions), and decision curve analysis (an assessment of whether decisions made with the model are better than those made without it). For discrimination, the c-statistic (commonly known as the area under the receiver operating characteristic curve for binary classification) was calculated; this ranges from 0.5 (no better than chance) to 1.0 (excellent discrimination). The agreement between predicted and observed outcomes was visualized with a calibration plot, and the calibration slope and intercept were calculated. Perfect calibration results in a slope of 1 and an intercept of 0. For overall performance, the Brier score and the null-model Brier score were calculated. The Brier score ranges from 0 (perfect prediction) to 1 (poorest prediction). Appropriate interpretation of the Brier score requires comparison with the null-model Brier score.
The null-model Brier score is the score for an algorithm that predicts a probability equal to the population prevalence of the outcome for every patient. A decision curve analysis was performed to compare the potential net benefit of the algorithm versus other means of decision support, such as treating all or none of the patients. There were several differences between this study and the earlier SEER study, and such differences are important because they help us to determine the performance of the algorithm in a group different from the initial study population. In this study from Italy, 5-year survival was different from the earlier SEER study (71% [319 of 450 patients] versus 76% [1131 of 1487 patients]; p = 0.03). There were more patients with dedifferentiated chondrosarcoma than in the earlier SEER study (25% [118 of 464 patients] versus 8.5% [131 of 1544 patients]; p < 0.001). In addition, in this study patients were older, tumor size was larger, and there were higher proportions of high-grade tumors than the earlier SEER study (age: 56 years [interquartile range {IQR} 42 to 67] versus 52 years [IQR 40 to 64]; p = 0.007; tumor size: 80 mm [IQR 50 to 120] versus 70 mm [IQR 42 to 105]; p < 0.001; tumor grade: 22% [104 of 464 had Grade 1], 42% [196 of 464 had Grade 2], and 35% [164 of 464 had Grade 3] versus 41% [592 of 1456 had Grade 1], 40% [588 of 1456 had Grade 2], and 19% [276 of 1456 had Grade 3]; p <= 0.001). Results Validation of the SORG algorithm in a primarily Italian population achieved a c-statistic of 0.86 (95% confidence interval 0.82 to 0.89), suggesting good-to-excellent discrimination. The calibration plot showed good agreement between the predicted probability and observed survival for predicted probabilities of 0.8 to 1.0.
With predicted survival probabilities lower than 0.8, however, the SORG algorithm underestimated the observed proportion of patients with 5-year survival, reflected in the overall calibration intercept of 0.82 (95% CI 0.67 to 0.98) and calibration slope of 0.68 (95% CI 0.42 to 0.95). The Brier score for 5-year survival was 0.15, compared with a null-model Brier of 0.21. The algorithm showed a favorable decision curve analysis in the validation cohort. Conclusions The SORG algorithm to predict 5-year survival for patients with chondrosarcoma held good discriminative ability and overall performance on international external validation; however, it underestimated 5-year survival for patients with predicted probabilities from 0 to 0.8 because the calibration plot was not perfectly aligned for the observed outcomes, which resulted in a maximum underestimation of 20%. The differences may reflect the baseline differences noted between the two study populations. The overall performance of the algorithm supports the utility of the algorithm and validation presented here. The digital application for the algorithm is freely available at https://sorg-apps.shinyapps.io/extremitymetssurvival/.
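The Brier score and the null-model Brier score defined in the methods can be made concrete with a short Python sketch. A null model that predicts the prevalence p for every patient has an expected Brier score of p(1 - p). Plugging in the observed 5-year survival of 319 of 450 patients gives roughly 0.21, which appears consistent with the null-model value reported above. The function names and the four-patient check are illustrative assumptions, not the authors' code.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def null_model_brier(prevalence):
    """Brier score of a model that predicts the prevalence for every patient.

    With prevalence p, a fraction p of patients contribute (1 - p)^2 and a
    fraction (1 - p) contribute p^2, which simplifies to p * (1 - p).
    """
    return prevalence * (1 - prevalence)


# Observed 5-year survival in the Italian cohort, as reported in the abstract.
prevalence = 319 / 450
null_brier = null_model_brier(prevalence)
print(f"null-model Brier score: {null_brier:.3f}")

# Sanity check on a tiny hypothetical sample: predicting the prevalence (0.75)
# for every patient reproduces the closed-form value.
toy_outcomes = [1, 1, 1, 0]
toy_brier = brier_score(toy_outcomes, [0.75] * 4)
print(toy_brier, null_model_brier(0.75))
```

A model is only informative to the extent that its Brier score falls below this null value, which is why the abstract reports 0.15 alongside 0.21.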
Groot, O.Q.; Bongers, M.E.R.; Karhade, A.V.; Kapoor, N.D.; Fenn, B.P.; Kim, J.; ... ; Schwab, J.H. 2020
Background The widespread use of electronic patient-generated health data has led to unprecedented opportunities for automated extraction of clinical features from free-text medical notes. However, processing this rich resource of data for clinical and research purposes depends on labor-intensive and potentially error-prone manual review. The aim of this study was to develop a natural language processing (NLP) algorithm for binary classification (single metastasis versus two or more metastases) in bone scintigraphy reports of patients undergoing surgery for bone metastases. Material and methods Bone scintigraphy reports of patients undergoing surgery for bone metastases were each labeled by three independent reviewers using a binary classification (single metastasis versus two or more metastases) to establish a ground truth. A stratified 80:20 split was used to develop and test an extreme-gradient boosting supervised machine learning NLP algorithm. Results A total of 704 free-text bone scintigraphy reports from 704 patients were included in this study and 617 (88%) had multiple bone metastases. In the independent test set (n = 141) not used for model development, the NLP algorithm achieved an AUC-ROC of 0.97 (95% confidence interval [CI], 0.92-0.99) for classification of multiple bone metastases and an AUC-PRC of 0.99 (95% CI, 0.99-0.99). At a threshold of 0.90, the NLP algorithm correctly identified multiple bone metastases in 117 of the 124 patients who had multiple bone metastases in the testing cohort (sensitivity 0.94) and yielded 3 false positives (specificity 0.82). At the same threshold, the NLP algorithm had a positive predictive value of 0.97 and F1-score of 0.96.
Conclusions NLP has the potential to automate clinical data extraction from free text radiology notes in orthopedics, thereby optimizing the speed, accuracy, and consistency of clinical chart review. Pending external validation, the NLP algorithm developed in this study may be implemented as a means to aid researchers in tackling large amounts of data.
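The threshold-based metrics in the Results follow from a single confusion matrix, sketched below in Python. The true positives (117), false negatives (124 - 117 = 7), and false positives (3) come from the abstract. The true-negative count of 14 is not stated and is derived here on the assumption that the 141 test reports comprise 124 positives and 17 negatives. The function name is illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, positive predictive value, and F1-score."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Counts at the 0.90 threshold. tn is derived, not reported:
# 141 test reports - 124 with multiple metastases = 17 negatives; 17 - 3 FP = 14.
metrics = classification_metrics(tp=117, fp=3, fn=7, tn=14)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption the computed values agree with the reported sensitivity (0.94), specificity (0.82), positive predictive value (0.97), and F1-score (0.96) to within rounding.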
OBJECTIVE Incidental durotomy is a common complication of elective lumbar spine surgery seen in up to 11% of cases. Prior studies have suggested patient age and body habitus along with a history of prior surgery as being associated with an increased risk of dural tear. To date, no calculator has been developed for quantifying risk. Here, the authors' aim was to identify independent predictors of incidental durotomy, present a novel predictive calculator, and externally validate a novel method to identify incidental durotomies using natural language processing (NLP). METHODS The authors retrospectively reviewed all patients who underwent elective lumbar spine procedures at a tertiary academic hospital for degenerative pathologies between July 2016 and November 2018. Data were collected regarding surgical details, patient demographic information, and patient medical comorbidities. The primary outcome was incidental durotomy, which was identified both through manual extraction and the NLP algorithm. Multivariable logistic regression was used to identify independent predictors of incidental durotomy. Bootstrapping was then employed to estimate optimism in the model, which was corrected for; this model was converted to a calculator and deployed online. RESULTS Of the 1279 elective lumbar surgery patients included in this study, incidental durotomy occurred in 108 (8.4%). Risk factors for incidental durotomy on multivariable logistic regression were increased surgical duration, older age, revision versus index surgery, and case starts after 4 PM. This model had an area under the curve (AUC) of 0.73 in predicting incidental durotomies.
The previously established NLP method was used to identify cases of incidental durotomy, for which it demonstrated excellent discrimination (AUC 0.97). CONCLUSIONS Using multivariable analysis, the authors found that increased surgical duration, older patient age, cases started after 4 PM, and a history of prior spine surgery are all independent positive predictors of incidental durotomy in patients undergoing elective lumbar surgery. Additionally, the authors put forth the first version of a clinical calculator for durotomy risk that could be used prospectively by spine surgeons when counseling patients about their surgical risk. Lastly, the authors presented an external validation of an NLP algorithm used to identify incidental durotomies through the review of free-text operative notes. The authors believe that these tools can aid clinicians and researchers in their efforts to prevent this costly complication in spine surgery.
Background A preoperative estimation of survival is critical for deciding on the operative management of metastatic bone disease of the extremities. Several tools have been developed for this purpose, but there is room for improvement. Machine learning is an increasingly popular and flexible method of prediction model building based on a data set. It raises some skepticism, however, because of the complex structure of these models. Questions/purposes The purposes of this study were (1) to develop machine learning algorithms for 90-day and 1-year survival in patients who received surgical treatment for a bone metastasis of the extremity, and (2) to use these algorithms to identify those clinical factors (demographic, treatment related, or surgical) that are most closely associated with survival after surgery in these patients. Methods All 1090 patients who underwent surgical treatment for a long-bone metastasis at two institutions between 1999 and 2017 were included in this retrospective study. The median age of the patients in the cohort was 63 years (interquartile range [IQR] 54 to 72 years), 56% of patients (610 of 1090) were female, and the median BMI was 27 kg/m² (IQR 23 to 30 kg/m²). The most affected location was the femur (70%), followed by the humerus (22%). The most common primary tumors were breast (24%) and lung (23%). Intramedullary nailing was the most commonly performed type of surgery (58%), followed by endoprosthetic reconstruction (22%), and plate screw fixation (14%). Missing data were imputed using the missForest method.
Features were selected by random forest algorithms, and five different models were developed on the training set (80% of the data): stochastic gradient boosting, random forest, support vector machine, neural network, and penalized logistic regression. These models were chosen as a result of their classification capability in binary datasets. Model performance was assessed on both the training set and the validation set (20% of the data) by discrimination, calibration, and overall performance. Results We found no differences among the five models for discrimination, with an area under the curve ranging from 0.86 to 0.87. All models were well calibrated, with intercepts ranging from -0.03 to 0.08 and slopes ranging from 1.03 to 1.12. Brier scores ranged from 0.13 to 0.14. The stochastic gradient boosting model was chosen to be deployed as a freely available web-based application, and explanations on both a global and an individual level were provided. For 90-day survival, the three most important factors associated with poorer survivorship were lower albumin level, higher neutrophil-to-lymphocyte ratio, and rapid growth primary tumor. For 1-year survival, the three most important factors associated with poorer survivorship were lower albumin level, rapid growth primary tumor, and lower hemoglobin level. Conclusions Although the final models must be externally validated, the algorithms showed good performance on internal validation. The final models have been incorporated into a freely accessible web application that can be found at https://sorg-apps.shinyapps.io/extremitymetssurvival/. Pending external validation, clinicians may use this tool to predict survival for their individual patients to help in shared treatment decision making. Level of Evidence Level III, therapeutic study.
OBJECTIVE Nonroutine discharge after elective spine surgery increases healthcare costs, negatively impacts patient satisfaction, and exposes patients to additional hospital-acquired complications. Therefore, prediction of nonroutine discharge in this population may improve clinical management. The authors previously developed a machine learning algorithm from national data that predicts risk of nonhome discharge for patients undergoing surgery for lumbar disc disorders. In this paper the authors externally validate their algorithm in an independent institutional population of neurosurgical spine patients. METHODS Medical records from elective inpatient surgery for lumbar disc herniation or degeneration in the Transitional Care Program at Brigham and Women's Hospital (2013-2015) were retrospectively reviewed. Variables included age, sex, BMI, American Society of Anesthesiologists (ASA) class, preoperative functional status, number of fusion levels, comorbidities, preoperative laboratory values, and discharge disposition. Nonroutine discharge was defined as postoperative discharge to any setting other than home. The discrimination (c-statistic), calibration, and positive and negative predictive values (PPVs and NPVs) of the algorithm were assessed in the institutional sample. RESULTS Overall, 144 patients underwent elective inpatient surgery for lumbar disc disorders with a nonroutine discharge rate of 6.9% (n = 10). The median patient age was 50 years and 45.1% of patients were female. Most patients were ASA class II (66.0%), had 1 or 2 levels fused (80.6%), and had no diabetes (91.7%). The median hematocrit level was 41.2%.
The neural network algorithm generalized well to the institutional data, with a c-statistic (area under the receiver operating characteristic curve) of 0.89, calibration slope of 1.09, and calibration intercept of -0.08. At a threshold of 0.25, the PPV was 0.50 and the NPV was 0.97. CONCLUSIONS This institutional external validation of a previously developed machine learning algorithm suggests a reliable method for identifying patients with lumbar disc disorder at risk for nonroutine discharge. Performance in the institutional cohort was comparable to performance in the derivation cohort and represents an improved predictive value over clinician intuition. This finding substantiates initial use of this algorithm in clinical practice. This tool may be used by multidisciplinary teams of case managers and spine surgeons to strategically invest additional time and resources into postoperative plans for this population.
Background Fluorescence-guided surgery (FGS) is a technique used to enhance visualization of tumor margins in order to increase the extent of tumor resection in glioma surgery. In this paper, we systematically review all clinically tested fluorescent agents for application in FGS for glioma and all preclinically tested agents with the potential for FGS for glioma. Methods We searched the PubMed and Embase databases for all potentially relevant studies through March 2016. We assessed fluorescent agents by the following outcomes: rate of gross total resection (GTR), overall and progression-free survival, sensitivity and specificity in discriminating tumor and healthy brain tissue, tumor-to-normal ratio of fluorescent signal, and incidence of adverse events. Results The search strategy resulted in 2155 articles that were screened by titles and abstracts. After full-text screening, 105 articles fulfilled the inclusion criteria evaluating the following fluorescent agents: 5-aminolevulinic acid (5-ALA) (44 studies, including three randomized controlled trials), fluorescein (11), indocyanine green (five), hypericin (two), 5-aminofluorescein-human serum albumin (one), endogenous fluorophores (nine) and fluorescent agents in a pre-clinical testing phase (30). Three meta-analyses were also identified. Conclusions 5-ALA is the only fluorescent agent that has been tested in a randomized controlled trial and results in an improvement of GTR and progression-free survival in high-grade gliomas. Observational cohort studies and case series suggest similar outcomes for FGS using fluorescein.
Molecular targeting agents (e.g., fluorophore/nanoparticle labeled with anti-EGFR antibodies) are still in the pre-clinical phase, but offer promising results and may be valuable future alternatives.