Objectives: To describe low-dose computed tomography (ldCT) Hounsfield unit (HU) two-year change-from-baseline values (expressing trabecular bone density changes) and analyse their inter-reader reliability per vertebra in radiographic axial spondyloarthritis (r-axSpA). Methods: We used 49 patients with r-axSpA from the multicentre two-year Sensitive Imaging in Ankylosing Spondylitis (SIAS) study. LdCT HU were independently measured by two trained readers at baseline and two years. Mean (standard deviation, SD) change-from-baseline HU values were provided per vertebra by reader. Intraclass correlation coefficients (ICC; absolute agreement, two-way random effect), Bland-Altman plots and smallest detectable change (SDC) were obtained. Percentages of vertebrae in which readers agreed on the direction of change and on change >|SDC| were computed. Results: Overall, 1,053 (98% of all possible) vertebrae were assessed by each reader both at baseline and two years. Over two years, HU mean change values varied from -23 to 28 and 29 for reader 1 and 2, respectively. Inter-reader reliability of the two-year change-from-baseline values per vertebra was excellent: ICC: 0.91-0.99; SDC: 6-10; Bland-Altman plots were homoscedastic, with negligible systematic error between readers. Readers agreed on the direction of change in 88-96% and on change >|SDC| in 58-94% of vertebrae, per vertebral level, from C3 to L5. Overall, similar results were obtained across all vertebrae. Conclusion: LdCT measurement of HU is a reliable method to assess two-year changes in trabecular bone density at each vertebra from C3 to L5. Being reliable across all vertebrae, this methodology can aid the study of trabecular bone density changes over time in r-axSpA, a disease affecting the whole spine.
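The agreement on direction of change reported above can be illustrated with a minimal sketch. It assumes each reader's two-year HU changes are available as plain lists (one value per vertebra) and that a change of exactly zero counts as its own direction; the function names and the numbers are hypothetical and are not taken from the study.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def direction_agreement(changes_reader1, changes_reader2):
    """Percentage of vertebrae for which both readers report the same
    direction of change (increase, decrease, or no change)."""
    if len(changes_reader1) != len(changes_reader2) or not changes_reader1:
        raise ValueError("both readers need the same, non-zero number of values")
    agreeing = sum(
        sign(a) == sign(b) for a, b in zip(changes_reader1, changes_reader2)
    )
    return 100.0 * agreeing / len(changes_reader1)


# Hypothetical change-from-baseline HU values for four vertebrae.
reader1 = [5, -3, 2, -1]
reader2 = [4, -2, -1, -6]
print(direction_agreement(reader1, reader2))
```

In this toy example the readers disagree on the direction for one of four vertebrae, so the agreement is 75%.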
Purpose: The Canadian Occupational Performance Measure (COPM) is used to inventory problems experienced by the patient to set goals and evaluate treatment. We aimed to make a systematic overview of its measurement properties for people in geriatric rehabilitation. Methods: Seven electronic databases were searched for psychometric studies investigating content validity, construct validity, responsiveness, or reliability of the COPM in geriatric rehabilitation populations aged >= 60 years. Two reviewers independently abstracted data and assessed methodological quality of the included studies. Results: Of 292 identified articles, 13 studies were included. The COPM showed good test-retest reliability (two studies), moderate inter-rater reliability (one study), and good content validity (one study with some risk of bias). Four studies with minimal risk of bias showed good construct validity as their hypotheses were confirmed. Responsiveness was moderate in three studies with adequate methodological quality. Conclusion: All measurement properties have been studied in geriatric rehabilitation populations, and indicate that the COPM gives relevant information for geriatric rehabilitation, and that scores can be assessed reliably and are responsive to change. Although there were many studies on construct validity, authors had different opinions on what exactly COPM scores tell us, as they used a variety of comparator instruments and different hypotheses. Consensus on the exact interpretation of the scores is needed.
Key summary points
Aim: To make a systematic overview of measurement properties of the Canadian Occupational Performance Measure (COPM) for people in geriatric rehabilitation.
Findings: COPM showed moderate inter-rater reliability, good test-retest reliability, good content and construct validity, and moderate responsiveness in geriatric rehabilitation. When studying construct validity, authors used a variety of comparator instruments and different hypotheses.
Message: This overview of properties shows that the COPM gives relevant information for geriatric rehabilitation, and that scores can be assessed reliably and are responsive to change.
The primary aim of this thesis was to investigate the complex relationship between pain, neuropsychiatric symptoms, and ADL functioning in persons with dementia. Furthermore, we studied the psychometric properties of a new and universal observational pain assessment instrument, Pain Assessment in Impaired Cognition (PAIC). The results of this thesis show that pain in nursing home residents with dementia is related to a decline in ADL functions, independent of dementia severity, and specifically to a decline in the ADL activities transferring and bathing. Additionally, the psychometric evaluation of the PAIC presented in this thesis not only results in a promising measurement instrument, but also provides useful information for the development and improvement of educational programmes that contribute to the utilization of the PAIC15.
Background and objectives: The modified Rankin Scale (mRS) is one of the most frequently used outcome measures in trials in patients with an aneurysmal subarachnoid hemorrhage (aSAH). The assessment method of the mRS is often not clearly described in trials, while the method used might influence the mRS score. The aim of this study is to evaluate the inter-method reliability of different assessment methods of the mRS. Methods: This is a prospective, randomized, multicenter study with follow-up at 6 weeks and 6 months. Patients aged >= 18 years with aSAH were randomized to either a structured interview or a self-assessment of the mRS. Patients were seen by a physician who assigned an mRS score, followed by either the structured interview or the self-assessment. Inter-method reliability was assessed with the quadratic weighted kappa score and percentage of agreement. Feasibility of the self-assessment was assessed with a feasibility questionnaire. Results: The quadratic weighted kappa was 0.60 between the assessment of the physician and the structured interview and 0.56 between the assessment of the physician and the self-assessment. Percentage agreement was 50.8% and 19.6%, respectively. The assessment of the mRS through a structured interview and by self-assessment resulted in systematically higher mRS scores than the mRS scored by the physician. Self-assessment of the mRS proved feasible. Discussion: The mRS scores obtained with different assessment methods differ significantly. The agreement between the scores is low, although the reliability between the assessment methods is good. This should be considered when using the mRS in clinical trials.
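The quadratic weighted kappa named in the methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: ratings are integers from 0 to n_categories - 1 (the mRS would use seven categories, 0 to 6), disagreement weights are (i - j)^2 / (k - 1)^2, and the function name and example ratings are hypothetical.

```python
def quadratic_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with quadratic disagreement weights.

    ratings_a and ratings_b are equal-length sequences of integer
    categories in the range 0 .. n_categories - 1.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    k = n_categories

    # Cross-tabulate the two sets of ratings.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    max_distance = (k - 1) ** 2
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / max_distance
            observed_disagreement += weight * observed[i][j]
            # Expected count under independence of the two raters.
            expected_disagreement += weight * row_totals[i] * col_totals[j] / n

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical scores on a three-category scale.
physician = [0, 0, 1, 2]
interview = [0, 1, 1, 2]
print(quadratic_weighted_kappa(physician, interview, n_categories=3))
```

Because the weights grow with the squared distance between categories, a one-point disagreement is penalized far less than a disagreement across the whole scale. This is one reason a weighted kappa can look acceptable while exact percentage agreement stays low.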
Ataei, A.; Eggermont, F.; Baars, M.; Linden, Y. van der; Rooy, J. de; Verdonschot, N.; Tanck, E. 2021
Purpose: Accurate identification of metastatic lesions is important for improvement in biomechanical models that calculate the fracture risk of metastatic bones. The aim of this study was therefore to assess the inter- and intra-operator reliability of manual segmentation of femoral metastatic lesions. Methods: CT scans of 54 metastatic femurs (19 osteolytic, 17 osteoblastic, and 18 mixed) were segmented two times by two operators. Dice coefficients (DCs) were calculated, adopting the criterion that a DC > 0.7 indicates good reliability. Results: Generally, rather poor inter- and intra-operator reliability of lesion segmentation was found. Inter-operator DCs were 0.54 (+/- 0.28) and 0.50 (+/- 0.32) for the first and second segmentations, respectively, whereas intra-operator DCs were 0.56 (+/- 0.28) for operator I and 0.71 (+/- 0.23) for operator II. Larger lesions scored significantly higher DCs than smaller lesions. Of the femurs with larger mean segmentation volumes, 83% and 93% were segmented with good inter- and intra-operator DCs (> 0.7), respectively. There was no difference between the mean DCs of osteolytic, osteoblastic, and mixed lesions. Conclusion: Manual segmentation of femoral bone metastases is very challenging and resulted in unsatisfactory mean reliability values. There is a need to develop a segmentation protocol that reduces inter- and intra-operator segmentation variation as a first step, and to use computer-assisted segmentation tools as a second step.
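As a minimal sketch of the Dice coefficient used above, the following assumes each segmentation is represented as a set of voxel coordinates. That representation, the function name, and the convention of returning 1.0 for two empty segmentations are illustrative assumptions rather than details taken from the study.

```python
def dice_coefficient(segmentation_a, segmentation_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two voxel sets."""
    a, b = set(segmentation_a), set(segmentation_b)
    if not a and not b:
        # Assumed convention: two empty segmentations agree perfectly.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical lesions, each given as (x, y, z) voxel coordinates.
operator_1 = {(1, 1, 1), (1, 2, 1), (2, 1, 1), (2, 2, 1)}
operator_2 = {(2, 1, 1), (2, 2, 1), (3, 1, 1), (3, 2, 1)}
dc = dice_coefficient(operator_1, operator_2)
print(dc, "good" if dc > 0.7 else "below the 0.7 threshold")
```

Here the two hypothetical segmentations share two of their four voxels each, giving a DC of 0.5, which falls below the 0.7 threshold the abstract adopts for good reliability.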
Background: Pancreatic neuroendocrine tumors (pNETs) have a high prevalence in patients with multiple endocrine neoplasia type 1 (MEN1) and are the leading cause of death. Tumor size is still regarded as the main prognostic factor and therefore used for surgical decision-making. We assessed reliability and agreement of radiological and pathological tumor size in a population-based cohort of patients with MEN1-related pNETs. Methods: Patients were selected from the Dutch MEN1 database if they had undergone a resection for a pNET between 2003 and 2018. Radiological (MRI, CT, and endoscopic ultrasonography [EUS]) and pathological tumor size were collected from patient records. Measures of agreement (Bland-Altman plots with limits of agreement [LoA] and absolute agreement) and reliability (intraclass correlation coefficients [ICC] and unweighted kappa) were calculated for continuous and categorized (< or >= 2 cm) pNET size. Results: In 73 included patients, the median radiological and pathological tumor sizes measured were 22 (3-160) and 21 (4-200) mm, respectively. Mean bias between radiological and pathological tumor size was -0.2 mm and LoA ranged from -12.9 to 12.6 mm. For the subgroups of MRI, CT, and EUS, LoA of radiological and pathological tumor size ranged from -9.6 to 10.9, -15.9 to 15.8, and -13.9 to 11.0, respectively. ICCs for the overall cohort, MRI, CT, and EUS were 0.80, 0.86, 0.75, and 0.76, respectively. Based on the 2 cm criterion, agreement was 81.5%; hence, 12 patients (18.5%) were classified differently between imaging and pathology. Absolute agreement and kappa values of MRI, CT, and EUS were 88.6, 85.7, and 75.0%, and 0.77, 0.71, and 0.50, respectively. Conclusion: Within a population-based cohort, MEN1-related pNET size was not systematically over- or underestimated on preoperative imaging. Based on agreement and reliability measures, MRI is the preferred imaging modality.
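The mean bias and limits of agreement reported above follow the usual Bland-Altman construction: the mean of the paired differences plus or minus 1.96 times their standard deviation. A minimal sketch follows, assuming paired sizes in millimetres and the sample standard deviation; the function name and the measurements are hypothetical.

```python
import statistics


def bland_altman(measurements_a, measurements_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    Bias is the mean of the differences a - b; the limits of agreement
    are bias +/- 1.96 times the sample standard deviation of those
    differences.
    """
    if len(measurements_a) != len(measurements_b) or len(measurements_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(measurements_a, measurements_b)]
    bias = statistics.mean(differences)
    spread = 1.96 * statistics.stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical tumor sizes in mm.
radiological = [10, 20, 30, 40]
pathological = [12, 19, 33, 38]
bias, lower, upper = bland_altman(radiological, pathological)
print(f"bias {bias:.1f} mm, LoA {lower:.1f} to {upper:.1f} mm")
```

A bias close to zero, as in the abstract, indicates no systematic over- or underestimation, while the width of the limits shows how far an individual measurement pair may still diverge.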
This White Paper by the European Society for Swallowing Disorders (ESSD) reports on the current state of screening and non-instrumental assessment for dysphagia in adults. An overview is provided of the measures that are available, and of how to select screening tools and assessments. Emphasis is placed on different types of screening, patient-reported measures, assessment of anatomy and physiology of the swallowing act, and clinical swallowing evaluation. Many screening and non-instrumental assessments are available for evaluating dysphagia in adults; however, their use may not be warranted due to poor diagnostic performance or a lack of robust psychometric properties. This White Paper provides recommendations on how to select the best evidence-based screening tools and non-instrumental assessments for use in clinical practice targeting different constructs, target populations and respondents, based on criteria for diagnostic performance, psychometric properties (reliability, validity, and responsiveness), and feasibility. In addition, gaps in research that need to be addressed in future studies are discussed. The following recommendations are made: (1) discontinue the use of non-validated dysphagia screening tools and assessments; (2) implement screening using tools that have optimal diagnostic performance in selected populations that are at risk of dysphagia, such as stroke patients, frail older persons, patients with progressive neurological diseases, persons with cerebral palsy, and patients with head and neck cancer; (3) implement measures that demonstrate robust psychometric properties; and (4) provide quality training in dysphagia screening and assessment to all clinicians involved in the care and management of persons with dysphagia.
The Levels of Emotional Awareness Scale (LEAS) is a well-validated performance measure of trait emotional awareness (EA), which is associated with psychological and physical problems. EA is, however, expected to vary over time and we aimed to adapt the LEAS to permit the measurement of EA in daily life as a function of momentary state. Twenty-five students completed 12 ecological momentary assessments (EMAs) of EA across 2 days. The correlation between the mean EMAs of EA and trait EA, and the change over time in EA, were also examined. Findings revealed a significant positive correlation between state and trait EA. The within-person reliability was substantial, suggesting that EMAs can reliably assess EA over time across individuals. Importantly, latent state-trait analysis showed that about 50% of EA variability was due to state variance whereas only 2% of EA variability was due to trait variance. Preliminary psychometric properties suggest that the developed method allows for the measurement of EA in daily life and support the claim that EA can be measured using both hypothetical (as in the LEAS) and real-life (using EMAs) scenarios.
The Deglutition Handicap Index (DHI) is a self-report measure for patients at risk of oropharyngeal dysphagia on deglutition-related aspects of functional health status (FHS) and health-related quality of life (HR-QoL). The DHI consists of 30 items which are subsumed within the Symptom, Functional and Emotional subscales. The purpose of this study was to evaluate the psychometric properties of the DHI using classical test theory according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) criteria. A total of 453 patients with dysphagia of different aetiologies were recruited concurrently at two academic hospitals. Dysphagia was confirmed by fiberoptic endoscopic and/or videofluoroscopic evaluation of swallowing. In addition, a healthy control group of 132 participants was recruited. Structural validity was determined using exploratory and confirmatory factor analyses, and internal consistency by calculating Cronbach's alpha coefficients. Hypothesis testing was evaluated using Mann-Whitney U-tests, linear regression analysis and correlation analysis. Diagnostic performance was assessed, including receiver operating characteristic curve analysis. Factor analyses indicated that the DHI is a unidimensional measure. The DHI has good internal consistency with some indication of item redundancy, weak to moderate structural validity and strong hypothesis testing for construct validity. The DHI shows high diagnostic performance as part of criterion validity. These findings support that the DHI is an appropriate choice as a patient self-report measure to evaluate FHS and HR-QoL in dysphagia. Ongoing validation to assess the measure for possible item redundancy and to examine the dimensionality of the DHI using item response theory is recommended.
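Internal consistency as measured by Cronbach's alpha can be sketched with the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below assumes a respondents-by-items table of numeric scores and uses sample variances; the function name and data are hypothetical and unrelated to the DHI dataset.

```python
import statistics


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent needs a score for every item")

    # Variance of each item across respondents.
    item_variances = [
        statistics.variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Variance of the respondents' total scores.
    total_variance = statistics.variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to three items.
answers = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
]
print(cronbach_alpha(answers))
```

In this toy table the items move in perfect lockstep, so alpha reaches its maximum of 1. Values very close to 1 in real questionnaires are often read as a hint of item redundancy, which matches the concern raised in the abstract.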
Adequate reliability of measurement is a precondition for investigating individual differences and age-related changes in brain structure. One approach to improve reliability is to identify and control for variables that are predictive of within-person variance. To this end, we applied both classical statistical methods and machine-learning-inspired approaches to structural magnetic resonance imaging (sMRI) data of six participants aged 24–31 years gathered at 40–50 occasions distributed over 6–8 months from the Day2day study. We explored the within-person associations between 21 variables covering physiological, affective, social, and environmental factors and global measures of brain volume estimated by VBM8 and FreeSurfer. Time since the first scan was reliably associated with FreeSurfer estimates of grey matter volume and total cortex volume, in line with a rate of annual brain volume shrinkage of about 1 percent. For the same two structural measures, time of day also emerged as a reliable predictor with an estimated diurnal volume decrease of, again, about 1 percent. Furthermore, we found weak predictive evidence for the number of steps taken on the previous day and testosterone levels. The results suggest a need to control for time-of-day effects in sMRI research. In particular, we recommend that researchers interested in assessing longitudinal change in the context of intervention studies or longitudinal panels make sure that, at each measurement occasion, (a) a given participant is measured at the same time of day; (b) all participants are measured at about the same time of day. Furthermore, the potential effects of physical activity, including moderate amounts of aerobic exercise, and testosterone levels on MRI-based measures of brain structure deserve further investigation.
Background: This systematic review examined the methodological quality of studies and assessed the psychometric qualities of interview-administered Past-week and Usual-week Physical Activity Questionnaires (PAQs). PubMed and Embase were used to retrieve data sources. Methods: The studies were selected using the following eligibility criteria: 1) psychometric properties of PAQs were assessed in adults; 2) the PAQs either consisted of recall periods of usual 7-days (Usual-week PAQs) within the past 12 months or during the past 7-days (Past-week PAQs); and 3) PAQs were interview-administered. The COSMIN taxonomy was utilised to critically appraise study quality, and previously established psychometric criteria were employed to evaluate the overall psychometric qualities. Results: Following screening, 42 studies were examined to determine the psychometric properties of 20 PAQs, with the majority of studies demonstrating good to excellent ratings for methodological quality. For convergent validity (i.e., the relationship between PAQs and other measures), similar overall associations were found between Past-week PAQs and Usual-week PAQs. However, PAQs were more strongly associated with direct measures of physical activity (e.g., accelerometer) than indirect measures of physical activity (i.e., physical fitness), irrespective of recall methods. Very few psychometric properties were examined for each PAQ, with the majority exhibiting poor ratings in psychometric quality. Only a few interview-administered PAQs exhibited positive ratings for a single psychometric property, although the other properties were either rated as poor or questionable, demonstrating the limitations of current PAQs. Conclusion: Accordingly, further research is necessary to explore a greater number of psychometric properties, or to develop new PAQs by addressing the psychometric limitations identified in the current review.
Speyer, R.; Kim, J.H.; Doma, K.; Chen, Y.W.; Denman, D.; Phyland, D.; ... ; Cordier, R. 2019
Purpose: The current review was conducted to identify all self-report questionnaires on functional health status (FHS) and/or health-related quality-of-life (HR-QoL) in adult populations with dysphonia (voice problems), and to evaluate the psychometric properties of the retrieved questionnaires. Methods: A systematic review was performed in the electronic literature databases PubMed and Embase. The psychometric properties of the questionnaires were determined using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) taxonomy and checklist. Responsiveness was outside the scope of this review and, as no agreed 'gold standard' measures are available in the field of FHS and HR-QoL in dysphonia, criterion validity was not assessed. Only questionnaires developed and published in English were included. Results: Forty-eight studies reported on the psychometric properties of 15 identified questionnaires. As many psychometric data were missing or resulted from biased study designs or statistical analyses, only preliminary conclusions can be drawn. Based on the psychometric evidence currently available in the literature, the Voice Handicap Index seems to be the most promising questionnaire, followed by the Vocal Performance Questionnaire. Conclusions: More research is needed to complete missing data on psychometric properties of existing questionnaires in FHS and/or HR-QoL. Further, when developing new questionnaires, item response theory is preferred over classical test theory, as is the use of international consensus-based psychometric definitions and criteria, to avoid bias in outcome data on measurement properties.
Kool, M.; Bastiaannet, E.; Velde, C.J.H. van de; Marang-van de Mheen, P.J. 2018
As early diagnosis of swallowing and feeding difficulties in infants and children is of utmost importance, there is a need to evaluate the quality of the psychometric properties of pediatric assessments of swallowing and feeding. A systematic review was performed summarizing the psychometric properties of non-instrumental assessments for swallowing and feeding difficulties in pediatrics; no data were identified for the remaining twelve assessments. The COSMIN taxonomy and checklist were used to evaluate the methodological quality of 23 publications on psychometric properties. For each assessment, an overall quality score for each measurement property was determined. As psychometric data proved incomplete, conflicting or indeterminate for all assessments, only preliminary conclusions could be drawn; the most robust assessment based on current data is the Dysphagia Disorder Survey (DDS). However, further research is needed to provide additional information on all psychometric properties for all assessments.
Brown, K.E.; Lohse, K.R.; Mayer, I.M.S.; Strigaro, G.; Desikan, M.; Casula, E.P.; ... ; Orth, M. 2017
Early and reliable screening for oropharyngeal dysphagia (OD) symptoms in at-risk populations is important and a crucial first stage in effective OD management. The Eating Assessment Tool (EAT-10) is a commonly utilized screening and outcome measure. To date, studies using classical test theory methodologies report good psychometric properties, but the EAT-10 has not been evaluated using item response theory (e.g., Rasch analysis). The aim of this multisite study was to evaluate the internal consistency and structural validity and conduct a preliminary investigation of the cross-cultural validity of the EAT-10; floor and ceiling effects were also checked. Participants were 636 patients deemed at risk of OD, from outpatient clinics in Spain, Turkey, Sweden, and Italy. The EAT-10 and videofluoroscopic and/or fiberoptic endoscopic evaluation of swallowing were used to confirm OD diagnosis. Patients with esophageal dysphagia were excluded to ensure a homogeneous sample. Rasch analysis was used to investigate person and item fit statistics, response scale, dimensionality of the scale, differential item functioning (DIF), and floor and ceiling effects. The results indicate that the EAT-10 has significant weaknesses in structural validity and internal consistency. There are both item redundancy and a lack of easy and difficult items. The thresholds of the rating scale categories were disordered, and gender, confirmed OD, language, and comorbid diagnosis showed DIF on a number of items. DIF analysis of language showed preliminary evidence of problems with cross-cultural validation, and the measure showed a clear floor effect. The authors recommend redevelopment of the EAT-10 using Rasch analysis.