The prospective, multicenter TESTBREAST study was initiated with the aim of identifying a novel panel of blood-based protein biomarkers to enable early breast cancer detection for moderate-to-high-risk women. Serum samples were collected every (half) year up until diagnosis. Protein levels were longitudinally measured to determine intrapatient and interpatient variabilities. To this end, protein cluster patterns were evaluated to form a conceptual basis for further clinical analyses. Using a mass spectrometry-based bottom-up proteomics strategy, the protein abundance of 30 samples was analyzed: five sequential serum samples from six high-risk women; three who developed a breast malignancy (cases) and three who did not (controls). Serum samples were chromatographically fractionated and an in-depth serum proteome was acquired. Cluster analyses were applied to indicate differences between and within protein levels in serum samples of individuals. Statistical analyses were performed using ANOVA to select proteins with a high level of clustering. Cluster analyses on 30 serum samples revealed unique patterns of protein clustering for each patient, indicating a greater interpatient than intrapatient variability in protein levels of the longitudinally acquired samples. Moreover, the most distinctive proteins in the cluster analysis were identified. Strong clustering patterns within longitudinal intrapatient samples have demonstrated the importance of identifying small changes in protein levels for individuals over time. This underlines the significance of longitudinal serum measurements, that patients can serve as their own controls, and the relevance of the current study set-up for early detection.
The TESTBREAST study will continue its pursuit toward establishing a protein panel for early breast cancer detection.
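As a rough illustration of how ANOVA can flag proteins whose levels cluster by patient, the sketch below computes a one-way ANOVA F statistic per protein, treating each patient as a group. It is not the study's actual analysis: the function name `anova_f_per_protein`, the samples-by-proteins layout, and the toy values are all assumptions made for this example. A high F value means the variation between patients is large relative to the variation within a patient's own samples.

```python
import numpy as np

def anova_f_per_protein(abundance, patient_ids):
    """One-way ANOVA F statistic per protein, with patients as groups.

    abundance   : array of shape (samples, proteins), hypothetical layout
    patient_ids : array of length samples, one patient label per sample
    """
    abundance = np.asarray(abundance, dtype=float)
    patient_ids = np.asarray(patient_ids)
    patients = np.unique(patient_ids)
    n_samples, n_groups = abundance.shape[0], len(patients)

    grand_mean = abundance.mean(axis=0)
    ss_between = np.zeros(abundance.shape[1])
    ss_within = np.zeros(abundance.shape[1])
    for patient in patients:
        group = abundance[patient_ids == patient]
        group_mean = group.mean(axis=0)
        # Interpatient part: distance of this patient's mean from the grand mean.
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        # Intrapatient part: spread of this patient's samples around their own mean.
        ss_within += ((group - group_mean) ** 2).sum(axis=0)

    # Assumes every protein shows some intrapatient variation (ss_within > 0).
    return (ss_between / (n_groups - 1)) / (ss_within / (n_samples - n_groups))

# Toy data: two patients, two samples each, two proteins.
toy_abundance = np.array([[1.0, 1.0], [1.2, 5.0], [5.0, 1.0], [5.2, 5.0]])
toy_patients = np.array(["A", "A", "B", "B"])
f_values = anova_f_per_protein(toy_abundance, toy_patients)
```

In the toy data the first protein is stable within each patient but differs between patients, so it receives a high F value; the second varies equally within both patients and receives an F value of zero.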
For the measles-mumps-rubella (MMR) vaccine, the World Health Organization-recommended coverage for herd protection is 95% for measles and 80% for rubella and mumps. However, a national vaccine coverage does not reflect social clustering of unvaccinated children, e.g. in schools of Orthodox Protestant or Anthroposophic identity in The Netherlands. To fully characterise this clustering, we estimated one-dose MMR vaccination coverages at all schools in the Netherlands. By combining postcode catchment areas of schools and school feeder data, each child in the Netherlands was characterised by residential postcode, primary and secondary school (referred to as school career). Postcode-level vaccination data were used to estimate vaccination coverages per school career. These were translated to coverages per school, stratified by school identity. Most schools had vaccine coverages over 99%, but major exceptions were Orthodox Protestant schools (63% in primary and 58% in secondary schools) and Anthroposophic schools (67% and 78%). School-level vaccine coverage estimates reveal strong clustering of unvaccinated children. The school feeder data reveal strongly connected Orthodox Protestant and Anthroposophic communities, but separated from one another. This suggests that even at a national one-dose MMR coverage of 97.5%, thousands of children per cohort are not protected by herd immunity.
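The step from postcode-level to school-level coverage can be pictured as a pupil-weighted average. The sketch below is a deliberate simplification rather than the study's method: it ignores the school-career step and school identity, and the function name `school_coverage`, the schools-by-postcodes count matrix, and the numbers are hypothetical.

```python
import numpy as np

def school_coverage(pupils_by_postcode, postcode_coverage):
    """Pupil-weighted vaccination coverage per school.

    pupils_by_postcode : array (schools, postcodes) with pupil counts (assumed input)
    postcode_coverage  : array (postcodes,) with coverage as a fraction in [0, 1]
    """
    pupils = np.asarray(pupils_by_postcode, dtype=float)
    coverage = np.asarray(postcode_coverage, dtype=float)
    # Expected number of vaccinated pupils per school, divided by school size.
    return pupils @ coverage / pupils.sum(axis=1)

# Toy example: school 0 draws only from a low-coverage postcode,
# school 1 draws equally from both postcodes.
toy_pupils = np.array([[10, 0], [5, 5]])
toy_coverage = np.array([0.6, 1.0])
estimated = school_coverage(toy_pupils, toy_coverage)
```

Even in this toy setting, a school fed by a single low-coverage postcode ends up well below the average across both postcodes, which is the kind of clustering a national figure hides.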
Objective: To facilitate patient disease subset and risk factor identification by constructing a pipeline which is generalizable, provides easily interpretable results, and allows replication by overcoming electronic health records (EHRs) batch effects. Material and Methods: We used 1872 billing codes in EHRs of 102 880 patients from 12 healthcare systems. Using tools borrowed from single-cell omics, we mitigated center-specific batch effects and performed clustering to identify patients with highly similar medical history patterns across the various centers. Our visualization method (PheSpec) depicts the phenotypic profile of clusters, applies a novel filtering of noninformative codes (Ranked Scope Pervasion), and indicates the most distinguishing features. Results: We observed 114 clinically meaningful profiles, for example, linking prostate hyperplasia with cancer and diabetes with cardiovascular problems and grouping pediatric developmental disorders. Our framework identified disease subsets, exemplified by 6 "other headache" clusters, where phenotypic profiles suggested different underlying mechanisms: migraine, convulsion, injury, eye problems, joint pain, and pituitary gland disorders. Phenotypic patterns replicated well, with high correlations of >= 0.75 to an average of 6 (2-8) of the 12 different cohorts, demonstrating the consistency with which our method discovers disease history profiles. Discussion: Costly clinical research ventures should be based on solid hypotheses. We repurpose methods from single-cell omics to build these hypotheses from observational EHR data, distilling useful information from complex data.
Conclusion: We establish a generalizable pipeline for the identification and replication of clinically meaningful (sub)phenotypes from widely available high-dimensional billing codes. This approach overcomes datatype problems and produces comprehensive visualizations of validation-ready phenotypes.
The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental processes is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is, however, hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that we can constrain this matching further, as cells can also be matched on their cell type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell type. The cell-type guidance is unsupervised, i.e., a cell type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single-cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters.
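To make the two-part loss more concrete, the sketch below computes a distance-preservation penalty for a given embedding. It is only one possible reading of the description above, not the CBA implementation: it omits the auto-encoder and its training, assumes both batches share the same gene space, assumes cluster labels have already been matched across batches, and uses hypothetical names (`pairwise_dist`, `distance_preservation_loss`).

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_preservation_loss(x1, x2, z1, z2, labels1, labels2):
    """Penalty for an embedding (z1, z2) of two batches (x1, x2).

    x1, x2           : original expression matrices (cells x genes), assumed same genes
    z1, z2           : embedded coordinates (cells x latent dimensions)
    labels1, labels2 : cluster labels per cell, assumed comparable across batches
    """
    # (1) Preserve cell-cell distances within each batch.
    within = 0.0
    for x, z in ((x1, z1), (x2, z2)):
        within += np.mean((pairwise_dist(z, z) - pairwise_dist(x, x)) ** 2)

    # (2) Preserve cross-batch distances only for pairs sharing a cluster label.
    same_type = labels1[:, None] == labels2[None, :]
    if not same_type.any():
        return float(within)
    between_err = (pairwise_dist(z1, z2) - pairwise_dist(x1, x2)) ** 2
    return float(within + between_err[same_type].mean())

# Toy check: an embedding identical to the input preserves all distances.
toy_x1 = np.array([[0.0, 0.0], [1.0, 0.0]])
toy_x2 = np.array([[0.0, 1.0], [1.0, 1.0]])
toy_l1 = np.array([0, 1])
toy_l2 = np.array([0, 1])
loss_identity = distance_preservation_loss(toy_x1, toy_x2, toy_x1, toy_x2, toy_l1, toy_l2)
loss_scaled = distance_preservation_loss(toy_x1, toy_x2, 2 * toy_x1, 2 * toy_x2, toy_l1, toy_l2)
```

The identity embedding gives a loss of zero, while the scaled embedding stretches every distance and is penalized. In an actual auto-encoder this quantity would be written in a differentiable framework and minimized over the encoder's parameters.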
Traumatic brain injury (TBI) is currently classified as mild, moderate, or severe TBI by trichotomizing the Glasgow Coma Scale (GCS). We aimed to explore directions for a more refined multidimensional classification system. For that purpose, we performed a hypothesis-free cluster analysis in the Collaborative European NeuroTrauma Effectiveness Research for TBI (CENTER-TBI) database: a European all-severity TBI cohort (n = 4509). The first building block consisted of key imaging characteristics, summarized using principal component analysis from 12 imaging characteristics. The other building blocks were demographics, clinical severity, secondary insults, and cause of injury. With these building blocks, the patients were clustered into four groups. We applied bootstrap resampling with replacement to study the stability of cluster allocation. The characteristics that predominantly defined the clusters were injury cause, major extracranial injury, and GCS. The clusters consisted of 1451, 1534, 1006, and 518 patients, respectively. The clustering method was quite stable: the proportion of patients staying in one cluster after resampling and reclustering was 97.4% (95% confidence interval [CI]: 85.6-99.9%). These clusters characterized groups of patients with different functional outcomes: from mild to severe, 12%, 19%, 36%, and 58% of patients had an unfavorable 6-month outcome. Compared with the mild and the upper intermediate cluster, the lower intermediate and the severe cluster received more key interventions. To conclude, four types of TBI patients may be defined by injury mechanism, presence of major extracranial injury, and GCS.
Describing patients according to these three characteristics could potentially capture differences in etiology and care pathways better than with GCS only.
Raz, Y.; Akker, E.B. van den; Roest, T.; Riaz, M.; Rest, O. van de; Suchiman, H.E.D.; ... ; Slagboom, P.E. 2020
Skeletal muscles control posture, mobility and strength, and influence whole-body metabolism. Muscles are built of different types of myofibers, each having specific metabolic, molecular, and contractile properties. Fiber classification is, therefore, regarded as key to understanding muscle biology and (patho)physiology. The expression of three myosin heavy chain (MyHC) isoforms, MyHC-1, MyHC-2A, and MyHC-2X, marks myofibers in humans. Typically, myofiber classification is performed by an eye-based histological analysis. This classical approach is insufficient to capture complex fiber classes that express more than one MyHC isoform. We, therefore, developed a methodological procedure for high-throughput characterization of myofibers on the basis of multiple isoforms. The mean fluorescence intensity of the three most abundant MyHC isoforms was measured per myofiber in muscle biopsies of 56 healthy elderly adults, and myofiber classes were identified using computational biology tools. Unsupervised clustering revealed the existence of six distinct myofiber clusters. A comparison with the visual assessment of myofibers using the same images showed that some of these myofiber clusters could not be detected or were frequently misclassified. The presence of these six clusters was reinforced by RNA expression levels of sarcomeric genes. In addition, one of the clusters, expressing all three MyHC isoforms, correlated with histological measures of muscle health. To conclude, this methodological procedure enables deep characterization of the complex muscle heterogeneity. This study opens opportunities to further investigate myofiber composition in comparative studies.