Background Histopathological classification of Wilms tumors determines treatment regimen. Machine learning has been shown to contribute to histopathological classification in various malignancies but requires large numbers of manually annotated images and thus specific pathological knowledge. This study aimed to assess whether trained, inexperienced observers could contribute to reliable annotation of Wilms tumor components for classification performed by machine learning. Methods Four inexperienced observers (medical students) were trained in histopathology of normal kidneys and Wilms tumors by an experienced observer (pediatric pathologist). Twenty randomly selected scanned Wilms tumor slides (from n = 1472 slides) were annotated, and annotations were independently classified by both the inexperienced observers and two experienced pediatric pathologists. Agreement between the six observers and for each tissue element was measured using kappa statistics (κ). Results Pairwise interobserver agreement between all inexperienced and experienced observers was high (range: 0.845-0.950). The interobserver variability for the different histological elements, including all vital tumor components and therapy-related effects, showed high values for all kappa coefficients (> 0.827). Conclusions Inexperienced observers can be trained to recognize specific histopathological tumor and tissue elements with high interobserver agreement with experienced observers. Nevertheless, supervision by experienced pathologists remains necessary. Results of this study can be used to facilitate more rapid progress for supervised machine learning-based algorithm development in pediatric pathology and beyond.
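The pairwise agreement reported above can be illustrated with a small sketch. It assumes Cohen's kappa for two observers labelling the same annotations; the abstract only says "kappa statistics" and does not name the variant. The function name and the example labels are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two observers who labelled the same items.

    Assumption: an unweighted, two-rater kappa; the abstract does not
    state which kappa variant the authors used.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both observers must label the same non-empty item set")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each observer's marginal label
    # frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both observers used a single identical label for everything.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four annotations (not taken from the study).
student = ["blastema", "blastema", "blastema", "stroma"]
pathologist = ["blastema", "blastema", "stroma", "stroma"]
kappa = cohen_kappa(student, pathologist)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over a plain percentage when some tissue classes are much more common than others.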
Hermsen, M.; Volk, V.; Braesen, J.H.; Geijs, D.J.; Gwinner, W.; Kers, J.; ... ; Laak, J.A.W.M. van der 2021
Delayed graft function (DGF) is a strong risk factor for development of interstitial fibrosis and tubular atrophy (IFTA) in kidney transplants. Quantitative assessment of inflammatory infiltrates in kidney biopsies of DGF patients can reveal predictive markers for IFTA development. In this study, we combined multiplex tyramide signal amplification (mTSA) and convolutional neural networks (CNNs) to assess the inflammatory microenvironment in kidney biopsies of DGF patients (n = 22) taken at 6 weeks post-transplantation. Patients were stratified for IFTA development (<10% versus >= 10%) from 6 weeks to 6 months post-transplantation, based on histopathological assessment by three kidney pathologists. One mTSA panel was developed for visualization of capillaries, T- and B-lymphocytes and macrophages and a second mTSA panel for T-helper cell and macrophage subsets. The slides were multispectrally imaged and custom-made Python scripts enabled conversion to artificial brightfield whole-slide images (WSI). We used an existing CNN for the detection of lymphocytes with cytoplasmatic staining patterns in immunohistochemistry and developed two new CNNs for the detection of macrophages and nuclear-stained lymphocytes. F1-scores were 0.77 (nuclear-stained lymphocytes), 0.81 (cytoplasmatic-stained lymphocytes), and 0.82 (macrophages) on a test set of artificial brightfield WSI. The CNNs were used to detect inflammatory cells, after which we assessed the peritubular capillary extent, cell density, cell ratios, and cell distance in the two patient groups. In this cohort, distance of macrophages to other immune cells and peritubular capillary extent did not vary significantly at 6 weeks post-transplantation between patient groups.
CD163(+) cell density was higher in patients with >= 10% IFTA development 6 months post-transplantation (p < 0.05). CD3(+)CD8(-)/CD3(+)CD8(+) ratios were higher in patients with <10% IFTA development (p < 0.05). We observed a high correlation between CD163(+) and CD4(+)GATA3(+) cell density (R = 0.74, p < 0.001). Our study demonstrates that CNNs can be used to leverage reliable, quantitative results from mTSA-stained, multispectrally imaged slides of kidney transplant biopsies. This study describes a methodology to assess the microenvironment in sparse tissue samples. Deep learning, multiplex immunohistochemistry, and mathematical image processing techniques were incorporated to quantify lymphocytes, macrophages, and capillaries in kidney transplant biopsies of delayed graft function patients. The quantitative results were used to assess correlations with development of interstitial fibrosis and tubular atrophy.
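As a reminder of what the F1-scores in the abstract above measure, the following sketch computes F1 from detection counts. The abstract does not describe how detections were matched to reference annotations, so the counts are simply taken as given; the function name and the example numbers are hypothetical.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1-score from detection counts.

    F1 is the harmonic mean of precision and recall, which simplifies to
    2*TP / (2*TP + FP + FN). How detections are matched to annotated
    cells is not stated in the abstract and is left out of this sketch.
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No annotated cells and no detections: F1 is undefined; returning
        # 0.0 is a convention chosen for this sketch.
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical counts (not study data): 8 correct detections,
# 2 spurious detections, 2 missed cells.
example_f1 = f1_score(8, 2, 2)
```

Because F1 ignores true negatives, it suits cell detection, where the "background" class has no meaningful count.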
Hermsen, M.; Bel, T. de; Boer, M. den; Steenbergen, E.J.; Kers, J.; Florquin, S.; ... ; Laak, J.A.W.M. van der 2019
Background The development of deep neural networks is facilitating more advanced digital analysis of histopathologic images. We trained a convolutional neural network for multiclass segmentation of digitized kidney tissue sections stained with periodic acid-Schiff (PAS). Methods We trained the network using multiclass annotations from 40 whole-slide images of stained kidney transplant biopsies and applied it to four independent data sets. We assessed multiclass segmentation performance by calculating Dice coefficients for ten tissue classes on ten transplant biopsies from the Radboud University Medical Center in Nijmegen, The Netherlands, and on ten transplant biopsies from an external center for validation. We also fully segmented 15 nephrectomy samples and calculated the network's glomerular detection rates and compared network-based measures with visually scored histologic components (Banff classification) in 82 kidney transplant biopsies. Results The weighted mean Dice coefficients of all classes were 0.80 and 0.84 in ten kidney transplant biopsies from the Radboud center and the external center, respectively. The best segmented class was "glomeruli" in both data sets (Dice coefficients, 0.95 and 0.94, respectively), followed by "tubuli combined" and "interstitium." The network detected 92.7% of all glomeruli in nephrectomy samples, with 10.4% false positives. In whole transplant biopsies, the mean intraclass correlation coefficient for glomerular counting performed by pathologists versus the network was 0.94.
We found significant correlations between visually scored histologic components and network-based measures. Conclusions This study presents the first convolutional neural network for multiclass segmentation of PAS-stained nephrectomy samples and transplant biopsies. Our network may have utility for quantitative studies involving kidney histopathology across centers and provide opportunities for deep learning applications in routine diagnostics.
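The per-class Dice coefficient used in the abstract above can be sketched as follows. This is a minimal illustration on small label maps stored as nested lists; the function name, the class indices, and the tiny example masks are hypothetical. The weighting behind the "weighted mean Dice" is not specified in the abstract and is not reproduced here.

```python
def dice_coefficient(predicted, reference, class_id):
    """Dice coefficient for one class between two label maps.

    Both inputs are 2D lists of integer class labels with equal shape.
    Dice = 2 * |P intersect R| / (|P| + |R|), where P and R are the
    pixel sets assigned to class_id in the prediction and the reference.
    """
    predicted_count = reference_count = overlap = 0
    for predicted_row, reference_row in zip(predicted, reference):
        for p, r in zip(predicted_row, reference_row):
            in_predicted = p == class_id
            in_reference = r == class_id
            predicted_count += in_predicted
            reference_count += in_reference
            overlap += in_predicted and in_reference

    total = predicted_count + reference_count
    if total == 0:
        # Class absent from both maps; treating this as perfect agreement
        # is a convention chosen for this sketch.
        return 1.0
    return 2 * overlap / total


# Hypothetical 2x2 label maps (1 could stand for "glomeruli", 0 for other tissue).
network_output = [[1, 1], [0, 0]]
annotation = [[1, 0], [0, 0]]
dice_class_1 = dice_coefficient(network_output, annotation, 1)
dice_class_0 = dice_coefficient(network_output, annotation, 0)
```

A Dice value of 1 means the predicted and annotated regions for that class coincide exactly, and 0 means they do not overlap at all.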
Geessink, O.G.F.; Baidoshvili, A.; Klaase, J.M.; Bejnordi, B.E.; Litjens, G.J.S.; Pelt, G.W. van; ... ; Laak, J.A.W.M. van der 2019