Wilms' tumors are pediatric malignancies that are thought to arise from faulty kidney development. They contain a wide range of poorly differentiated cell states resembling various distorted developmental stages of the fetal kidney, and as a result, differ between patients in a continuous manner that is not well understood. Here, we used three computational approaches to characterize this continuous heterogeneity in high-risk blastemal-type Wilms' tumors. Using Pareto task inference, we show that the tumors form a triangle-shaped continuum in latent space that is bounded by three tumor archetypes with "stromal", "blastemal", and "epithelial" characteristics, which resemble the un-induced mesenchyme, the cap mesenchyme, and early epithelial structures of the fetal kidney. By fitting a generative probabilistic "grade of membership" model, we show that each tumor can be represented as a unique mixture of three hidden "topics" with blastemal, stromal, and epithelial characteristics. Likewise, cellular deconvolution allows us to represent each tumor in the continuum as a unique combination of fetal kidney-like cell states. These results highlight the relationship between Wilms' tumors and kidney development, and we anticipate that they will pave the way for more quantitative strategies for tumor stratification and classification.
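The idea of a tumor lying inside a triangle bounded by three archetypes can be illustrated with barycentric coordinates. The following is a minimal sketch, not the authors' Pareto task inference pipeline: it assumes a two-dimensional latent space and archetype positions that are already known, whereas in the study the archetypes are inferred from the data. The function name `mixture_weights` and all coordinates are hypothetical.

```python
def mixture_weights(point, archetypes):
    """Return the barycentric weights of a 2-D point relative to three vertices.

    For a point inside the triangle, the three weights are non-negative and
    sum to 1, so they can be read as the share of each archetype.
    """
    x, y = point
    (x1, y1), (x2, y2), (x3, y3) = archetypes
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("archetypes are collinear and do not span a triangle")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return (w1, w2, 1.0 - w1 - w2)


# Hypothetical archetype positions in a 2-D latent space (illustration only).
ARCHETYPES = {
    "stromal": (0.0, 0.0),
    "blastemal": (1.0, 0.0),
    "epithelial": (0.0, 1.0),
}

# A hypothetical tumor located inside the triangle.
tumor = (0.25, 0.25)
weights = dict(zip(ARCHETYPES, mixture_weights(tumor, list(ARCHETYPES.values()))))
```

Under these assumptions, a tumor sitting exactly on an archetype gets weight 1 for that archetype and 0 for the others, while tumors in the interior receive a graded mixture of all three.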
Clinical notes in electronic health records offer many opportunities for predictive text classification tasks. In the clinical domain, the interpretability of these classification models is critical for decision making. Using topic models for text classification of electronic health records allows topics to serve as features, which makes the classification more interpretable. However, selecting the most effective topic model is not trivial. In this work, we propose considerations for selecting a suitable topic model based on predictive performance and an interpretability measure for text classification. We compare 17 different topic models in terms of both interpretability and predictive performance in an inpatient violence prediction task using clinical notes. We find no correlation between interpretability and predictive performance. In addition, our results show that although no model outperforms the others on both variables, our proposed fuzzy topic modeling algorithm (FLSA-W) performs best in most settings for interpretability, whereas two state-of-the-art methods (ProdLDA and LSI) achieve the best predictive performance.
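To make the "topics as features" idea concrete, the sketch below trains a small logistic regression on per-document topic proportions, so that each learned weight corresponds to one topic and can be inspected directly. This is an illustration under stated assumptions, not the pipeline used in the work: the topic proportions and labels are hypothetical placeholder values (in practice they would come from a fitted topic model such as FLSA-W, ProdLDA, or LSI), and the classifier is a plain gradient-descent implementation chosen only to keep the example self-contained.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=1.0, epochs=5000):
    """Fit logistic regression with full-batch gradient descent.

    X holds one row of topic proportions per document; y holds 0/1 labels.
    Returns one weight per topic plus a bias term.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step along the mean gradient of the log-loss.
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Return 1 where the predicted probability is at least 0.5, else 0."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


# Hypothetical document-topic proportions (each row sums to 1) and labels.
topic_features = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.6, 0.1, 0.3],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.1, 0.3, 0.6],
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(topic_features, labels)
predictions = predict(topic_features, weights, bias)
```

Because each feature is a topic, a reader can pair every entry of `weights` with the top words of the corresponding topic to see which themes push a note toward the positive class. How interpretable that pairing is depends on the quality of the topics themselves, which is the trade-off the comparison above examines.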