Single-cell genomics is now producing an ever-increasing number of datasets that, when integrated, could provide large-scale reference atlases of tissue in health and disease. Such large-scale atlases increase the scale and generalizability of analyses and enable combining knowledge generated by individual studies. Specifically, individual studies often differ regarding cell annotation terminology and depth, with different groups specializing in different cell type compartments, often using distinct terminology. Understanding how these distinct sets of annotations are related and complement each other would mark a major step towards a consensus-based cell-type annotation reflecting the latest knowledge in the field. Whereas recent computational techniques, referred to as 'reference mapping' methods, facilitate the usage and expansion of existing reference atlases by mapping new datasets (i.e. queries) onto an atlas, a systematic approach towards harmonizing dataset-specific cell-type terminology and annotation depth is still lacking. Here, we present 'treeArches', a framework to automatically build and extend reference atlases while enriching them with an updatable hierarchy of cell-type annotations across different datasets. We demonstrate various use cases for treeArches, from automatically resolving relations between reference and query cell types to identifying unseen cell types absent in the reference, such as disease-associated cell states. We envision treeArches enabling data-driven construction of consensus atlas-level cell-type hierarchies and facilitating efficient usage of reference atlases.
Does, A.M. van der; Mahbub, R.M.; Ninaber, D.K.; Rathnayake, S.N.H.; Timens, W.; Berge, M. van den; ... ; Faiz, A. 2022
Background: Despite the well-known detrimental effects of cigarette smoke (CS), little is known about the complex gene expression dynamics in the early stages after exposure. This study aims to investigate early transcriptomic responses following CS exposure of airway epithelial cells in culture and compare these to those found in human CS exposure studies. Methods: Primary bronchial epithelial cells (PBEC) were differentiated at the air-liquid interface (ALI) and exposed to whole CS. Bulk RNA-sequencing was performed at 1 h, 4 h, and 24 h after exposure, followed by differential gene expression analysis. Results were additionally compared to data retrieved from human CS studies. Results: ALI-PBEC gene expression in response to CS was most significantly changed at 4 h after exposure. Early transcriptomic changes (1 h, 4 h post CS exposure) were related to oxidative stress, xenobiotic metabolism, higher expression of immediate early genes and pro-inflammatory pathways (i.e., Nrf2, AP-1, AhR). At 24 h, ferroptosis-associated genes were significantly increased, whereas PRKN, involved in removing dysfunctional mitochondria, was downregulated. Importantly, the transcriptome dynamics of the current study mirrored in-vivo human studies of acute CS exposure and chronic smokers, and inversely mirrored smoking cessation. Conclusion: These findings show that early after CS exposure xenobiotic metabolism and pro-inflammatory pathways were activated, followed by activation of the ferroptosis-related cell death pathway. Moreover, significant overlap between these transcriptomic responses in the in-vitro model and human in-vivo studies was found, with an early response of ciliated cells. These results provide validation for the use of ALI-PBEC cultures to study the human lung epithelial response to inhaled toxicants.
Aliee, H.; Massip, F.; Qi, C.C.; Biase, M.S. de; Nijnatten, J. van; Kersten, E.T.G.; ... ; INER-Ciencias Mexican Lung Program 2021
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
We describe a highly sensitive, quantitative, and inexpensive technique for targeted sequencing of transcript cohorts or genomic regions from thousands of bulk samples or single cells in parallel. Multiplexing is based on a simple method that produces extensive matrices of diverse DNA barcodes attached to invariant primer sets, which are all pre-selected and optimized in silico. By applying the matrices in a novel workflow named Barcode Assembly foR Targeted Sequencing (BART-Seq), we analyze developmental states of thousands of single human pluripotent stem cells, either in different maintenance media or upon Wnt/beta-catenin pathway activation, which identifies the mechanisms of differentiation induction. Moreover, we apply BART-Seq to the genetic screening of breast cancer patients and identify BRCA mutations with very high precision. The processing of thousands of samples and dynamic range measurements that outperform global transcriptomics techniques make BART-Seq the first targeted sequencing technique suitable for numerous research applications.
Benedetti, E.; Pucic-Bakovic, M.; Keser, T.; Wahl, A.; Hassinen, A.; Yang, J.Y.; ... ; Krumsiek, J. 2018
To elucidate the molecular basis of BMP4-induced differentiation of human pluripotent stem cells (PSCs) toward progeny with trophectoderm characteristics, we produced transcriptome, epigenome H3K4me3, H3K27me3, and CpG methylation maps of trophoblast progenitors, purified using the surface marker APA. We combined them with the temporally resolved transcriptome of the preprogenitor phase and of single APA+ cells. This revealed a circuit of bivalent TFAP2A, TFAP2C, GATA2, and GATA3 transcription factors, coined collectively the "trophectoderm four" (TEtra), which are also present in human trophectoderm in vivo. At the onset of differentiation, the TEtra factors occupy multiple sites in epigenetically inactive placental genes and in OCT4. Functional manipulation of GATA3 and TFAP2A indicated that they directly couple trophoblast-specific gene induction with suppression of pluripotency. In accordance, knocking down GATA3 in primate embryos resulted in a failure to form trophectoderm. The discovery of the TEtra circuit indicates how trophectoderm commitment is regulated in human embryogenesis.