Immune signature drives leukemia escape and relapse after hematopoietic cell transplantation

Transplantation of hematopoietic cells from a healthy individual (allogeneic hematopoietic cell transplantation (allo-HCT)) demonstrates that adoptive immunotherapy can cure blood cancers: still, post-transplantation relapses remain frequent. To explain their drivers, we analyzed the genomic and gene expression profiles of acute myeloid leukemia (AML) blasts purified from patients at serial time-points during their disease history. We identified a transcriptional signature specific for post-transplantation relapses and highly enriched in immune-related processes, including T cell costimulation and antigen presentation. In two independent patient cohorts we confirmed the deregulation of multiple costimulatory ligands on AML blasts at post-transplantation relapse (PD-L1, B7-H3, CD80, PVRL2), mirrored by concomitant changes in circulating donor T cells. Likewise, we documented the frequent loss of surface expression of HLA-DR, -DQ and -DP on leukemia cells, due to downregulation of the HLA class II regulator CIITA. We show that loss of HLA class II expression and upregulation of inhibitory checkpoint molecules represent alternative modalities to abolish AML recognition from donor-derived T cells, and can be counteracted by interferon-γ or checkpoint blockade, respectively. Our results demonstrate that the deregulation of pathways involved in T cell-mediated allorecognition is a distinctive feature and driver of AML relapses after allo-HCT, which can be rapidly translated into personalized therapies. Post-transplantation relapse in acute myeloid leukemia patients without genomic loss of HLA is driven by transcriptional alterations in antigen presentation and T cell costimulation genes.


The efficacy of allo-HCT in curing hematological malignancies strongly relies on transferring from the donor to the patient an immune system that is capable of eliminating residual tumor cells 1,2 . Still, relapses are frequent and, owing to the lack of effective salvage therapies, represent the leading cause of death in transplanted patients 3 .
Previous studies have documented that in partially incompatible allo-HCTs, genomic loss of the mismatched HLAs represents a frequent mechanism by which leukemia evades recognition from donor T cells and outgrows into clinically evident relapse [4][5][6][7] .
Here, we comprehensively assessed the genomic and transcriptional changes occurring at post-transplantation relapse in two independent cohorts of patients without genomic loss of HLA, documenting that immune-related changes are also prevalent in these patients and describing more patterns of immune evasion leading to recurrence.
We started by analyzing a discovery cohort of 40 adult patients transplanted for AML and for whom samples had been collected at the time of diagnosis, relapse after sole chemotherapy (available in three of the patients) and relapse after allo-HCT (Supplementary Tables 1 and 2).

NATuRE MEDICINE
Genomic changes were assessed by single nucleotide polymorphism (SNP) arrays, and transcriptional changes by genome-wide microarrays (Fig. 1a).
SNP profiling evidenced the appearance of new chromosomal insertions or deletions in 7 out of 12 post-transplantation relapses (Extended Data Fig. 1 and Supplementary Table 3) and of new copy-neutral loss-of-heterozygosity (CN-LOH) events in 2 of 12, always involving chromosome 13 and resulting in doubled allele burden of FLT3-internal tandem duplications (-ITD). FLT3-ITD allele burden also increased at post-transplantation relapse in three more patients by relative expansion of the mutated subclones (Supplementary Table 3).
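The doubling of allele burden after CN-LOH follows from simple copy counting, which can be sketched as below. This is an illustrative model only, not the authors' calculation: the function name, the 80% clone fraction and the assumption that all non-leukemic cells carry two wild-type copies are hypothetical.

```python
def mutant_allele_fraction(clone_fraction, mutant_copies, total_copies=2):
    """Fraction of mutant alleles in a sample that mixes a leukemic clone
    with normal cells (assumed diploid and wild type)."""
    mutant = clone_fraction * mutant_copies
    total = clone_fraction * total_copies + (1 - clone_fraction) * 2
    return mutant / total

# Hypothetical sample in which 80% of cells belong to the FLT3-ITD clone.
heterozygous = mutant_allele_fraction(0.8, mutant_copies=1)  # one of two copies mutated
after_cn_loh = mutant_allele_fraction(0.8, mutant_copies=2)  # both copies mutated, copy number unchanged
fold_increase = after_cn_loh / heterozygous
```

Because CN-LOH replaces the wild-type copy with a second mutant copy without changing the total copy number, the ratio between the two values is 2 in this model regardless of the clone fraction chosen.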
Our observation of frequent de novo genomic alterations at post-transplantation relapse, in line with previous reports 5 , indicates that AML clonal evolution also continues after allo-HCT. However, the macro-alterations we identified here are clustered in well-known hotspots for AML both at diagnosis 8 and at relapse after chemotherapy 9 , indicating that they are unlikely to be linked to the immunological effects of allo-HCT, but rather to the clonal dynamics characteristic of tumors, canalized by the selective bottleneck imposed by the transplant conditioning regimen.
We next pairwise compared the gene expression profiles of AML blasts purified at diagnosis and at post-transplantation relapse, identifying a signature of 110 differentially expressed genes (DEGs) (Supplementary Table 4). Gene ontology analysis revealed that this signature was significantly enriched in immune-related genes (Fig. 1b). In-depth analysis of the three cases for which samples after sole chemotherapy were available evidenced that the immune-related changes were specific for post-transplantation relapses (Fig. 1c-e), possibly imprinted by the graft-versus-leukemia (GvL) effect.
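The figure legend states that enrichment significance was calculated by a two-sided Fisher's exact test. A minimal sketch of that test on a 2x2 table is given below; the function name and the gene counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    a: signature genes annotated to the term
    b: signature genes not annotated to the term
    c: remaining genes annotated to the term
    d: remaining genes not annotated to the term
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x annotated signature genes
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)

# Hypothetical counts: 12 of 110 signature genes carry the annotation,
# against 300 of the remaining 19,890 genes on the array.
p_value = fisher_exact_two_sided(12, 98, 300, 19590)
```

The small tolerance in the comparison guards against floating-point ties between tables of equal probability.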
To overcome the redundancy of gene ontology categorization, we assessed the semantic distance between the significantly deregulated biological processes using the GOSemSim Bioconductor R package 10 and identified two deregulated 'macro-clusters' encompassing genes linked to T cell costimulation and to antigen processing and presentation via HLA class II molecules (Fig. 1f). This result was confirmed by an independent bioinformatic analysis with the ClueGO package from Cytoscape 11 (Extended Data Fig. 2).
Given the deregulation of the T cell costimulation process highlighted in the previous analyses, we retrieved from the original dataset the expression levels of 32 genes known to be relevant in conveying activating or inhibitory signals to T cells and used them to create a heatmap representing the fold change in the expression of these genes between leukemia at diagnosis and its counterparts at relapse after chemotherapy or allo-HCT (Fig. 2a). This allowed us to appreciate the downregulation in post-transplantation relapses of multiple activatory ligands and adhesion molecules, including CD11A/LFA-1, and largely unchanged expression of inhibitory ligands, except for a modest increase in the expression of B7-H3.
However, since most of these genes are poorly covered by expression arrays, we further characterized by immunophenotyping the AML blasts and T cells collected before and after allo-HCT from 33 discovery cohort patients (Supplementary Table 2).
After gating on leukemia blasts (Supplementary Fig. 1), we assessed the percentage of positive cells (Fig. 2b) and the relative fluorescence intensity (RFI; Fig. 2c) of 13 T cell ligands. We confirmed the changes in B7-H3 and CD11A levels detected by gene expression analysis, and documented in addition upregulation in relapsed leukemia of PD-L1 (significant in terms of percentage but not of RFI), PVRL2 and CD80.
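According to the figure legend, these paired comparisons used a two-sided Wilcoxon matched-pairs signed rank test. The sketch below computes the statistic with an exact null distribution; whether the original analysis used an exact or an approximate P value is not stated, so the exact method is an assumption, and the function name and example values are hypothetical.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W+, two-sided exact P) for paired samples; zero differences are dropped."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n  # ranks multiplied by 2, so averaged (half-integer) ranks stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # twice the average of ranks i+1 .. j+1
        i = j + 1
    w_plus2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    total2 = sum(ranks2)
    # counts[s] = number of sign assignments whose doubled positive-rank sum equals s
    counts = [0] * (total2 + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total2, r - 1, -1):
            counts[s] += counts[s - r]
    n_assignments = 2 ** n
    p_low = sum(counts[: w_plus2 + 1]) / n_assignments
    p_high = sum(counts[w_plus2:]) / n_assignments
    return w_plus2 / 2, min(1.0, 2 * min(p_low, p_high))

# Hypothetical percentages of marker-positive blasts in six patients
pre_transplant = [10, 20, 30, 40, 50, 60]
relapse = [12, 25, 33, 48, 51, 70]
w_plus, p_value = wilcoxon_signed_rank(pre_transplant, relapse)
```

With six pairs that all change in the same direction, only one of the 64 possible sign assignments is as extreme on each side, which sets the smallest two-sided P value attainable at this sample size.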
Notably, the changes we observed in the expression profile of leukemia cells at post-transplantation relapse were mirrored by corresponding alterations in T cells (Fig. 2d). In particular, we observed that the percentage of T cells expressing PD-1 was significantly higher in AML patients before allo-HCT than in healthy controls, was similarly high in transplanted patients in remission and rose further at post-transplantation relapse. The percentage of T cells expressing PD-1 at post-transplantation relapse correlated significantly with the changes observed in the expression of PD-L1 between pre- and post-transplant analysis (Fig. 2e), hinting that upregulation of the ligand in leukemia blasts might have induced an exhausted phenotype in the corresponding T cells. For all the other receptors analyzed, except CTLA4, we documented significant differences between healthy individuals and patients at diagnosis, but either no change between the T cells at diagnosis and at relapse (for TIGIT, ICOS, OX40 and ICAM-1) or changes that were similarly observed in the lymphocytes of patients in remission after transplant (for DNAM-1 and CD28), indicating a principal role of the post-transplantation immune environment in driving their altered expression.
Taken together, these data demonstrate that the costimulatory interface between T cells and leukemia changes significantly after allo-HCT, with loss of costimulatory interactions (CD28/CD80, ICAM-1/CD11A) and enforcement of inhibitory ones (PD-1/PD-L1).
To address the functional and translational relevance of these findings, we studied in more detail the case of a patient who relapsed 200 days after HCT with PD-L1+ leukemia (UPN 24). We observed that during the post-transplantation follow-up the increase in the expression of PD-1 on donor-derived T cells paralleled the rise of minimal residual disease markers and anticipated clinical relapse (Fig. 2f). In ex vivo coculture experiments, addition of an anti-PD-L1 blocking antibody increased both the proliferation (Fig. 2g) and the release of IFN-γ (Fig. 2h) of donor-derived T cells against the patient leukemia blasts. Although this single patient observation should be considered cautiously, it suggests that at least in some patients with deregulation of the PD-1/PD-L1 axis, checkpoint blockade might restore a proficient GvL effect against the relapsed disease.

Fig. 1 | Immune-related changes in leukemia relapsing after allo-HCT. a, Outline of the experimental work-flow: peripheral blood and bone marrow samples were longitudinally collected from AML patients at the time of disease diagnosis, relapse after sole chemotherapy and relapse after allo-HCT. Leukemia blasts were purified from each sample by FACS-sorting and then subjected to genomic and transcriptional profile analysis. b, Histogram outlining the ten most significantly deregulated biological processes (classified by gene ontology terms) identified from the pairwise comparison of AML blasts collected and purified from discovery cohort patients at disease diagnosis and at relapse after allo-HCT (n = 9). The length of each bar is proportional to the significance of enrichment, calculated by a two-sided Fisher's exact test, with P values < 0.05 to the right of the dashed blue line. Dark bars denote immune-related biological processes. c-e, Histograms outlining biological processes identified as significantly deregulated from all pairwise comparisons performed for cases in which leukemia samples were available at disease diagnosis, at relapse after sole chemotherapy and at relapse after allo-HCT: a summary of the results obtained from the comparison between relapse after allo-HCT and disease at diagnosis (c), a summary of the results obtained from the comparison between relapse after allo-HCT and relapse after chemotherapy (d) and a summary of the results obtained from the comparison between relapse after chemotherapy and disease at diagnosis (e). The height of each bar is proportional to the significance of enrichment, calculated by a two-sided Fisher's exact test, with P < 0.05 above the dashed blue line. f, Heatmap representing the semantic similarity between all gene ontology terms identified as significantly deregulated (P < 0.05) from the pairwise comparison of AML blasts collected at disease diagnosis and at relapse after allo-HCT from discovery cohort patients (n = 9). Red indicates high similarity in gene content between two gene ontologies, blue indicates low similarity. Gene ontology terms are clustered according to their semantic distance, thus allowing the identification of significantly deregulated 'macroprocesses' (yellow/red squares in the heatmap), including one encompassing gene ontology terms linked to T cell costimulation (in green) and one encompassing gene ontology terms linked to antigen processing and presentation via HLA class II (in red).

Letters
We next extracted from our dataset the expression of transcripts involved in peptide proteolysis and HLA presentation, selected from the Reactome database 12 . This analysis revealed the frequent and significant downregulation at post-transplantation relapse of almost all HLA class II transcripts and of their known regulator CIITA (Fig. 3a), which was confirmed by locus-specific quantitative PCR (Fig. 3b) and not explained by genomic alterations detectable by SNP arrays (Extended Data Fig. 1 and Supplementary Table 3). Comparative immunophenotypic analysis of leukemia blasts collected before and after allo-HCT confirmed loss of HLA-DR and -DP cell surface expression in 7/33 relapses (Fig. 3c,d). In all cases analyzed, expression of HLA class I remained high at post-transplantation relapse (Extended Data Fig. 3). We showed by cytotoxicity assay (Fig. 3e), IFN-γ ELISpot (Fig. 3f) and antigen-specific activation assay (Fig. 3g) that T cells collected from a patient who experienced relapse with loss of HLA class II expression (UPN 10) responded to the leukemia at diagnosis, but not to its relapsed counterpart.
Taken together, these data indicate that not only genomic loss of a mismatched HLA haplotype, but also the transcriptional silencing of HLA class II molecules occurs frequently in leukemia relapses after allo-HCT, abrogating leukemia recognition by donor-derived T cells.

Fig. 2 | … for molecules known to exert an inhibitory effect on T cells (purple markers), activating them (blue markers) or able to mediate both effects, depending on the cognate receptor expressed by T cells (yellow markers). Transcript levels were assessed by microarrays, comparing leukemia at diagnosis with relapses after chemotherapy (CT, n = 3) or allo-HCT (allo-HCT, n = 9). Red and green indicate transcript upregulation and downregulation at relapse, respectively. Bars on the right side of the heatmap summarize average fold changes at post-transplantation relapse. b,c, Percentage of cell surface expression (b) and relative fluorescence intensity (RFI) (c) of inhibitory (in purple), activatory (in blue) or dual-functional (in yellow) ligands, assessed by immunophenotypic analysis of leukemia cells pairwise collected from discovery cohort patients before allo-HCT (red dots) and at post-transplantation relapse (blue dots) (n = 33). Shown are mean ± s.e.m.; P values were calculated by a two-sided Wilcoxon matched-pairs signed rank test at a 95% confidence interval (CI). d, Expression on the T cell surface of inhibitory (in purple) or activatory (in blue) receptors, assessed by immunophenotypic analysis of peripheral blood samples collected from healthy volunteers (in white, n = 10), from discovery cohort patients before allo-HCT (red dots) and at post-transplantation relapse (blue dots) (n = 33), and from patients in complete remission 2 months after allo-HCT (yellow dots, n = 10). Expression of CTLA4 is tested by intracellular staining and shown on CD4+ T cells. Shown are mean ± s.e.m.; P values were calculated by a two-sided unpaired or paired t-test at 95% CI, as appropriate. e, Correlation between the expression of PD-1 on T cells at post-transplantation relapse (x axis) and the change in expression of PD-L1 on AML blasts between diagnosis and relapse (y axis) in 25 evaluable sample pairs. Shown are results of two-sided Pearson correlation analysis, with linear regression line and 95% CI. f, Time-course of the expression of PD-1 on unique patient number (UPN) 24 T cells over time after allo-HCT (black circles) in relation to the levels of the NPM1 mutA and WT1 transcripts in the patient bone marrow (white circles and squares, respectively). The black arrow indicates the time of the hematological relapse. Each patient-derived sample has been analyzed once, with three technical replicates for the molecular analyses. g,h, T cells collected from UPN 24, 150 days after allo-HCT and 50 days before relapse (38% positive for PD-1 at the time of sampling) were tested for their ability to proliferate (g, showing the percentage of vital dye-diluting cells in each condition tested) and release IFN-γ (h, showing for each condition the number of IFN-γ spots detected from one out of three technical replicates) in response to patient PHA-stimulated lymphocytes (patient PHA), leukemia blasts collected from the same patient at post-transplantation relapse (41% positive for PD-L1, R-AML), the same blasts pre-incubated with an anti-PD-L1 blocking antibody (R-AML + aPD-L1) or the same blasts pre-incubated with a control isotype antibody (R-AML + Iso). Proliferation was assessed by dilution of the CellTrace Violet membrane dye, cytokine release by IFN-γ ELISpot assay. Shown are results from a single experiment.
On the basis of the observation that in a patient-derived xenograft model HLA class II expression could be recovered on cross-recognition of minor histocompatibility antigens presented by leukemia blasts via HLA class I molecules and/or by xeno-reactivity against murine tissues (Extended Data Fig. 4), primary blasts from UPN 17 at post-transplantation relapse were cultured in vitro in the presence or absence of the same cytokines that were analyzed in the mouse sera, documenting that only IFN-γ was able to re-induce the surface expression of HLA-DR (Fig. 3h). Also, in the other cases of relapse with loss of HLA class II expression, exposure of leukemia cells to IFN-γ increased the expression of HLA class II (Fig. 3i-k) and rescued effective recognition of relapsed leukemia cells by donor-derived T cells (Fig. 3l).
By analyzing the expression of immune markers of interest in leukemia relapses after sole chemotherapy (Extended Data Fig. 5) and in non-malignant counterparts of AML (Supplementary Fig. 2 and Extended Data Fig. 6) we could demonstrate that the changes in T cell costimulation and antigen presentation that we documented occur only after allo-HCT, and only in leukemia cells.
To confirm the robustness of our findings, we analyzed by RNA-seq and immunophenotypic profiling a validation set of 36 pre- and post-transplantation AML samples collected from seven different transplantation centers (Supplementary Tables 1 and 5). Despite the many differences between the two cohorts in terms of type of transplant, disease status and graft-versus-host disease (GvHD) prophylaxis, the 110-gene signature derived from our discovery series was also highly consistent in this validation set (Fig. 4a), as were the patterns and relative frequency of the transcriptional changes documented in genes involved in T cell costimulation and HLA class II presentation (Fig. 4b and Extended Data Fig. 7).
To analyze the reciprocal interactions of the two newly identified modalities of relapse, the immunophenotypic datasets originated from the two cohorts were clustered on the basis of profile similarity (Fig. 4c,d). This allowed us to clearly distinguish in both cohorts a subset of patients with downregulation of HLA class II molecules (top clusters, 45% of cases in the discovery set and 39% in the validation set), and a second macro-cluster characterized by minor changes (middle sub-clusters) or upregulation (lower sub-clusters) of HLA class II molecules, accompanied by more evident upregulation of inhibitory markers, including PD-L1. Concordantly, high-dimensional analysis identified relapse-specific clusters with lower HLA-DR expression and increased PD-L1 expression as compared to their diagnosis-specific counterparts (Extended Data Fig. 8). Correlation analysis evidenced that inhibitory ligands were mostly upregulated in cases with conserved or increased expression of HLA-DR and -DP (Fig. 4e,f).
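The correlation described at the end of this paragraph is reported in the figure legend as a Pearson analysis of average Z scores. A minimal sketch of that kind of calculation follows. How the Z scores were computed in the study is not specified, so standardizing each marker across patients with the sample standard deviation is an assumption; the function names, the marker grouping and the values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def z_scores(values):
    """Standardize one marker's diagnosis-to-relapse changes across patients."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def average_z(changes, markers):
    """Per-patient average of the Z-scored changes of the selected markers."""
    per_marker = [z_scores(changes[name]) for name in markers]
    return [mean(column) for column in zip(*per_marker)]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical changes in % positive blasts (relapse minus diagnosis), five patients
changes = {
    "HLA-DR": [-60, -45, 5, 10, 20],
    "HLA-DP": [-50, -40, 0, 15, 10],
    "PD-L1": [0, 2, 15, 20, 30],
    "B7-H3": [1, -2, 10, 12, 25],
}
hla_class_ii = average_z(changes, ["HLA-DR", "HLA-DP"])
inhibitory = average_z(changes, ["PD-L1", "B7-H3"])
r = pearson_r(hla_class_ii, inhibitory)
```

Averaging standardized values gives each marker equal weight, so a ligand expressed on most blasts does not dominate one expressed on few.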
We finally integrated our results regarding immune-related changes with known clinical and immunogenetic variables. The distribution of patients among relapse modalities did not correlate with most of the features analyzed, including cytogenetics, leukemia driver mutations and donor-recipient HLA matching. The only variables that showed a significant correlation with HLA class II downregulation at relapse were the use of a peripheral blood stem cell graft and the dose of infused T cells, consistent in each patient cohort and reaching statistical significance only when the two were merged (Extended Data Fig. 9).
The results from the present study complement and reinforce previous work on genomic HLA loss 4,6,7 in supporting the hypothesis that post-transplantation relapses might frequently represent the end-result of mechanisms enacted by leukemia cells to evade immune control, and that immune pressure might substantially deviate the trajectory of leukemia clonal evolution 13,14 .

Fig. 3 | … b, mRNA expression levels of HLA-DRB, HLA-DPB1 and CIITA measured by locus-specific quantitative PCR in leukemia blasts purified from discovery cohort patients at diagnosis (red dots) and at post-transplantation relapse (blue dots) (n = 7). Shown are mean ± s.e.m.; P values were calculated by a two-sided Wilcoxon matched-pairs signed rank test at 95% CI. c,d, Expression on the cell surface of leukemia blasts of HLA-DR and HLA-DP molecules, assessed by immunophenotypic analysis in samples pairwise collected from discovery cohort patients before allo-HCT (red dots) and at post-transplantation relapse (blue dots) (n = 33) (c). Shown are mean ± s.e.m.; P values calculated by a two-sided Wilcoxon matched-pairs signed rank test at 95% CI. Green lines link pre- and post-transplantation assessments in the seven patients for whom histogram plots are shown in d. Gray histograms in d represent fluorescence-minus-one (FMO) controls of AML blasts at diagnosis. For each histogram, the percentage displayed refers to the comparison with the relevant FMO control. All of these samples have been analyzed at least twice with similar results.
e-g, T cells collected from UPN 10, 310 days after allo-HCT and 30 days before relapse were stimulated with leukemia blasts collected from the same patient at diagnosis, and tested by standard 4-h chromium release cytotoxicity assay (e, showing one out of three technical triplicates from one out of two independent experiments), by IFN-γ ELISpot assay (f, showing for each condition the number of IFN-γ spots detected from one out of two or three technical replicates from one out of two independent experiments) and by CD137/4-1BB upregulation assay (g, showing the percentages of CD137+ CD4 T cells in each condition tested, calculated using as reference for gating the spontaneous expression in CD4 T cells cultured in medium alone; shown are results from one out of two independent experiments). Targets for the cytotoxicity assay were patient leukemia cells collected at diagnosis (red line) and at post-transplantation relapse (blue line). Targets for IFN-γ ELISpot and CD137/4-1BB assays were donor autologous PHA-stimulated lymphocytes (auto), the patient leukemia cells at diagnosis incubated with medium alone (D-AML), with anti-HLA-DR and anti-HLA-DP blocking antibodies (D-AML+aDR-DP) or with an isotype control antibody (D-AML+Iso), and the patient leukemia cells at post-transplantation relapse (R-AML). h, Leukemia blasts collected from UPN 17 at post-transplantation relapse were tested for recovery of HLA-DR expression after 72 h of culture in the presence of medium alone or supplemented with human cytokines. The gray histogram represents the FMO control of the sample cultured in medium alone. Shown are results from a single experiment. i, Fold change in HLA class II transcripts on IFN-γ treatment of AML post-transplantation relapses from UPN 14 (triangle), 10 (circle), 17 (square) and 37 (diamond). Leukemia cells were incubated for 3 days with IFN-γ or medium alone.
Numbers above graphs indicate P obtained from two-sided Wilcoxon matched-pairs signed rank test at 95% CI. j,k, Dot plots represent the percentage of leukemia blasts expressing HLA-DR (j) and HLA-DP (k), assessed by immunophenotypic analysis of samples collected from five patients at diagnosis (D-AML) and at relapse after allo-HCT (R-AML). Relapse samples were also tested after 7 days of culture in the presence or absence of human IFN-γ. Shown are mean ± s.d.; P values were calculated by one-way analysis of variance (ANOVA) with Bonferroni correction posttest. l, T cells collected from UPN 10, 310 days after allo-HCT and 30 days before relapse were stimulated with patient leukemia blasts collected from the same patient at time of diagnosis, and tested for expression of the activation marker CD137/4-1BB on the cell surface of CD4 + cells on rechallenge with patient leukemia cells at diagnosis (D-AML), patient leukemia cells at relapse (R-AML) or the same relapsed leukemia cultured for 7 days in the presence of human IFN-γ (R-AML+IFN-γ). Shown are the percentages of CD137 + CD4 T cells in each condition tested, calculated using as reference for gating the spontaneous expression of CD137 in CD4 + T cells cultured in medium alone. Shown are results from one out of two independent experiments.

NATuRE MEDICINE
Whereas the previously described mechanism of HLA loss relied on large-scale genomic alterations 4 , here we observed transcriptional changes that were not explained by genomics, indicating they might have a primarily epigenetic origin. This intriguing hypothesis is supported by recent studies highlighting the extents of AML epigenetic clonal evolution 15 , by the preclinical evidence of immune-related effects of epigenetic drugs 16,17 and by the promising results achieved by these same therapies against post-transplantation relapse 18,19 .
Several previous reports have shown that AML blasts can express multiple inhibitory ligands to dampen T cell recognition [20][21][22][23] . Here we show that this feature becomes even more prevalent at relapse after allo-HCT. Despite some very promising preliminary results 24,25 , the use of checkpoint blockade in the context of allo-HCT has been restrained by concerns regarding the risk of inducing severe GvHD 26,27 . Our results indicate that in selected patients checkpoint blockade may be highly effective in restoring the GvL effect, although the multiplicity of different patterns of co-expression of inhibitory ligands and the observation that most changes are subtle quantitative deregulations indicate that combinatorial therapy will probably be required to achieve clinically meaningful results.

Fig. 4 | … n = 15). Size of the dots is proportional to the significance of the deregulation in the validation dataset. Shown are the results of two-sided Pearson correlation analysis. b, Scatter plot displaying the fold changes in the levels of transcripts from genes known to be involved in T cell costimulation (in cyan) or HLA class II presentation (in light orange) detected by microarray analysis of paired diagnosis-relapse samples from the discovery cohort (x axis, n = 9) and by RNA-seq analysis of paired diagnosis-relapse samples from the validation cohort (y axis, n = 15). Size of the dots is proportional to the significance of the deregulation in the validation dataset. Shown are the results of two-sided Pearson correlation analysis. c,d, The datasets generated from the immunophenotypic analysis of the full panel of antigen presentation and T cell costimulation molecules in the discovery cohort (n = 33, c) and in the validation cohort (n = 36, d) were employed to generate two heatmaps of fold changes in the expression of each marker. Unsupervised clustering was then performed to group together patients with similar patterns of phenotypical changes. By ANOVA analysis, HLA-DR and -DP resulted as the most significant contributors in determining the clustering. Red indicates surface marker upregulation at relapse, green indicates surface marker downregulation at relapse. Light gray squares indicate markers not tested in the relevant patient. e,f, Scatter plots displaying the correlation between changes in the surface expression of HLA class II molecules from diagnosis to relapse (on the x axis, expressed as average Z score) and changes in the overall expression of inhibitory ligands in the same leukemia sample pair (on the y axis, expressed as average Z score) in the discovery cohort (n = 33, e) and in the validation cohort (n = 36, f). Colors of the dots relate to the three patient clusters indicated in c and d. Shown are results of two-sided Pearson correlation analysis. The blue line represents the best fit regression line and the light blue area indicates the 95% CI.
The second relapse modality identified in this study depends on the downregulation of HLA class II molecules from the surface of leukemia blasts. Analogously to genomic HLA loss 6 , this mechanism also appeared to be directly correlated with the dose of T cells infused with the graft. However, unlike its genomic counterpart, loss of HLA class II expression did not correlate with the number of donor-recipient HLA incompatibilities and also occurred after HLA-matched HCTs, where it may favor immune evasion by dramatically narrowing the antigenic repertoire presented to donor T cells. In the single case in which functional ex vivo validation was possible, loss of HLA class II molecules abrogated recognition of leukemia by donor T cells, supporting previous studies on the role of incompatible HLA class II molecules as preferential targets of GvL 28,29 , and pointing to a central role of CD4+ T cells in post-transplantation leukemia immunosurveillance.
Of note, the mechanisms we described sit on two opposite poles of interferon-mediated responses: while IFN-γ can rescue the surface expression of HLA class II molecules on relapsed leukemia, it is also known to promote the expression of PD-L1 and other inhibitory ligands [30][31][32] . This dichotomy has a translational impact, since promoting IFN-γ systemic release, such as by inducing non-severe chronic GvHD, might be beneficial to patients in whom relapse depends on the loss of HLA expression and detrimental to those in whom relapse is driven by the enhancement of inhibitory axes.
Including cases with genomic loss of HLA, we can now recognize a defined immune pattern in more than two-thirds of relapses, each targetable by a precise salvage therapy: re-transplantation from a different donor for patients with genomic HLA loss 6,33,34 , checkpoint blockade for those with upregulation of inhibitory ligands and establishment of mild chronic GvHD for those with loss of HLA class II expression. Recent data have shown that, even in cases in which relapse depends on the FLT3-ITD oncogene amplification, it is possible to exploit the transplanted immune system in a tailored approach 35 . Whereas each of these strategies may yield mixed results when employed without a clear patient selection criterion, their efficacy can be substantially improved when it is based on a specific biological rationale, fulfilling the logic of personalized medicine.

Online content
Any methods, further references, Nature Research reporting summaries, source data, statements of data availability and associated accession codes are available at https://doi.org/10.1038/s41591-019-0400-z.

Study design: patient and transplant characteristics.
In this retrospective study, on the basis of material availability, we selected a discovery cohort (n = 40) and a validation cohort (n = 36) of patients with a diagnosis of de novo or secondary AML who experienced non-HLA loss disease relapse after allo-HCT, and for whom paired pre- and post-transplant viable leukemic samples had been collected with specific written informed consent, in agreement with the Declaration of Helsinki. Genomic DNA and total RNA were extracted using the Qiagen Mini Blood Kit (Qiagen) and the Trizol reagent (Invitrogen, used for samples analyzed by microarrays) or RNeasy Plus Mini or Micro Kits (Qiagen, used for samples analyzed by RNA-seq), respectively, following the manufacturer's instructions. Nucleic acid quantification was performed using a Nanodrop-2000c spectrophotometer (Thermo Fisher Scientific), and quality control was performed using the Agilent Bioanalyzer technology or the RNA Screen Tape System (both from Agilent).

SNP array and data analysis.
Genomic profiling was performed using the Illumina Human 660W-Quad BeadChip array (Illumina), capable of assessing over 550,000 tag SNPs plus 100,000 further markers (targeting regions of common copy number variations). A total of 200 ng per sample of high-quality DNA was amplified, tagged and hybridized according to the manufacturer's protocol. The array slides were scanned on an iScan Reader (Illumina). The estimations of log₂ ratio (log₂ R) and B-allele frequency (BAF) were generated using default settings of the GT module of GenomeStudio v.9.4 (Illumina), taking into consideration the HapMap control set provided by the manufacturer. On the basis of the log₂ R and BAF values, Nexus Copy Number 5.0 (BioDiscovery) with default settings was used for calling copy number variation and/or LOH events, with the UCSC build hg18 (Human Mar. 2006, NCBI36/hg18) assembly as reference. Raw data tables from GenomeStudio were then imported and processed by the DNAcopy R package for segmentation analysis 36. The log₂ R, BAF and segmentation data were converted into BED graph and segment files, and previously identified alterations were confirmed by visual inspection on IGV v.2.4.8 (ref. 37). Supplementary Table 3 lists all duplications, deletions and CN-LOHs larger than 0.1 Mb detected by our analysis, together with estimates of clonality of the alterations, through a method adapted from Paulsson et al. 38.
Gene expression profile array and data analysis. Gene expression profiling for samples belonging to the discovery set was performed using the Illumina HumanHT-12 v.3.0 or v.4.0 Expression BeadChip arrays (Illumina), covering up to 35,000 annotated genes with more than 48,000 probes derived from the National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq; Build v.36.2, Release 22) and the UniGene (Build 1999) databases. A first step of RNA amplification generating biotinylated, amplified RNA for hybridization was performed with the Illumina TotalPrep RNA Amplification Kit (Life Technologies), according to previously published protocols and to the manufacturer's recommendations 39. Briefly, up to 200 ng of total RNA was reverse transcribed into complementary DNA with T7 Oligo(dT) primers, and the double-stranded cDNA was then in vitro transcribed to synthesize cRNA using a biotin-NTP mix. The resulting cRNA was quantified on a Nanodrop-2000c spectrophotometer (Thermo Fisher Scientific), and the quality was assessed on an Agilent Bioanalyzer chip (Agilent). A total of 250 ng of cRNA was then hybridized to the BeadChip at 58 °C overnight and the fluorescent signal was developed with streptavidin-Cy3, followed by quantitative detection of fluorescence emission by the Illumina iScan array scanner and computation by the Illumina GenomeStudio software (both from Illumina). Gene expression data were normalized using the quantile algorithm implemented in the Illumina GenomeStudio software.
Among the 30,000 genes assessed in the microarray, a matrix of expressed genes was generated by selecting all transcripts with intensity values that differed significantly from background (detection P < 0.01) in at least one sample of the entire series. The LIMMA Bioconductor package was used to extract the differentially expressed genes (DEGs), considering a factorial design model and pairwise comparisons 40. A post-test was used to select putative DEGs in each contrast, under the Benjamini-Hochberg multiple comparison correction. Genes with an adjusted P < 0.1 were considered differentially expressed.
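The selection rule described above (Benjamini-Hochberg correction followed by an adjusted P < 0.1 cut-off) can be illustrated with a minimal standard-library Python sketch. This is not the LIMMA implementation used in the study: the function names `benjamini_hochberg` and `select_degs` and the gene labels are hypothetical, and the raw P values are invented purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(gene_p_values, threshold=0.1):
    """Keep genes whose adjusted P value is below the threshold (0.1 in the text)."""
    genes = list(gene_p_values)
    adjusted = benjamini_hochberg([gene_p_values[g] for g in genes])
    return {g: q for g, q in zip(genes, adjusted) if q < threshold}


# Hypothetical raw P values for four placeholder genes (not study data).
raw = {"gene_a": 0.01, "gene_b": 0.04, "gene_c": 0.03, "gene_d": 0.5}
print(select_degs(raw))
```

Walking the ranks from the largest P value downwards is one common way to enforce that a gene with a smaller raw P value never receives a larger adjusted value than a gene ranked after it.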
A gene ontology enrichment analysis for biological processes was performed with the DAVID v.6.7 online interface 41,42 , using default parameters and considering as background all the genes expressed in our dataset and as genes of interest those differentially expressed in each specific contrast. For each comparison, P < 0.05 was used to define a significantly enriched biological process. The GOSemSim package release 2.12 (ref. 10 ) and the ClueGO package from Cytoscape 11 were used to cluster together related gene ontology biological processes.
RNA-seq and data analysis. Gene expression profiling for samples belonging to the validation set was performed using RNA-seq, starting from 300 ng of total RNA and using the TruSeq Stranded mRNA library preparation kit (Illumina) in accordance with the low-throughput protocol. After PCR enrichment (15 cycles) and purification of adapter-ligated fragments, the concentration and length of DNA fragments were measured using the D1000 Screen Tape System (Agilent), obtaining a median insert size of 311 nucleotides. Then, RNA-seq libraries were sequenced using the Illumina Next-Seq 500 high platform to obtain a minimum of 30 × 10⁶ paired-end reads per sample.
To quantify gene expression levels, read tags were pseudoaligned to GENCODE v.28 transcripts 43 using kallisto v.0.44.0 (ref. 44) (parameters: -t 8 --single --rf-stranded -l 200 -s 20). Abundances were summarized to genes using the tximport package 45 and analyzed using edgeR 46 with a design matrix including sample identity, disease condition and coordinates of the first two components of multidimensional scaling of the count matrix (~sample_id + condition + X + Y). Fold change was then calculated on the 'condition' covariate.
Quantitative PCR quantification of gene transcripts. qPCR assays for the quantification of HLA-A, HLA-C, HLA-DRB, HLA-DPB1 and CIITA transcripts were developed in house or adapted from previous studies 47-49. For all reactions, 250-500 ng of total RNA was retro-transcribed with the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems), using random hexamers and RNase Inhibitor. Gene expression levels were measured by real-time quantitative PCR (RT-qPCR) on an ABI 7500 Real-Time PCR System (Applied Biosystems) using the Sybr Green chemistry (Applied Biosystems), with the following thermal cycler conditions for all reactions: 1 cycle at 95 °C for 10 min, followed by 40 cycles at 95 °C for 10 s and at 60 or 63 °C for 35 s, ending with 15 °C incubation. The nucleotide sequences of primers used for RT-qPCR are provided in Supplementary Table 6. The ΔΔCT method was used to define gene expression levels using RNaseP as the reference gene and the mean ΔCT of each transcript at diagnosis as a reference sample value. For all the assays, efficiency was confirmed to be superior to 80% by serial dilutions of the template in water, and specificity was validated by in silico analysis and Sanger sequencing of the amplification products.
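As an illustration of the ΔΔCT calculation described above, the following Python sketch normalizes a target transcript to the reference gene and then to the mean ΔCT at diagnosis. It assumes the conventional 2^(−ΔΔCT) conversion, which presumes near-100% amplification efficiency (the text only states efficiency above 80%). The function names and all CT values are hypothetical.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """ΔCT: target transcript CT minus reference gene CT (RNaseP in the text)."""
    return ct_target - ct_reference


def relative_expression(ct_target, ct_reference, calibrator_delta_ct):
    """Fold expression relative to the calibrator, as 2 ** (-ΔΔCT).

    Assumption: amplification efficiency is close to 100% for both assays.
    """
    ddct = delta_ct(ct_target, ct_reference) - calibrator_delta_ct
    return 2.0 ** (-ddct)


# Hypothetical (target CT, reference CT) pairs measured at diagnosis.
diagnosis_pairs = [(24.0, 20.0), (26.0, 21.0)]
calibrator = mean(delta_ct(t, r) for t, r in diagnosis_pairs)  # 4.5

# Hypothetical relapse sample: ΔCT = 7.5, so ΔΔCT = 3.0.
print(relative_expression(28.0, 20.5, calibrator))  # 0.125
```

A value below 1 indicates lower transcript abundance in the tested sample than in the diagnosis calibrator.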

Multiparametric flow cytometry.
To assess their phenotype, mononuclear cells derived from the peripheral blood or bone marrow of patients and healthy individuals were stained with human fluorochrome-conjugated monoclonal antibodies (mAb). The complete list of mAbs used in the study is provided as part of the Life Sciences Reporting Summary.
For immunophenotypic analysis, samples were thawed in complete medium. A total of (0.2-1) × 10⁶ cells were washed in 1× PBS, 2% FBS and then labeled with the appropriate antibody mix in 50 μl 1× PBS, 2% FBS in a FACS tube. Staining was performed at 4 °C for 15-20 min, followed by washing with 1 ml of 1× PBS, 2% FBS before the analysis. For staining of intracellular antigens (CTLA4), after surface staining, cells were fixed and permeabilized using the BD Cytofix/Cytoperm fixation and permeabilization solution kit (BD Biosciences).
For data analysis, a first logical gate was based on side scatter and CD45 intensity, followed by a second gate on the patient-specific LAIP to identify leukemia cells and on CD3 for T cells. At least 200,000 events in the live cells gate were acquired per sample. Analysis of human samples was performed using a Canto II flow cytometer equipped with 405, 488 and 633 nm lasers, or using an LSR Fortessa flow cytometer equipped with 355, 405, 488, 561 and 640 nm lasers (both from BD Biosciences). Analysis of mouse samples was performed using a Gallios flow cytometer equipped with 488, 638 and 405 nm lasers (Beckman Coulter). Each acquisition was calibrated using Rainbow Calibration Particles (Spherotech) to correct for day-to-day laser intensity variations. Data were processed using FCS Express 4 (De Novo Software), FlowJo v.9.8.5 (Tree Star) or Kaluza (Beckman Coulter). Representative plots with gating strategy and controls are shown in Supplementary Figs. 1 and 2.

NATuRE MEDICINE
Differences in percentages were used to calculate sample clustering, using correlation coefficient as distance measure and Ward's method for agglomerative clustering. An ANOVA F-value was used to rank the relative influence of each marker in determining clusters. Heatmaps were elaborated using the pheatmap package 50 .
High-dimensional analysis of flow cytometry data from the entire validation sample cohort (72 samples, 36 from diagnoses and 36 from relapses) was performed using the Barnes-Hut-stochastic neighbor embedding (BH-SNE) dimensionality reduction algorithm, downscaling each sample to 5,000 CD3− events. BH-SNE was visualized on MATLAB (MathWorks, Inc.) by the plug-in 'cyt', developed by D. Pe'er's laboratory 51. BH-SNE algorithm analysis settings were perplexity = 30.00 and theta = 0.5, as suggested by the developers. The K-means algorithm was directly applied on the BH-SNE1 and BH-SNE2 variables, converging in approximately 190 iterations and dividing the bi-axial plot into 200 discrete areas (meta-clusters) on the basis of the mean fluorescence intensity of the HLA-DR, HLA-DP, PD-L1, B7-H3 and Vista markers. Each cluster was then studied for its composition, and assigned to either the 'relapse' or the 'diagnosis' subgroup if more than 75% of its events fell specifically in one of the two groups.
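The final assignment step, in which a meta-cluster is labeled 'diagnosis' or 'relapse' only when more than 75% of its events come from one group, can be sketched in Python as follows. The function name `label_meta_cluster`, the 'unassigned' label for mixed clusters and the event counts are hypothetical; the sketch covers only the threshold rule, not the BH-SNE or K-means steps.

```python
def label_meta_cluster(n_diagnosis, n_relapse, threshold=0.75):
    """Label a meta-cluster from the origin of its events.

    A cluster is assigned to a group only if strictly more than
    `threshold` of its events come from that group.
    """
    total = n_diagnosis + n_relapse
    if total == 0:
        return "unassigned"  # empty cluster: nothing to decide
    if n_diagnosis / total > threshold:
        return "diagnosis"
    if n_relapse / total > threshold:
        return "relapse"
    return "unassigned"  # mixed composition


# Hypothetical event counts per meta-cluster: (from diagnosis, from relapse).
clusters = {"mc_001": (800, 200), "mc_002": (100, 900), "mc_003": (600, 400)}
labels = {name: label_meta_cluster(d, r) for name, (d, r) in clusters.items()}
print(labels)
```

Because every sample is downscaled to the same number of events and the two groups contain the same number of samples, raw event fractions can be compared directly without reweighting.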
Mixed lymphocyte cultures (MLCs). Peripheral blood mononuclear cells (PBMCs) collected from patients with full donor chimerism after allo-HCT and before clinical relapse were stimulated with the respective patient's irradiated leukemic blasts at a responder/stimulator ratio of 1/1. Cells were cultured in Iscove's Modified Dulbecco's Media supplemented with 1% glutamine (G), 1% penicillin/streptomycin (P/S), 10% HS and IL-2 at a final concentration of 150 IU ml⁻¹. IL-2 was replaced every 3-4 days and responders were re-stimulated every 7 days. PBMCs from an HLA-disparate healthy individual were tested in parallel as a positive control of the ability of leukemic cells to induce an alloresponse. After each round of stimulation, donor-derived T cells from the MLCs were characterized by flow cytometry for their phenotype, and tested against targets of choice for antigen-specific activation, cytokine release, proliferation and cytotoxicity.
Functional assays for T cell recognition of target cells. Briefly, T cell activation in response to the relevant target cells was tested after 24-h co-incubation with the targets of interest using the 4-1BB (CD137) upregulation assay 52. CD4 and CD8 T cell populations were identified by excluding target cells, marked by the CellTrace Violet Cell Proliferation dye (Invitrogen), and staining the samples with antibodies against CD45 (Clone HI30), CD3 (Clone SK7) and CD4 (Clone OKT4 or SK3), all from BioLegend, and against CD8 (Clone SK1) and CD137 (Clone 4B4-1), both from BD Biosciences. Each condition was assessed in duplicate and pooled before flow cytometry analysis to acquire at least 20,000 effector cells per condition.
Cytokine release was tested by means of IFN-γ ELIspot assay. Briefly, 5 × 10⁴ responder T cells from the MLCs were re-challenged overnight at 37 °C in 5% CO₂ with 5 × 10⁴ γ-irradiated targets in 200 μl of complete medium. Primary leukemic blasts were irradiated at 30 Gy. Spots were counted by a KS Elispot Reader (Zeiss). All conditions were assessed at least in duplicate.
Cytotoxicity was measured by standard 4-h ⁵¹Cr release assay, testing different effector/target (E/T) ratios. After 4 h co-incubation of responder T cells with the ⁵¹Cr-labeled targets of interest, the supernatants were collected and analyzed using a γ-counter. Specific lysis was expressed according to the formula: 100 × (average experimental cpm − average spontaneous cpm)/(average maximum cpm − average spontaneous cpm).
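The specific lysis formula above translates directly into a short Python function. The sketch below is illustrative only: the function name `specific_lysis` and the cpm readings are hypothetical, and replicate wells are simply averaged before the formula is applied.

```python
from statistics import mean


def specific_lysis(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Percent specific lysis from replicate cpm readings.

    100 * (mean experimental - mean spontaneous)
        / (mean maximum - mean spontaneous)
    """
    experimental = mean(experimental_cpm)
    spontaneous = mean(spontaneous_cpm)
    maximum = mean(maximum_cpm)
    if maximum == spontaneous:
        raise ValueError("maximum and spontaneous release must differ")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


# Hypothetical duplicate wells at one effector/target ratio.
print(specific_lysis([1500, 1700], [400, 600], [2600, 2800]))  # 50.0
```

Experimental release equal to spontaneous release gives 0%, and release equal to the maximum gives 100%.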
For mAb blocking experiments, target cells were pre-incubated for 1 h at room temperature before addition to the functional assays described above. For in vivo experiments, the T cells were depleted from the leukemia samples by column selection using human CD3 microbeads (both from Miltenyi Biotec). At least 1 × 10⁶ CD3-depleted primary human leukemia cells collected at diagnosis or relapse after allo-HCT were engrafted into 4-week-old non-irradiated immunodeficient NOD scid gamma (NSG) mice. Human chimerism in the peripheral blood of the mice was assessed twice a week by flow cytometry, evaluating the counts per μl by addition of count beads into each sample (Beckman Coulter). A first gate was set to discriminate between cells positive for mouse or human CD45 and, among human CD45-positive cells, the absolute counts of leukemia blasts and T cells were quantified by gating on the patient-specific LAIP or on CD3+ T cells, respectively.
Donor T cells were expanded from PBMCs by in vitro stimulation with anti-CD3/CD28-conjugated magnetic beads (Dynabeads ClinExVivo CD3/CD28 or human CD3/CD28; Invitrogen) at a bead/T cell ratio of 3/1 with the addition of IL-7 and IL-15 at 5 ng ml⁻¹ each (PeproTech) 53. Cytokines and medium were replaced every 3-4 days. After 2 weeks of stimulation, T cell cultures were stored in nitrogen until infusion into mice. On the appearance of human AML blasts in the peripheral blood (threshold set at 25 leukemic cells per μl), the relevant groups of mice were treated with a single infusion of human T cells from the respective allogeneic HCT donor.
Each experiment included at least three mice per group. For immunophenotypic analysis, a total of 50 μl of mouse peripheral blood, previously treated with heparin, was stained with the relevant mixture of antibodies with the addition of 100 μl of 1× PBS, 2% FBS, for 20 min at 4 °C. The erythrocytes were eliminated from the samples by incubating with 3-5 ml ammonium chloride potassium lysis buffer for 4 min at room temperature. Cells were pelleted by centrifugation (300g for 10 min) and washed with 3 ml of 1× PBS and then re-suspended in 50 μl of 1× PBS and 25 μl of count beads. Mouse sera were collected before and after the infusion of donor T cells (once a week) and stored at −20 °C before evaluation of human cytokines (see Extended Data Fig. 4a for the experimental outline).
Quantification of human cytokines in mouse sera. Human Th1 cytokines (IL-2, IL-6, IL-10, IFN-γ and TNF-α) were quantified in murine sera using Human LegendPlex 5-plex (BioLegend) according to the manufacturer's instructions. For each time-point and each experimental condition the sera of three biological replicates were analyzed. For each studied cytokine, a high-sensitivity standard curve was generated by serial dilutions of recombinant proteins. Data were analyzed using LEGENDplex v.7.0 (BioLegend).
Analysis of leukemia driver mutations. FLT3-ITD status and allele burden was determined by PCR followed by capillary electrophoresis using 100 ng of genomic DNA extracted from purified leukemic blasts, as described by Jilani and collaborators 54 . Analysis of mutations in NPM1 and quantification of transcripts of NPM1 mutation A and WT1 were performed using the methods described by Brambati and collaborators 55 .

Statistical analyses.
For all relevant comparisons, after testing for normal distribution through the Kolmogorov-Smirnov test, comparative analyses between two groups were performed, as appropriate, by two-sided paired or unpaired Student's t-tests at 95% CI. For non-normally distributed data, the Wilcoxon matched-pairs signed rank test at 95% CI was used. P < 0.05 was set as the threshold for significance. If more than two groups were tested, a one-way ANOVA test with the Bonferroni correction was used. All statistical analyses were carried out using the GraphPad Prism v.7.0a software.
Animal numbers were chosen according to the variability observed in pilot experiments and on the basis of leukemia cell availability. In all in vivo experiments, at least three biological replicates per group were tested.
Regression analysis between relapse modalities and known clinical variables was performed through a univariate model, calculating for each variable of interest the odds ratio with the associated 95% CI.
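For a single binary variable, the odds ratio and Wald 95% CI from a univariate logistic regression coincide with the values obtained from the corresponding 2 × 2 table, which allows a compact illustration in Python. This is a simplified sketch under that assumption, not the authors' analysis: variables treated as continuous would require fitting an actual regression model, and the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, confidence=0.95):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: variable present, outcome present   b: variable present, outcome absent
    c: variable absent, outcome present    d: variable absent, outcome absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR), as given by the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not data from the study.
estimate, low, high = odds_ratio_ci(10, 5, 4, 8)
print(f"OR = {estimate:.2f}, 95% CI {low:.2f}-{high:.2f}")
```

An interval that includes 1 indicates that the association is not significant at the chosen confidence level.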

Extended Data Fig. 3 | Expression of HLA class I molecules at post-transplantation relapse. a, Heatmap representing fold expression changes in HLA class I gene transcripts (fuchsia markers), their regulators (purple markers) and accessory molecules involved in HLA class I presentation (teal markers). Transcript levels were assessed by microarrays, comparing leukemia at diagnosis with relapses after chemotherapy (CT, n = 3) or allo-HCT (allo-HCT, n = 9). Red and green indicate transcript upregulation and downregulation at relapse, respectively. Bars on the right side of the heatmap summarize mean fold changes at post-transplantation relapse. b, mRNA expression levels of HLA-A and -C measured by locus-specific qPCR in leukemia blasts pairwise collected and purified from patients at diagnosis (red dots) and at post-transplantation relapse (blue dots) (n = 7). Dots indicate values from single patients, lines indicate mean ± s.e.m. P values were calculated by a two-sided Wilcoxon matched-pairs signed rank test at 95% CI. c, HLA class I cell surface expression by leukemia blasts, assessed by immunophenotypic analysis in samples pairwise collected from discovery series patients before allo-HCT (red dots) and at post-transplantation relapse (blue dots) (n = 33). Dots indicate values from single patients, lines indicate mean ± s.e.m. P values were calculated by a two-sided Wilcoxon matched-pairs signed rank test at 95% CI.

Extended Data Fig. 8 | High-dimensional analysis of immunophenotypic data obtained from the validation cohort. a, Color maps obtained using the BH-SNE bioinformatic algorithm for single-cell analysis, allowing the visualization in a two-dimensional map of complex datasets of high-dimensional objects (in this case, single cells stained with 16 different fluorochromes), plotted in the map on the basis of their reciprocal similarity. Shown are maps obtained from the full dataset of immunophenotypic analyses performed in our validation cohort, encompassing all the events registered in the analysis of paired diagnosis-relapse samples (n = 36). The BH-SNE map relative to expression of HLA-DR, HLA-DP, PD-L1, B7-H3 and Vista was colored to highlight the differential positioning (and consequently phenotypic dissimilarity) of events originating from diagnosis samples (in red, left panel) or relapse samples (in blue, right panel). b, On the basis of K-means analysis of the BH-SNE map, meta-clusters of events unique for diagnoses (n = 19) and relapses (n = 4) were identified, and the mean fluorescence intensities of the markers characterizing them are plotted in red and blue, respectively. P values were calculated by a two-sided unpaired t-test at 95% CI.

Extended Data Fig. 9 | Clinical and immunogenetic correlates of HLA class II downregulation at post-transplantation relapse. Forest plot represents the odds ratio (diamonds) and 95% CI (error bars) of belonging to the 'HLA class II downregulation' clusters identified in Fig. 4c, d, calculated in the entire study population (n = 69) using a univariate logistic regression model for demographic, disease-related, immunogenetic and transplant-related variables. *These variables were considered as continuous in the model. §Considering allelic mismatches in the graft-versus-host direction in HLA-A, -B, -C and -DRB1.

Data
SNP array, microarray and RNA-Seq data generated and analysed during the current study are available through ArrayExpress (https://www.ebi.ac.uk/arrayexpress/) with accession numbers E-MTAB-7631, E-MTAB-7628, E-MTAB-7630 and E-MTAB-7456. All the other relevant data generated and/or analysed in the current study are included in the manuscript or in supplementary information.

Life sciences Study design

Sample size
All patient samples coming from the Biobank of the different centres had been collected upon specific written consent in agreement with the Declaration of Helsinki. Samples of interest were selected on the basis of availability of viable samples containing at least 5% leukemic blasts pairwise collected before and after allogeneic HSCT and no evidence of genomic HLA loss at relapse. For the high-throughput analyses, the expected yield of an adequate amount of purified leukemic blasts upon sorting was also considered a criterion for sample choice. All samples meeting these requirements were tested, and sample size was determined by their availability and considered adequate upon analysis of the interindividual variability in the variables of interest. The study was approved by the scientific and ethics committee of the San Raffaele Hospital Scientific Institute (Milan, IT). For in vivo experiments, the number of animals was selected based on variability observed in pilot experiments and on primary leukemia cell availability. All in vivo experiments performed were approved by the San Raffaele Animal Care and Use Committee (IACUC), by the San Raffaele Ethics Committee and by the Italian Ministry of Health.
Data exclusions
No data were excluded.

Replication
For gene expression profiling, leukemic cell purification and microarray profiling were performed in triplicate for each sample whenever possible. For all other experiments performed using patient-derived samples, each sample was tested in a single experiment, with an appropriate number of experimental replicates within the experiment. All attempts at replication of presented data were successful.
Randomization
Not applicable to our study design (retrospective observational).

Blinding
Analysis of biological features was blinded to the clinical data, which were linked for correlation analysis only in a later stage.