DNA commission of the International Society of Forensic Genetics (ISFG): Recommendations on the interpretation of Y-STR results in forensic analysis

Forensic genetic laboratories perform a large amount of STR analyses of the Y chromosome, in particular to analyze the male part of complex DNA mixtures. However, the statistical interpretation of evidence retrieved from Y-STR haplotypes is challenging. Due to the uni-parental inheritance mode, Y-STR loci are connected to each other and thus haplotypes show patterns of relationship on the familial and population level. This precludes the treatment of Y-STR loci as independently inherited variables and the application of the product rule. Instead, the dependency structure of Y-STRs needs to be included in the haplotype frequency estimation process affecting also the current paradigm of a random match probability that is in the autosomal case approximated by the population frequency assuming unrelatedness of sampled individuals. Information on the degree of paternal relatedness in the suspect population as well as on the familial network is however needed to interpret Y-chromosomal results in the best possible way. The previous recommendations of the DNA commission of the ISFG on the use of Y-STRs in forensic analysis published more than a decade ago [1] cover the interpretation issue only marginally. The current recommendations address a number of topics (frequency estimators, databases, metapopulations, LR formulation, triage, rapidly mutating Y-STRs) with relevance for the Y-STR statistics and recommend a decision-based procedure, which takes into account legal requirements as well as availability of population data and statistical methods.


Introduction
Y-STR typing is an additional tool that can be used, typically in concert with autosomal DNA typing, for the detection of male DNA in mixtures that contain an excess of female DNA [2][3][4]. Considering that under certain conditions a male minor contributor in a mixture may only be detectable by Y-STR typing, laboratories should in such circumstances pursue Y-STR analysis as the most appropriate means of detecting male contributor(s) in forensic samples [5,6]. The laboratories should establish guidelines that define the procedures under which samples are subjected to Y-STR typing. We recommend conserving for Y-STR typing any sample in which a mixture of male and female DNA is expected. An example of an obligatory application of Y-STR testing would be a vaginal swab for which seminal fluid is detected but sperm cells are not identified.
Due to the lack of recombination all Y-STR loci are physically linked on the Y chromosome and compose a haplotype, which is inherited along the paternal lineage. This linear transmission results in a nonrandom allele and haplotype distribution of Y-STRs, creating a cluster structure that is correlated with geographical and ethno-linguistic structures on a worldwide scale [7]. The clusters reflect the appearance and subsequent expansion of paternal lineages in the past [8]. The estimation of haplotype frequencies therefore requires a reference database, which represents the population substructures, and a statistical model that analyzes the target haplotype in relationship to prevailing subpopulation clusters. National guidelines, which are in place in the US and Germany, follow these principles [9,10]. Both guidelines recommend the Y chromosome haplotype reference database (YHRD) as the data source and a quantitative assessment of the evidential weight of a question to known (Q→K) match using profile frequency estimators based on database observations or on Discrete Laplace parameters calculated from the database. Differences between the guidelines exist in the choice of the subpopulations. Whereas the SWGDAM guidelines recommend the use of the YHRD-embedded US database with subpopulations, in Germany the YHRD-embedded Western European metapopulation is recommended by default. In formulating general guidelines on Y-STR interpretation we assume that these already established guidelines can also be adapted for other countries, taking into account the legal context and the prevailing population substructure. We encourage national stakeholders to formulate national guidelines, which are in the scope of these universal recommendations but also account for the specific national circumstances and strategies. Some further considerations on Y-STR interpretation are not translated into general recommendations but are included here in a separate section. Finally, recommendations on the use and interpretation of Y-STR profiles in case of mixtures, kinship and identifications using familial analysis will not be covered here and will be presented later in a separate guideline paper.

Evaluation of Y-STR profiles
In a forensic setting, Y-STR loci exhibit the same general characteristics as their autosomal counterparts, namely: correlation of the amount of input DNA and peak height, the occurrence of backward stutter and drop-in/drop-out effects in case of low DNA amounts. The sensitivity of the Y-STR kits using capillary electrophoresis (CE) is in the same range as that of common autosomal STR kits, but the sensitivity for the male component is higher in unbalanced female/male mixtures [5]. However, some mutational effects rarely seen in autosomal STRs are more pronounced in Y-STRs. In particular, large-scale deletions, insertions and conversions [11] are responsible for higher numbers of Null (encoded "0") and multiple alleles at certain loci. Some markers included in commercial kits always show more than one allele because the sequence has identical copies on the Y chromosome (e.g. DYS385, DYF387S1). One or several Y-STR loci per haplotype can be involved in such mutation events. This is of forensic relevance, because a pattern can be erroneously interpreted as allelic drop-out, DNA contamination or mixture, which may affect the evidential value of a DNA profile [12]. The largest repository of Y-STR variants is the YHRD database with currently 1,622 different alleles typed at 29 loci (Table 1). As with all Y-chromosomal polymorphisms the population genetic context needs to be taken into account for correct interpretation. For example, the frequent Null allele at DYS448 has a worldwide frequency of 1/339 but of 1/54 in the Indo-Iranian metapopulation (YHRD Release 62 from 12/31/2019). Multiple Null alleles occur when a large-scale deletion affects a group of neighboring Y-STRs. A typical example is the deletion of the six common Y-STR loci DYS570, DYS576, DYS458, DYS481, DYS449, DYS627 along with the AMELY segment on the short arm of the Y chromosome. The YHRD haplotype search using the "0" for these deleted loci results in a frequency of 1/9,200 worldwide and 1/148 in the Indian metapopulation (YHRD Release 62). The same applies to duplication events, which can impact groups of neighboring Y-STR loci [13].

Decision process for Y-STR interpretation
An international ISFG guideline needs to comply with national legal requirements and existing guidelines on forensic Y-STR DNA examinations. Basically, examiners are expected to prepare reports with the minimum requirement of a qualitative statement on the Y-STR test result. Comparisons in which a known male reference is compared to a trace DNA by means of Y-STRs can result in three conclusions: inconclusive, exclusion, non-exclusion [14]. These principal outcomes need an adequate verbal description in the report. Courts, however, require not only a qualitative statement but also a quantitative statement describing the weight of evidence for comparisons in which a known male is included as a possible contributor to the Y-STR typing results obtained from a probative evidentiary sample. The decision tree (Fig. 1) displays a universal workflow with consecutive decisions to be made by the examiner with respect to the legal context and the availability of data and methods. Steps within the decision process are conditional. For example, data within a national database or metapopulation (Sections 5.2 and 5.3) need to be analyzed a priori to allow substructures to be recognized and databases to be built accordingly.
The decision tree includes terms such as frequency estimation, counting method, Discrete Laplace method, national database, metapopulation, LR and verbal statement. To allow an informed decision on the appropriate reporting of Y-STR test results, these terms are explained in the following sections.

Frequency estimation using population data
A central component to evaluate the weight of DNA evidence in case of a match is a frequency estimate of the detected profile in the relevant population. This standard procedure applied in the autosomal case is also widely used for Y-STRs. However, as explained above, the Y-STR haplotype does not consist of independent Mendelian traits with frequencies that can be multiplied. Instead, the Y-STR haplotype is a single entity, for which the frequency needs to be estimated based on collections of individual samples from the population. Such data collections are provided in form of annotated databases (see Section 5). In forensic casework the probability of a match is evaluated using either count-based or model-based estimators of the profile frequency. Counting estimators (counts in a database accompanied by a confidence interval) are bounded from below by 1/(database size) and are, therefore, often overly conservative, especially for high-resolution multiplex kits generating profiles that rarely match a database entry. The counting method may thus unduly reduce the evidentiary power in cases of non-observation. In contrast, methods like Discrete Laplace [15] are estimation procedures, which are based on an evolutionary model of haplotype diversification. They are sensitive to the composition of the underlying database (see Section 4.2). The Discrete Laplace method ensures that rare haplotypes retain their high evidential power even when the database used for estimation is of moderate size, provided that the ancestral haplotype clusters (central haplotypes) are represented in the sample. The counting as well as the Discrete Laplace methods have been established in practice and are part of national guidelines [9,10]. Other reasonable methods with model-based estimators have been proposed [16,17]. Also, models based on the presence of relatives of the person of interest in the population sharing a haplotype have been described [18,19].

Counting method
The counting method involves searching a given haplotype against a suitable reference database of size N to determine the number of times n the haplotype is observed in the database. The relative frequency of the haplotype in the database is then obtained by dividing the count by the number of haplotypes searched (n/N). Generally, the augmented counting method is used, where the haplotype in question is added to both the observations and the database, giving (n + 1)/(N + 1).
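The two estimators can be written down in a few lines. The following is a minimal sketch, not an implementation of any database's search function; the function names are illustrative, and the example figures (zero observations among 27,063 haplotypes) are the ones used later in the reporting section.

```python
from fractions import Fraction


def count_frequency(n: int, N: int) -> Fraction:
    """Plain relative frequency n/N of a haplotype in a database of size N."""
    if N <= 0 or not 0 <= n <= N:
        raise ValueError("need N > 0 and 0 <= n <= N")
    return Fraction(n, N)


def augmented_count_frequency(n: int, N: int) -> Fraction:
    """Augmented count: the questioned haplotype is added once to both the
    observations and the database, giving (n + 1)/(N + 1)."""
    if N < 0 or not 0 <= n <= N:
        raise ValueError("need N >= 0 and 0 <= n <= N")
    return Fraction(n + 1, N + 1)


# Haplotype not observed in a database of 27,063 haplotypes.
freq = augmented_count_frequency(0, 27_063)
print(freq, float(freq))
```

Exact fractions are used so that the reported value can be stated as "1 in N + 1" without rounding; the plain count would give zero for an unobserved haplotype, which is why the augmented form is generally preferred.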
Different measures can be applied to cope with the uncertainty of the haplotype frequency estimate calculated by the augmented counting method. Mostly, a confidence interval is attached to the estimated haplotype frequency to capture the sampling effect of the database [20,21]. Also, the number of observations n can be adjusted by inclusion of a kappa "inflation" factor (κ), which takes into account the composition of the database, namely the singleton proportion [22]. Note that if this proportion is close to 100 % the kappa method is not applicable. The application of the appropriate confidence interval and/or the kappa method is not part of this universal recommendation and must be evaluated in the context of national Y-STR interpretation guidelines.
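As an illustration of attaching an interval to a count, the sketch below computes a one-sided upper Clopper-Pearson (exact binomial) bound. The choice of Clopper-Pearson, the 95 % level and the function names are assumptions made for this example; the text above leaves the choice of interval to national guidelines.

```python
import math


def binom_cdf(k: int, N: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(N, p), summed term by term in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= N else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(N + 1) - math.lgamma(i + 1) - math.lgamma(N - i + 1)
            + i * math.log(p) + (N - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def upper_bound(n: int, N: int, alpha: float = 0.05) -> float:
    """One-sided upper (1 - alpha) bound: the p solving P(X <= n | p) = alpha."""
    if not 0 <= n <= N or N <= 0:
        raise ValueError("need N > 0 and 0 <= n <= N")
    if n == N:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(200):  # bisection; the CDF decreases as p grows
        mid = (lo + hi) / 2
        if binom_cdf(n, N, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


# Zero observations among 27,063 haplotypes (assumed 95 % level).
print(upper_bound(0, 27_063))
```

For zero observations the bound has the closed form 1 - alpha^(1/N), which provides a convenient check on the numerical routine.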

The Discrete Laplace method
The Discrete Laplace (DL) method [15] is a parametric statistical model that can be used to estimate population frequencies of Y-STR haplotypes based on a reference database. The normalized allele distribution of an STR marker can be theoretically modeled to follow asymptotically a certain distribution [23]. This distribution can be approximated by a Discrete Laplace distribution (with two parameters). The method is only able to deal with integer alleles; loci with intermediate, multiple or Null alleles cannot be included in the DL analysis. The Discrete Laplace method assumes that a number of latent clusters with shared ancestry exists, each of which is represented by a central haplotype. The haplotypes in the population are then spread around these central haplotypes (caused by neutral stepwise mutations). Populations and databases can be pre-processed using the freely available, open-source R [24] package 'disclapmix' [25] in order to determine central haplotypes and other parameters [15]. The number of clusters within these databases needs to be determined, e.g. as the number that minimizes the marginal Bayesian Information Criterion (BIC) [26]. The DL method is currently implemented for 21 metapopulations in the YHRD database [21] for haplotypes with at most 15 loci, which is the "YFiler format" without the duplicated locus DYS385 (Supplementary Table 1). The central haplotypes per metapopulation can be viewed and downloaded at https://yhrd.org/pages/resources/calculation_methods.
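To make the cluster model concrete, the sketch below evaluates a haplotype frequency under an already fitted Discrete Laplace mixture: each cluster contributes its weight times the product, over loci, of a Discrete Laplace probability for the allele's distance from the central haplotype. It assumes independence of loci within a cluster and takes the per-locus dispersion parameters as given. All cluster weights, central haplotypes and parameters here are invented for illustration; this is not the 'disclapmix' package and does not fit a model.

```python
from math import prod


def discrete_laplace_pmf(d: int, p: float) -> float:
    """P(X = d) = (1 - p)/(1 + p) * p**|d| for a dispersion 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return (1 - p) / (1 + p) * p ** abs(d)


def dl_haplotype_frequency(haplotype, clusters) -> float:
    """Mixture frequency: sum over clusters of weight * product over loci."""
    total = 0.0
    for c in clusters:
        if not len(haplotype) == len(c["center"]) == len(c["p"]):
            raise ValueError("haplotype and cluster must cover the same loci")
        per_locus = (
            discrete_laplace_pmf(allele - center, p)
            for allele, center, p in zip(haplotype, c["center"], c["p"])
        )
        total += c["weight"] * prod(per_locus)
    return total


# Hypothetical two-cluster model over three integer-allele loci.
clusters = [
    {"weight": 0.6, "center": (14, 13, 29), "p": (0.2, 0.1, 0.3)},
    {"weight": 0.4, "center": (15, 12, 30), "p": (0.1, 0.2, 0.2)},
]
print(dl_haplotype_frequency((14, 13, 30), clusters))
```

Because the estimate depends on distance to the central haplotypes rather than on exact matches in the database, an unobserved haplotype still receives a non-zero frequency, and that frequency shrinks the further the haplotype lies from every cluster center.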

Databases
Six fundamental requirements should be met to qualify a reference Y-STR database for forensic use: (1) Anonymization: All haplotypes need to be fully anonymized by the submitting laboratory and database administrators must ensure that haplotypes cannot be traced back to donors; this includes restrictions on search functionalities, especially for high-resolution haplotypes with more than 17 loci ("YFiler format"). (2) Quality and integrity: Haplotypes collected for a database need to be completely typed for a validated Y-STR panel ("kit"); proficiency tests in the framework of national or international test schemes are necessary to qualify a laboratory as a submitter. (3) Annotation: Each haplotype submission must include metadata to allow assignment to meta- and subpopulations; depending on the database structure, information on the geographical coordinates of the sample (sampling area or sampling location) as well as nationality, ethnicity or language group of the sampled individuals (self-information based on informed consent) is requested; phylogenetic information based on ancestry-informative Y-SNPs is a useful addition since it provides objective information on the past demographic background of the paternal lineage. (4) Size: The database for forensic application must be sufficiently large to represent the dominant lineages and clusters prevailing in the defined reference populations. The databases should therefore be continuously expanded. (5) Sampling: Database samples need to be collected randomly to represent the extent of relatedness in the population as much as possible. (6) Consistency: Since updates change size and composition of the database, all statistical values drawn from the database need to include a specific identifier of the version used at the time of the query.

YHRD
A large number of population studies (1,348 in release 62 of 12/31/2019) is made available by the YHRD, a scientific public reference database, which largely meets the aforementioned criteria. It is built by direct submissions of population data from individual laboratories working in the field of forensic, human and population genetics including crime labs, university departments and research laboratories. Upon receipt of a suitable submission, the YHRD custodians examine the originality and validity of the Y-STR and Y-SNP data and finally assign an accession number to the population sample. The submissions are then registered to the public database. All population data published in several of the prominent forensic journals such as Forensic Science International: Genetics or International Journal of Legal Medicine are required to be validated by the YHRD custodians [27,28] and are subsequently included in the YHRD. All haplotypes uploaded to the YHRD have a double assignment to subgroups defined by nationality (see Section 5.2) and by ancestry. The latter subgroups are called metapopulations (see Section 5.3). The metapopulation system as defined in the YHRD is given in Table 2. Populations with a pronounced admixture are not assigned to any metapopulation and are marked "Admixed". Currently, Y-STR profiles with up to 29 loci implemented in widely used PCR kits can be searched against 136 national databases and 32 metapopulations. Of the national databases, China is currently the largest with 106,194 reference haplotypes, followed by the USA (40,921) and Brazil (11,799), see https://yhrd.org/pages/resources/national_databases (YHRD Release 62). It is important to state that the current metapopulation structure is an a priori categorization, which needs a continuous evaluation by means of statistical methods (e.g. by Analysis of Molecular Variance (AMOVA), see https://yhrd.org/amova) to assess the genetic similarity/dissimilarity between the population samples (Supplementary Fig. 1a-d).
A user's manual covering all aspects of the YHRD use can be downloaded from the "Help & Support" section.

National databases
The concept of pooling data to build national databases has a very straightforward explanation: law enforcement agencies and forensic services rely on their national population to build reference databases. In most instances offenders and victims stem from the national population. In this case it may be reasonable to claim that the appropriate database to represent the suspect population [29] is a national database. Such databases are based on national census information used to manage the diversity of the respective country. In countries like the USA, Brazil, the UK or China, which are characterized by strong population substructure, national reference databases are often built on the basis of a historical concept of ethnic affiliation, e.g. the US population is substructured in Caucasian, African, Hispanic, Asian and Native American populations. The United Kingdom differentiates English, Afro-Caribbean, Chinese and Indo-Pakistani, and the People's Republic of China lists 55 official nationalities residing within the country. Y-STR databases set up by the national authorities are accessible by national crime labs. Some national databases are submitted to the YHRD and made freely accessible and searchable.

Metapopulations
The term metapopulation (MP) has been adapted from population biology [30] and is used in forensic genetics to describe a set of geographically dispersed human population samples with shared genetic ancestry [31]. Random samples recruited independently in discrete populations and at different sites can be combined to build a metapopulation. Usually in a metapopulation a large number of Y chromosomes share phylogenetically informative mutations (SNP sites) and possess Y-STR haplotypes related by descent. Thus, population samples are more similar within a metapopulation than to groups outside the metapopulation [32]. The trans-national metapopulation approach has advantages over the concept of national databases, since it represents ancestry groups instead of political entities. Y-STRs evolve slowly but steadily along paternal lineages and therefore ancestry is a better proxy to capture the haplotype distribution than ephemeral political entities, which often have come into existence only recently and define citizenship based on varying and incongruent criteria. In contrast to pure geographical grouping systems used to annotate populations, e.g. in pharmacogenetic research [33], the YHRD proposed a metapopulation system that uses not only geographical but also ancestry-related information, namely linguistic data, to categorize population samples [21,31,32,34]. Populations belonging to different language groups may reside in close geographical proximity but exhibit a significant genetic distance, for example, the Polish and the German population in the center of Europe [35]. Multinational or multiethnic countries house very different ethnic groups. For example, in the YHRD, South Africans of Dutch or British descent are included in the European metapopulation and not in the Sub-Saharan African category (Table 2, Supplementary Fig. 1a). To perform the assignment of population samples to metapopulations the YHRD requests ancestry-related metadata. Recent demographic processes such as migration can lead to highly admixed populations, which cannot be allocated systematically.
Since the ancestry of the trace donor (which should not necessarily be equated with that of the suspect) is unknown, the location of the crime scene could be a leading criterion to select the appropriate metapopulation. A common misconception is that the origin of the reference person (suspect) defines the appropriate metapopulation when there is no such prior information. Instead, the origin of any possible suspect should determine the choice of MP. For example, let us assume the suspect to whom the Y-STR profile matches is from Northern Africa, but the crime occurred in Germany. An examiner taking a neutral perspective should use the Western European MP as the suspect population and not the Northern African MP, since the latter choice would assume that any possible suspect originates exclusively from the Northern African MP. Only if additional information is provided by the court should the examiner also report on that specified metapopulation. If more than one major descent group is defined by national census in one country, the DL values for each of the relevant metapopulations should be reported.

Reporting guidelines
To assess the value of a Y-STR profile, the first aspect to consider is whether the profile has sufficient information to be used in the respective case and whether or not it can be compared to the person(s) of interest. We will consider the case of a single-source profile, which is suitable to be compared to a person of interest (here: person X), both analyzed for the same set of loci. Fig. 1 illustrates different constellations and possible decisions. If the profiles of the trace and the person X differ by at least one allele (assuming no analytical errors) the constellation is "exclusion", which requires a verbal statement like "Someone other than person X is the source of the DNA". If the trace profile and person X possess identical alleles at each locus (identical length alleles if the CE method or identical sequence variants if massively parallel sequencing was used), the decision is "non-exclusion" and needs evaluation. We recommend a quantitative assessment of the value of the match using relevant population data (correct metapopulation or suspect population), the formulation of alternative hypotheses (e.g. that of the prosecution and defense), and use of the likelihood ratio to evaluate the findings and a verbalization. The value of the scientific result of this evaluation is dependent on the information used by the examiner. According to Gill et al. [36] this information comprises the relevant case circumstances, the data used, the assumptions and the model chosen. Relevant information on the case would be: "The event took place in Germany." This information will allow the examiner to select the most relevant population genetic database. This could either be a national database for Germany (n = 4,786 17-locus ("YFiler") haplotypes in YHRD Release 62) or the relevant metapopulation, which is Western Europe (n = 27,063 YFiler haplotypes). Since in Germany the guidelines recommend the use of the Western European metapopulation instead of a German database, the profile frequency is retrieved for the Western European MP. Using the count or Discrete Laplace method an estimate of the profile/haplotype frequency in the reference population can be retrieved. For example, the haplotype in Table 3 has not been observed in a Western European YFiler database of N = 27,063, thus the frequency of that haplotype using counts will be 1/27,064 = 0.000037. The estimate using the Discrete Laplace method in the same database of N = 27,063 Western Europeans with zero observations is 1/252,740 or 0.0000039, a value roughly ten times lower than the count value. Note that the DL values decrease in various metapopulations the more genetically distant the ancestral haplotype clusters are (Table 3). Further examples are given in Supplementary Table 2a-d.
When a suspect is identified that matches the Y-STR profile, we recommend the formulation of hypotheses according to the likelihood approach described by Evett and Weir [37]. The alternative hypotheses can be defined as follows [38,39]:
Hypothesis H1. Person X is the source of the DNA.
Hypothesis H2. A random man Y from the reference population Z is the source of the DNA, and Y is a man other than X.
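For a matching single-source profile, the likelihood ratio for H1 against H2 reduces to the reciprocal of the estimated haplotype frequency if one assumes that the profile is certain to be observed when person X is the source (no typing error, no drop-out). The sketch below makes that assumption explicit as a parameter; the function name is illustrative, and the two frequencies are the count and Discrete Laplace estimates quoted for the Western European metapopulation.

```python
def likelihood_ratio(freq_h2: float, p_evidence_given_h1: float = 1.0) -> float:
    """LR = P(E | H1) / P(E | H2).

    freq_h2 is the estimated haplotype frequency in reference population Z.
    p_evidence_given_h1 defaults to 1.0, i.e. the assumption that the profile
    is observed with certainty if person X is the source.
    """
    if not 0.0 < freq_h2 <= 1.0:
        raise ValueError("frequency must lie in (0, 1]")
    if not 0.0 < p_evidence_given_h1 <= 1.0:
        raise ValueError("P(E | H1) must lie in (0, 1]")
    return p_evidence_given_h1 / freq_h2


for label, freq in (("count", 1 / 27_064), ("Discrete Laplace", 1 / 252_740)):
    print(f"{label}: LR = {likelihood_ratio(freq):,.0f}")
```

The LR inherits every limitation of the frequency estimate it is built on, so the database, its release and the estimator should be stated alongside the reported value.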

Wording
The Y-STR profile detected in the crime stain is LR times more probable to observe under hypothesis H1 than under hypothesis H2.This notwithstanding, paternal relatives have a high probability to have the same Y-STR profile and will in that case have the same likelihood ratio (LR).
The chance that a male person who is not closely related to the person of interest has the same Y-STR profile can be assessed using the Discrete Laplace method, which approximates the proportion of haplotypes in the suspect population that possess the same haplotype. The number of persons sharing the haplotype of the suspect decreases if high-resolution multiplex kits or panels using rapidly mutating markers (Section 7.2) are used for analysis. On the other hand, male relatives, even distant ones, have a high probability of having the identical Y-STR profile even in the high-resolution analysis [39,40]. For example, with a PowerPlex Y23 profile (23 loci), the probability is (under certain assumptions) 18 % for a relative 20 germ line transfers apart to have the same profile and still 5.5 % for a YFilerPlus profile with 27 markers [29]. It has also been shown that even extensive testing of male relatives in five generations with 47 Y-STR markers cannot exclude them from being the donor of the trace evidence [4]. In such cases information not related to Y-STRs is needed to settle the case.
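The reason distant relatives so often share a profile can be seen from a simple calculation: a relative separated by m germ line transfers keeps the identical haplotype if no locus mutates in any of those transfers. The sketch below assumes independent loci and ignores back mutations; the uniform mutation rate is a hypothetical placeholder, so the output is illustrative only and is not an attempt to reproduce the cited percentages, whose underlying assumptions are not given here.

```python
def prob_identical_haplotype(mutation_rates, meioses: int) -> float:
    """Probability that no locus mutates across `meioses` transmissions.

    Assumes independent loci and no back mutation restoring the original
    haplotype; both are simplifications.
    """
    if meioses < 0:
        raise ValueError("meioses must be non-negative")
    p = 1.0
    for mu in mutation_rates:
        if not 0.0 <= mu <= 1.0:
            raise ValueError("mutation rates must lie in [0, 1]")
        p *= (1.0 - mu) ** meioses
    return p


# Hypothetical uniform per-locus, per-generation rate for a 23-locus panel.
rates = [0.003] * 23
for m in (1, 5, 20):
    print(m, prob_identical_haplotype(rates, m))
```

Adding loci or replacing them with faster-mutating ones lowers this probability, which is the rationale for rapidly mutating marker panels.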

Reporting without frequency estimation
Reasons may exist not to use match probabilities and likelihood ratios to quantify the weight of evidence. A missing or small, inappropriate or barely representative database is a reasonable argument here (see Fig. 1). In this case examiners can report a match in form of a qualitative statement, which needs to address the issue of male relatedness and hence the extent of haplotype sharing in the population.
Using simulation experiments a recent publication [18] has approximated the number of shared haplotypes in the population and proposed wording on the number of close paternal relatives in a given population [41]. However, the appropriate wording for statements not relying on population databases needs to be validated in the context of the national guidelines.

Rapidly mutating Y-STRs
Rapidly mutating Y-STRs (RM Y-STRs) have an elevated mutation rate and therefore a higher chance to differentiate close relatives. Different RM Y-STR panels have been described and many of these markers are not part of commercial kits, although some are included in high-resolution kits such as YFilerPlus and PowerPlex Y23 [42,43]. We suggest the use of RM Y-STRs to further analyze trace and reference samples in case of a match for possible exclusion, for example by using an additional assay (e.g. [44,45]). However, this strategy is limited by the amount of DNA available from the crime scene sample and the availability of commercial kits with additional panels of RM loci. The more RM Y-STRs can be typed, the more variable the haplotype becomes and the higher the chance to find a meiotic mutation separating the relatives from the suspect. An RM Y-STR panel alone or in combination with a standard kit will not improve population frequency estimates, since (1) count estimates would require an extremely large database, which is impossible to build in practice, and (2) DL estimates cannot be calculated because central haplotypes linked to founder lineages in a population cannot be identified due to the extreme resolution of RM Y-STRs.

Triage
Triage, as defined here, is the process of prioritizing the analysis methods in case of reduced signal intensity of available male DNA due to low template number and/or exceedingly high female background. The autosomal standard STR analysis always has priority to generate an eligible profile of the male DNA for database search either from a single-source DNA or a mixed profile (Priority 1). If the autosomal outcome is inconclusive for the male component, the sample should be subjected to a standard Y-chromosomal STR analysis (Priority 2), which selects the minor male target for amplification. If the sample is informative for the Y-STR profile and an individual matching the trace is identified, the third analysis, assuming sufficient DNA amount, would include the application of a dedicated RM Y-STR panel to exclude available close relatives (Priority 3).
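The three priorities form a short conditional sequence, which can be sketched as follows. The function name, the boolean inputs and the step labels are hypothetical simplifications introduced for this example; real laboratory decisions depend on validated thresholds and national procedures that are not modeled here.

```python
def triage_plan(
    autosomal_conclusive: bool,
    ystr_informative: bool,
    match_identified: bool,
    sufficient_dna: bool,
) -> list[str]:
    """Return the ordered analysis steps implied by the triage description."""
    plan = ["Priority 1: autosomal STR analysis"]
    if autosomal_conclusive:
        # The male component was resolved autosomally; no Y-STR step needed.
        return plan
    plan.append("Priority 2: standard Y-STR analysis")
    if ystr_informative and match_identified and sufficient_dna:
        plan.append("Priority 3: RM Y-STR panel to exclude close relatives")
    return plan


for step in triage_plan(False, True, True, True):
    print(step)
```

Each later step is conditional on the outcome of the earlier one, so a sample with a conclusive autosomal result never consumes DNA on Y-STR typing under this scheme.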

Summary of the recommendations
The recommendations published here are the result of more than two decades of collaborative research, global sampling and adaptation of a variety of methods to forensic purposes. Numerous cases have been analyzed using commercial Y-STR kits, which have been available since the late 1990s and have been continuously improved since then. Guidelines for Y-STR interpretation are already in place in countries like Germany and the USA and were demonstrated to be admissible in court both for the counting [46] as well as for the DL approach [4]. We have formulated recommendations for the typical case where the autosomal analysis is unsuccessful or uninformative, but the Y-chromosomal analysis delivers an informative DNA profile. To assess the evidential value of a question to known (Q→K) match we recommend the Discrete Laplace approach as the frequency estimation method. This recommendation applies for countries for which representative groups with shared ancestry (metapopulations) can be defined and sampled. If more than one major descent group can be defined in one country, the DL values for each of the relevant metapopulations should be reported. An alternative, easily defendable but highly conservative method is the augmented counting approach, optionally with confidence interval(s) or kappa inflation. The counting approach is recommended if Y-STR profiles are partial due to degradation or include non-integer alleles. Since frequencies for high-resolution haplotypes with 23 and more markers cannot be readily estimated using DL (due to the limited population coverage and the problem of defining central haplotypes), the augmented counting or counting with kappa inflation is an alternative. However, a profile reduction to the "YFiler" format and subsequent Discrete Laplace analysis generally gains more information from a Y-STR profile match than counting methods. For quantitative reporting using either DL or count estimates we recommend the likelihood approach. Generally, the qualitative approach with appropriate wording is recommended for laboratories in countries with insufficient coverage of the relevant reference population(s). Information beyond databases, especially on the degree of relationship within the extended family and the suspect population, if known, has to be included in the report.
* includes Null alleles, intermediate alleles and multiple alleles.

Table 2
Metapopulation system of the YHRD and geographical range (YHRD release 62).

Table 3
Frequency estimates of a YFiler haplotype typed in a German sample in five out of 32 metapopulations using count and DL estimators (YHRD release 62).