STROBE-metagenomics: a STROBE extension statement to guide the reporting of metagenomics studies

The term metagenomics refers to the use of sequencing methods to simultaneously identify genomic material from all organisms present in a sample, with the advantage of greater taxonomic resolution than culture or other methods. Applications include pathogen detection and discovery, species characterisation, antimicrobial resistance detection, virulence profiling, and study of the microbiome and microecological factors affecting health. However, metagenomics involves complex and multistep processes and there are important technical and methodological challenges that require careful consideration to support valid inference. We co-ordinated a multidisciplinary, international expert group to establish reporting guidelines that address specimen processing, nucleic acid extraction, sequencing platforms, bioinformatics considerations, quality assurance, limits of detection, power and sample size, confirmatory testing, causality criteria, cost, and ethical issues. The guidance recognises that metagenomics research requires pragmatism and caution in interpretation, and that this field is rapidly evolving.


Background
The term metagenome was coined in 1998 to describe the collection of genomes from microbes present in environmental soil samples by using approaches previously used to study single genomes. 1 The sequencing of genetic material from clinical samples has become common practice in research on clinical microorganisms. In this context, metagenomics refers to the application of sequencing methods that can identify coexistent genomic material from any organism present in patient samples (ie, microorganism and host nucleic acid), usually with the aim of pathogen identification for clinical diagnosis or research. [2][3][4] Examples of practical applications include pathogen detection and discovery, species characterisation or subtyping, antimicrobial resistance detection, virulence profiling, and studies of the microbiome and microecological drivers of health and disease. [5][6][7][8][9][10][11][12] Metagenomics is also being introduced as a diagnostic tool for causal studies of clinical syndromes (such as encephalitis), 13,14 for exploring the microbiome, 15,16 and for tracking disease outbreaks. 17,18 A current example of the transformational effect of direct sequencing of clinical samples has been the application for rapid investigation and dissemination of information on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes COVID-19. 11,12 Metagenomics data are generated using high-throughput sequencing methods, also referred to as deep, next-generation, massively parallel, or shotgun sequencing. In this Review, for simplicity, we refer to all these approaches as sequencing. We also include capture probe enrichment-based sequencing methods that use nucleotide probes to increase sensitivity 4 and targeted amplicon sequencing (eg, sequencing the 16S ribosomal ribonucleic acid [rRNA] gene to identify bacteria). 19 Capture probe enrichment-based sequencing and targeted amplicon sequencing might not be considered true examples of metagenomics and are not the focus of our Review; however, some similar considerations about reporting of results apply.

Key messages
• The term metagenomics refers to the use of sequencing methods to simultaneously identify genomic material from all organisms present in a sample, with the advantage of greater taxonomic resolution than culture or other methods.
• Applications include pathogen detection and discovery, species characterisation, antimicrobial resistance detection, virulence profiling, and study of the microbiome and microecological factors affecting health.
• Metagenomics involves complex and multistep processes and there are important technical and methodological challenges that require careful consideration to support valid inference.
• We co-ordinated a multidisciplinary, international expert group to establish reporting guidelines that address specimen processing, nucleic acid extraction, sequencing platforms, bioinformatics considerations, quality assurance, limits of detection, power and sample size, confirmatory testing, causality criteria, cost, and ethical issues.
• The guidance recognises that metagenomics research requires pragmatism and caution in interpretation, and that this field is rapidly evolving. Reporting standards should support clarity, consistency, and robustness of research.

Metagenomic sequencing has advantages for pathogen identification over conventional methods, such as culture or targeted PCR, because many or most microbial species present within a sample can be detected simultaneously with high taxonomic resolution. Detailed characterisation of microbial communities and population dynamics also enables the study of ecological interactions. Furthermore, this method does not require culture techniques, and therefore can be used for microbial species that are difficult or time consuming to grow. This is particularly relevant for diagnostic applications, where routine culture is in decline. 20,21 However, appropriate study design for metagenomics research is not well defined and metagenomic technologies pose important technical challenges. These challenges include methodological artefacts introduced by wet laboratory methods and the effect that different computational approaches have on the analysis of multivariate and complex data. Furthermore, the ethical implications of sequencing are substantial and data privacy considerations are increasingly recognised. The multiple steps and different expertise required to generate and analyse metagenomic sequence data involve numerous decision points, which could introduce bias and affect downstream inference about the presence and abundance of microbial species in the sample.
A metagenome result should therefore be interpreted as one of many possible representations of the true sample composition of a given microbiome. Understanding and reporting sources of bias and limitations to valid inference should improve protocol performance and enable metagenomic research to proceed with transparent recognition of the limitations. However, existing reporting statements for epidemiology studies, including STROBE (STrengthening the Reporting of OBservational studies in Epidemiology) 22 and its infectious disease molecular epidemiology extension, STROME-ID (Strengthening the Reporting of Molecular Epidemiology for Infectious Diseases), 23 do not fully address issues specific to metagenomics. For this reason, scientific journals, and their readers, might not be adequately equipped with a standardised set of guidelines to evaluate and critically appraise clinical and epidemiological studies applying metagenomics. We aimed to improve the clarity and consistency of metagenomics research reporting, ranging from clinical diagnostics to microbiome studies, with suggestions for optimal practice and recommendations for robust and accurate reporting.

Titles and abstracts
The term metagenomics should be included in the title or abstract, and the keywords of the study when these methods contribute substantially to the results reported
Clear and concise language incorporating standardised terminology, with references if appropriate, enables the accurate indexing of published studies in recognised databases. This is crucial for easy information retrieval and knowledge dissemination. For example, a systematic literature review of studies applying metagenomics in encephalitis, using medical subject headings and keyword searches for the terms sequencing or metagenomics in four databases (PubMed, Embase, Web of Science, and Cochrane), 13 failed to identify two relevant studies that did not report the terms. 25,26 These studies were identified by experts in the field who were directly involved with the studies.

Describing methods and study design
Describe specimen collection, handling and storage processes, and nucleic acid extraction methods
Steps involved in sample collection, handling, and processing are frequently poorly reported in publications and yet they will have considerable effect on the results and reproducibility of a study and could introduce variability artefacts. [27][28][29][30] In particular, many studies use material banked and collected originally for other purposes. In this Review, we describe important potential sources of error and their contribution to bias.
Nucleic acids, particularly RNA, are labile. Consequently, the collection methods, addition of nucleic acid stabilisers, and time to processing can affect the results obtained. 31 To address these issues, reporting should include durations, volumes, temperatures, and methods used before, during, and after the storage of samples. 32,33 Extraction methods contribute another major source of method-induced variation (eg, by being DNA or RNA specific, or tailored to specific organism types), so should be described. 34 Other details of sample preparation methods should also be reported, including filtration, centrifugation, DNA digestion, rRNA depletion, separation into RNA or DNA, and random amplification. Standardised protocols of sample preparation methods should also be followed, if available and appropriate, and documented clearly in the publication methods. Authors should also consider submitting to standardised protocol repositories to provide transparency in the study design and methodology.

Describe sequencing methods, including sequencing depth
Different metagenomic sequencing platforms might produce different types of reads (eg, single versus paired-end, and short [100-300 bp] versus long [>1000 bp]). Sequencing platforms have different error rates, with the probability of a nucleic base being read incorrectly ranging from less than 0·01% for Illumina sequencers to 5-10% for Oxford Nanopore Technologies sequencers (current figures as of February, 2020). 35 Additionally, sequencers often read bases incorrectly in large homopolymer repeats and in GC-rich, structurally repetitive, and other complex regions of the genome. Consequent false-positive and false-negative errors need consideration when reporting species composition. 36 Sequencing depth refers to the number of times a particular nucleic base is represented within reads or the redundancy of coverage, 37 and has implications for identification of low-abundance transcripts and confidence in sequencing data. However, sequencing depth must be balanced according to the research question and the available resources. There are several factors that affect sequencing depth, including the sequencing platform and the sequence that is being read (eg, species diversity of the sample). [37][38][39]
www.thelancet.com/infection Vol 20 October 2020 e253
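To illustrate why depth has to be balanced against the research question, the following minimal sketch estimates the expected mean depth for one organism as the number of sequenced bases assigned to it divided by its genome length. The function names, the fraction_of_reads parameter, and the example values are illustrative assumptions rather than part of this guidance, and the estimate ignores uneven coverage, sequencing errors, and unmapped reads.

```python
import math


def mean_depth(total_reads, read_length_bp, genome_size_bp, fraction_of_reads=1.0):
    """Expected mean depth for one organism.

    fraction_of_reads is the (assumed) share of all reads derived from the
    organism of interest; 1.0 corresponds to a pure isolate.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    if not 0.0 <= fraction_of_reads <= 1.0:
        raise ValueError("fraction_of_reads must lie between 0 and 1")
    return total_reads * fraction_of_reads * read_length_bp / genome_size_bp


def reads_for_depth(target_depth, read_length_bp, genome_size_bp, fraction_of_reads=1.0):
    """Total reads needed to reach a target mean depth, rounded up."""
    if fraction_of_reads <= 0:
        raise ValueError("fraction_of_reads must be positive")
    return math.ceil(target_depth * genome_size_bp / (read_length_bp * fraction_of_reads))


# Hypothetical example: a 5 Mb bacterial genome sequenced with 150 bp reads.
isolate_depth = mean_depth(1_000_000, 150, 5_000_000)
# The same genome contributing only 0.1% of 10 million metagenomic reads.
metagenome_depth = mean_depth(10_000_000, 150, 5_000_000, fraction_of_reads=0.001)
print(isolate_depth, metagenome_depth)  # 30x for the isolate, about 0.3x in the metagenome
```

Under these assumed values, an organism that makes up a small fraction of the reads is covered at well below 1x even when the run yields ten times more reads than the isolate run, which is why the reported depth should be read together with sample composition.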

Describe methods used for bioinformatics analysis
For the purposes of this statement, the term bioinformatics applies to all analysis steps involving raw sequencing data, including base calling, demultiplexing, trimming and removal of reads (eg, reads of low quality, low complexity, adapters and indexes, or of human origin), read normalisation, alignment of sequence reads to reference databases, de-novo assembling of genomes, and taxonomic assignment of reads, assembled contigs, or both. There are multiple viable options for many of these tasks, with ongoing debate in the community about optimal methods, which can depend on the scientific question at hand. The field of metagenomics is developing rapidly and methods once considered best practice can be superseded following new analytical advances.
There should be clear descriptions of the bioinformatics methods used, including, at a minimum, the software name, version, and the main commands run with values for the essential parameters or flags. It is also advisable to make data and programming code open access, whether as supplementary files or shared online (eg, via GitHub or Figshare). Where possible, a version-controlled container, package, or easily installable version of the complete analytical pipeline (including all dependencies and required databases) could be made available for download and review. The open source release of bioinformatics workflows should be encouraged wherever possible to improve transparency and reproducibility, and should include adequate validation datasets, meaningful documentation, and examples of expected outputs and reports (appendix pp 1-2).
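One possible way to capture the minimum details listed above (software name, version, command, and essential parameters) is a small machine-readable manifest that can be shared as a supplementary file. The sketch below is only an illustration: the tool names, commands, parameter names, and database name are hypothetical placeholders, not real software or recommended settings.

```python
import json
import platform
from datetime import date


def pipeline_step(software, version, command, parameters):
    """Describe one analysis step with the minimum reporting details."""
    return {
        "software": software,
        "version": version,
        "command": command,
        "parameters": parameters,
    }


def build_manifest(steps, database_name, database_download_date):
    """Bundle all steps with the reference database and runtime details."""
    return {
        "generated_on": date.today().isoformat(),
        "python_version": platform.python_version(),
        "reference_database": {
            "name": database_name,
            "download_date": database_download_date,
        },
        "steps": steps,
    }


# Hypothetical tools and values, used purely as placeholders.
steps = [
    pipeline_step("example-trimmer", "1.2.0",
                  "example-trimmer --min-quality 20 reads.fastq",
                  {"min_quality": 20}),
    pipeline_step("example-classifier", "0.9.1",
                  "example-classifier --db example_db trimmed.fastq",
                  {"database": "example_db"}),
]
manifest = build_manifest(steps, "example_db", "2020-02-01")
print(json.dumps(manifest, indent=2))
```

A record of this kind does not replace a version-controlled container or a released workflow, but it gives reviewers a compact summary to check against the methods text.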

Describe quality assurance methods, including internal and external quality controls
An important strength of metagenomics analyses is their ability to detect any genomic material present within one sample. However, detection applies equally to true sample material and to any contaminating nucleic acids present in a sample, which can be introduced at any stage from sample collection to processing. For example, contamination could come from the extraction kit, the so-called kitome, 40 or at the point of specimen collection. Sampling is rarely done under completely sterile conditions, and tissues obtained from tissue banks are therefore often contaminated. Low-biomass and low-abundance sites (eg, tumours, the brain, and fetal tissues such as the placenta) are particularly prone to the risk of misclassifying contaminants.
To show attempts to ensure internal validity and reproducibility and identify potential contamination, internal controls for all extraction and sequencing processes should be reported as part of standard operating procedures. 4,27 Positive controls are usually spiked with DNA or RNA (eg, synthetic nucleic acid standards such as sequins 47), and negative controls are usually a blank (eg, water) sample or, ideally, a similar or identical matrix (tissue, body fluids, etc) that is expected to contain no microorganism genomic material based on patient factors and test results. For clinical metagenomics, formal laboratory implementation involves a system of external controls. Arranging this system of external controls is difficult; however, publicly and commercially available controls and mock community samples are now available and we recommend that their use should be reported. 48,49

Describe use of orthogonal methods to confirm pathogen identity, function, and viability
The conventional methods in microbiology for confirming the presence of a pathogen are culture or growth of the pathogen from a clinical sample and immunohistochemistry, the histological localisation of candidate species in tissue biopsies. However, traditional culture can be difficult when antibiotics have been administered before sampling or for pathogens that are slow growing, fastidious, present in low concentration, or currently undescribed. Sequencing has high discriminative power and could have higher sensitivity than culture-based methods. For example, in a polymicrobial sample, growth can be affected by the presence of other competing bacteria or by inadequate growth conditions. Metagenomics methods have consistently shown higher classification accuracy when comparing taxonomic profiles of synthetic polymicrobial samples obtained from extended quantitative culture with non-selective media. 50 Confirmatory assays appropriate to the study setting, justification for the methods used, and a description of their limitations should be reported. For cases in which confirmatory assays are not possible (eg, because of high cost or low volume of samples) an explanation should be provided. Rigorous validation of the method used, particularly for pathogens, and proficiency testing, especially in clinical laboratories, should be described (appendix pp 2-3).

Describe the criteria used to assess the role of pathogens in disease aetiology
Confirming the presence of microbial DNA or RNA in association with disease is an important step in establishing a causal relationship between a microorganism and disease. 51,52 A major challenge for metagenomics research and diagnostics is distinguishing pathogens from commensals or contaminants. 53,54 Interpretation of microbiome investigations can be further complicated if an imbalance in variation and abundance of different bacteria (sometimes referred to as dysbiosis) is suspected to be the cause of the condition. 55 It is also worth considering that the cause of some diseases might involve multiple sequential or interacting species, which can be collectively important. 56,57 Furthermore, sequencing investigations can identify novel organisms, for which the clinical significance will be unknown. Several criteria to establish causality have been proposed over the past century, including the incorporation of metagenomic technologies (appendix 7-9). 58,59

State the time from collection to results and cost consideration
The time from sample collection to processing (transport time), including cold-chain transportation and transit, can affect the compositional profile of microorganisms inferred from metagenomics. Overgrowth or degradation can occur during the period between collection and (cryo)storage, with the result that the sequencing profile might not accurately reflect the composition of the sample at the time of collection. An extended duration of storage can result in a shift in the relative representation of bacterial taxa and substantial variability in metagenomics data. For example, faecal samples stored for longer than 3 months at -80°C experience selective loss of Bacteroides spp. 6,60,61 If the sample is obtained post mortem, it is essential to report the time from death to sample acquisition, given that extravasation of gut bacteria into the bloodstream can complicate interpretation of metagenomic data. For some applications, it might be relevant to report the overall turnaround time of the bioinformatic analyses (ie, including computational time for bioinformatics analysis). For example, Oxford Nanopore technology may be deployed in the field or at point of need, allowing sequencing to be done rapidly in near real-time; still, actionable results are also dependent on the time required for computational analysis. 62,63 The turnaround time of bioinformatic analyses is crucial in the context of clinical applications, when metagenomics is used to help to guide or tailor patient treatment. Variables such as sequencing run time and total computational analysis time (with system specifications, eg, number of cores and amount of memory used) should be stated clearly, as should the sequencing depth. 64
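As a simple illustration of how these intervals could be derived consistently for reporting, the sketch below computes elapsed hours between recorded timestamps. The event names and the timestamps are hypothetical examples, and a real study would need to record the time zone and the source of each timestamp.

```python
from datetime import datetime


def hours_between(start_iso, end_iso):
    """Elapsed time in hours between two ISO 8601 timestamps."""
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    if end < start:
        raise ValueError("end timestamp precedes start timestamp")
    return (end - start).total_seconds() / 3600


# Hypothetical timestamps for a single sample.
timestamps = {
    "collected": "2020-02-03T09:00",
    "stored": "2020-02-03T13:30",
    "sequencing_started": "2020-02-05T08:00",
    "sequencing_finished": "2020-02-05T20:00",
    "analysis_finished": "2020-02-06T02:00",
}

report = {
    "collection_to_storage_h": hours_between(timestamps["collected"], timestamps["stored"]),
    "sequencing_run_h": hours_between(timestamps["sequencing_started"], timestamps["sequencing_finished"]),
    "computational_analysis_h": hours_between(timestamps["sequencing_finished"], timestamps["analysis_finished"]),
    "collection_to_result_h": hours_between(timestamps["collected"], timestamps["analysis_finished"]),
}
print(report)
```

Reporting each interval separately, rather than a single total, makes it clearer whether delays arose before storage, during sequencing, or during computational analysis.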

Setting
State whether sample collection was retrospective or prospective
As described in the STAndards for Reporting of Diagnostic accuracy (STARD) guidelines, clarity is needed regarding the sequence of events in diagnostic testing to ensure that sources of bias are addressed. 65 The analyte can degrade if there is a long time in between sample collection and the metagenomics assay. Retrospective sampling might also lead to bias in the samples tested. For instance, when comparing studies of unidentified encephalitis, samples retrospectively selected for metagenomics might be those that are difficult to diagnose (eg, with a low titre) or taken at later timepoints in the course of infection, and therefore more likely to be non-infectious. 66

Figure 1: Sources of uncertainty diagram highlighting potential contributing sources
For simplicity, this figure considers the sequencing of DNA from an environment and does not consider the process beyond the data output from the sequencer. The arrows pointing towards the central black arrow show the experimental process from left to right and the sources of variability that could contribute uncertainty. Conceptually it is clear how some of these factors contribute to systematic effects (bias). However, in addition these factors also contribute to the random error (variance) that will influence the precision of a potential finding. QC=quality control.

Most diagnostic and public health laboratories do not yet use metagenomic technologies routinely. As such, patients included in metagenomics studies are often from tertiary referral or specialist centres, which are unlikely to be representative of the wider population, as discussed in STROBE and STROME-ID. 22,23 This limitation can introduce challenges for appropriate selection of controls for case-control studies and for studies assessing the strength of disease associations.
Species composition of human microbiomes is affected by various host factors, including age, sex, behaviour (eg, diet and lifestyle), and environment. 67,68 Exposure to pharmacological substances can also profoundly influence microbiome composition. For example, a single standard course of antibiotics has been shown to alter species composition of the gut and oral microbiomes for over a year. 69,70 Matching of cases and controls is particularly challenging for metagenomics studies given the broad range of microbes considered. 71 Metagenomics studies should aim to minimise and statistically control for host confounders or, at a minimum, list those confounders that might affect interpretation of results.

Bias
Bias is a source of error that remains constant with replication, affecting trueness; 72 it is separate from random error, which affects the precision of an experiment. Together, these sources of error contribute to measurement uncertainty that, when conducting metagenomics sequencing, has many potential sources (figure 1). Replication, including replication of the whole process, provides a means to estimate random error, which can vary when using different sequencing strategies. 72 Adherence to strictly described laboratory protocols can reduce random error and improve reproducibility, 21 but it cannot be used alone to remove bias.

Address potential sources of bias (sampling, transport, storage, library preparation, and sequencing)
Bias can occur at each step of a diagnostic sequencing pipeline (panel 1) and is more difficult to evaluate than random error. For metagenomics studies, microbiological contamination of samples can introduce bias. Experimental bias that is caused at different stages of a metagenomics experiment is more challenging to control for than selection bias or contamination. The fact that the microbiome is composed of many different microorganisms means that a given protocol could lead to certain groups being over-represented in the processed samples. For example, enrichment protocols can introduce bias for pathogen detection. 73 Capture probe-targeted sequencing will limit detection to targeted sequences, and 16S rRNA gene sequencing has limitations with regard to the level of taxonomic classification. This precise form of bias does not exist in untargeted metagenomics; however, other experimental bias can occur at different protocol stages, including during sampling, nucleic acid extraction, 74 or post-extraction steps. 75 Studies using 16S should consider that different primers amplify different bacterial families with varying degrees of success because of mismatches, resulting in potential bias in abundance and diversity metrics, 76 which cannot be completely corrected bioinformatically. 77 By reporting the potential sources of bias for a given study (figure 1), their potential influence can be considered, with mitigation or compensation strategies applied or caveats made to improve interpretation. The complexity and multistep nature of microbiome measurement means that any metagenomics experiment should be considered and reported as a representative result, rather than assuming that it perfectly reflects the microbes present and their abundance. It is also why the term unbiased, which is often used when describing metagenomic experiments that do not use enrichment, should be used with caution (or not at all). The term untargeted metagenomics could be used instead (appendix pp 3-4).

Panel 1

Specimen collection methods
Collection without a cold chain, or nucleic acid stabilising agents, can cause nucleic acid degradation and potential false-negative results or overgrowth of selected organisms, which leads to misinterpretation of abundance. Multiple freeze-thaw cycles can also cause nucleic acid degradation.

Nucleic acid extraction method
The absence of a bead-beating step could make the detection of some bacteria difficult (ie, bacteria do not lyse properly so their DNA is not released and will not be sequenced). Small specimen volumes can reduce the ability to detect low-level organisms.

Sequencing library preparation
Poly-A tail enrichment of RNA will not include fragmented pathogen genomes; DNA sequencing alone will not detect RNA viruses.

Targeting of sequences
Capture probe-targeted sequencing will limit detection to targeted, known sequences. 16S targeted sequencing, as opposed to whole genome sequencing, will have limitations for the level of taxonomic classification.

Sequencing methods
High-level sample multiplexing can lead to insufficient read depth to detect organisms present at low levels. Computational contamination can occur between samples pooled on the same sequencing run when a sample barcode for a sequence is misread and the sequence is misassigned to another sample on the same run. 82 This is termed barcode bleed-through; dual barcodes reduce the rate of bleed-through dramatically compared with single barcodes. Unique molecular identifiers are an even more powerful way to identify this phenomenon when compared with dual barcodes.

Processing controls
Negative controls allow some contaminating organisms to be identified. Internal positive controls (reference standards such as sequins) reduce bias introduced by experimental variability and can improve recognition of low-level organisms.

Analysis methods
A small curated database or highly stringent criteria might not include novel or unexpected organisms, leading to false-negative results. An uncurated database or lenient criteria might also identify organisms incorrectly.

This list is not comprehensive, but illustrates how results can be affected by collection, processing, and analysis methods.

Address potential bias introduced by bioinformatics analysis
Classification algorithms rely on alignment of sequencing reads and contigs obtained from overlapping reads against reference genomes. In the case of the alignment of assembled contigs, reads that cannot be built into contigs (unassigned reads) are discarded, which can lead to a potential loss of information. 78 Because classification of reads might be slow, a smaller database could be built with unique sequences representing certain taxa. 79 However, this can lead to bias in the assignment of homologous sequences and should be clearly reported.
Samples containing low-abundance pathogens might produce false-negative results by not classifying sequencing reads as relevant or produce false-positive results if reads are non-specific. 80 Subsequent alignment of sequence reads against a reference genome of the candidate pathogen(s) identified by the metagenomics analysis can provide necessary validation: a result with wide and distributed coverage of the reference genome and high mapping identity is unlikely to be a false positive. The level of coverage might be limited in samples with low pathogen load, but such a result can still be a true positive. Sufficient read depth is not always available for metagenomics data from clinical samples, which often contain a large proportion of reads derived from the host. Additionally, high read depth can generally be achieved only for microbes present at high copy number. Authors should report where these considerations are relevant.
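The distinction between depth and distributed coverage can be made concrete with a short sketch that summarises alignments against a candidate reference genome. The function name, the interval representation (0-based, half-open start and end positions), and the toy genome are illustrative assumptions; in practice these values would be derived from an alignment file, and mapping identity would be assessed separately.

```python
def coverage_summary(alignments, genome_length):
    """Summarise breadth and mean depth from (start, end) alignment intervals.

    Intervals are assumed to be 0-based and half-open on the reference.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    # Difference array: +1 where a read starts, -1 just after it ends.
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        start, end = max(start, 0), min(end, genome_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth = covered = total = 0
    for position in range(genome_length):
        depth += diff[position]
        total += depth
        if depth > 0:
            covered += 1
    return {"breadth": covered / genome_length, "mean_depth": total / genome_length}


# Toy 1000 bp reference with ten 100 bp reads in two arrangements.
stacked = [(0, 100)] * 10                                 # all reads pile up on one region
distributed = [(i * 100, i * 100 + 100) for i in range(10)]  # reads tile the genome

print(coverage_summary(stacked, 1000))      # breadth 0.1, mean depth 1.0
print(coverage_summary(distributed, 1000))  # breadth 1.0, mean depth 1.0
```

Both arrangements give the same mean depth, but only the second shows the wide and distributed coverage that supports a true-positive interpretation; reads piled on one region are more consistent with a conserved or contaminating sequence.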
Assessing the quality of reads before downstream classification is crucial for ensuring accuracy of taxonomic assignment. This quality control usually includes removal of adapters, background sequences (human, host, or known), low-complexity sequence reads, trimming of low-quality bases at the ends of reads, and removal of primer sequences. The total number of reads in each sample can be affected by factors including DNA extraction methods, sample handling, library preparation, and differences in sequencing depth. As such, it is generally advisable to normalise read abundance between samples before any analysis and report where this is done. 81 Sophisticated statistical modelling approaches can deal with variation in read numbers between samples without loss of data (eg, DESeq2). 82
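As a minimal example of what normalising read abundance can mean, the sketch below converts raw read counts per taxon to relative abundances (total-sum scaling). This is only one simple option, chosen here for illustration; it is not the model-based approach referred to above, and the taxon names and counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon to proportions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no classified reads")
    return {taxon: n / total for taxon, n in counts.items()}


# Two hypothetical samples with the same composition but a ten-fold
# difference in the number of classified reads.
sample_a = {"Taxon A": 900, "Taxon B": 100}
sample_b = {"Taxon A": 90, "Taxon B": 10}

print(relative_abundance(sample_a))  # Taxon A 0.9, Taxon B 0.1
print(relative_abundance(sample_b))  # Taxon A 0.9, Taxon B 0.1
```

Raw counts would suggest a ten-fold difference between the samples, whereas the scaled values are identical. Relative abundances are compositional, so an increase in one taxon necessarily lowers the proportions of the others, and the normalisation method used should therefore be stated.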

Describe or address limitations of reference databases
The use of reference databases should be clearly described. It is crucial that the reference database, genomic data download date, and a description of the procedures behind the inclusion and indexing of reference sequences are clearly presented. Limitations of reference databases can interfere with correct assignment of sequences (figure 2). Curated reference databases might not include all the relevant microbial diversity. Conversely, non-curated databases can comprise incorrectly named, incomplete, low sequencing quality, or artefactual sequences. 83 Studies have shown that sequences arising from sample contamination or incompleteness (eg, an incomplete region of a genome that contains an important mutation) are frequent features of reference databases, particularly when draft [...] 24 and 2250 NCBI GenBank draft bacterial and archaeal genomes contain spurious human sequences. 84 Additionally, false-negative results might be due to a focal species missing taxonomic representation in the databases, which have an inherent curatorial bias to known human-associated pathogens (appendix pp 4-5). 85

Figure 2
The pie chart provides the full metagenomic composition with the bar providing the species composition excluding host DNA and contaminants. (B) Taxonomic profiling based on database 1. Species confidently assigned are highlighted by colours with unassigned species shown in grey. Using database 1, species A, B, and D are correctly assigned. Species that are misassigned are outlined with a circle. In this instance, sequences from species C are assigned to the closely related species C' because of the lack of a representative of species C in the reference database. Additionally, the reference database contains a partially contaminated sequence from species E, which is misassigned to contaminant sequences in the test clinical metagenomics sample. This affects the inference of species composition shown in the bar. (C) The addition of species F to database 2 allows assignment of a greater proportion of the species present in the original clinical metagenomics sample. Quality control and improvement of reference species E, now species E (QC), removes the spurious assignment of contaminant species. Species C is still misassigned to species C', its closest representative in the database. (D) Updating the reference database to include species C results in the correct assignment of sequences to species C rather than species C'. Species F is taxonomically reassigned to species X, leading to a change in the assigned species name despite no change in the data in the reference or query datasets. In all cases the pink sequences present in the original clinical metagenomics sample are not assigned as this species is not present in any of the three reference databases.

Study size
Describe clearly how power calculations were made
Whenever comparisons in metagenomic species composition between two or more groups are made, authors should report relevant parameters such as significance level, power threshold, sequencing depth, effect size, number of comparisons, methods used to correct for multiple comparisons, and details of the statistical methods used for power calculations. It should be clearly stated how an effect size was derived and a rationale for the clinical relevance of the specific effect size should be given. If no power calculation was made, an explanation should be given about why this was not considered feasible or useful (appendix pp 5-6).
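To show how the parameters listed above interact, the following sketch estimates the number of samples per group for a two-sided comparison of a continuous measure (eg, a diversity index) between two groups, with a Bonferroni correction for multiple comparisons. It rests on several assumptions that are not part of this guidance: a normal approximation, equal group sizes, and a standardised effect size (difference in means divided by the common SD). Microbiome data often violate these assumptions, so dedicated methods might be more appropriate.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8, n_comparisons=1):
    """Approximate samples per group for a two-sided, two-group comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2,
    with alpha divided by the number of comparisons (Bonferroni).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    adjusted_alpha = alpha / n_comparisons
    z_alpha = NormalDist().inv_cdf(1 - adjusted_alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical standardised effect size of 0.5.
single_test = samples_per_group(0.5)                  # 63 per group
many_taxa = samples_per_group(0.5, n_comparisons=100)  # larger, because alpha is stricter
print(single_test, many_taxa)
```

The required sample size rises as the number of taxa compared increases, which is one reason why the number of comparisons and the correction method should be reported alongside the significance level and power threshold.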

State the limit of detection, including analytical sensitivity and specificity
The limit of detection (LOD) refers to the minimum quantity of genomic material from an organism required for its detection and should be stated in metagenomics studies. Determination of the LOD for a metagenomics study is dependent on the sequencing technology, sequencing depth, read length, representation of genomes related to the taxa of interest in the reference database, and the complexity of the community and amount of host nucleic acid in the sample. Simple calculations give estimates for the LOD (eg, for 10⁶ reads per sample, the LOD is one read per sample), which corresponds to a relative abundance of the order of magnitude of 10⁻⁶ (ie, ~0·0001%). Formal calculations of LOD that are needed for clinical validation should be done using probit analysis. 86 In practice, the LOD will be considerably higher than that derived from these calculations because a single read from a taxon is very likely to be due to contamination or misclassification. Rather than trusting such calculations, the use of positive (spiked) controls and negative controls in the sequencing run allows assessment of sensitivity and specificity. With a single infection, the number of on-target reads will be correlated with the signal in the sample but mixed infections and co-infections will influence sensitivity. 87 Experimentally validating these for model organisms that represent the specific pathogens of interest (eg, a DNA virus, an RNA virus, Gram-negative and Gram-positive bacteria, etc) is recommended, particularly for diagnostic tests.
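The simple calculation above can be written out as a short sketch. It also shows why the practical LOD is higher than the naive estimate. The sketch assumes that reads are sampled independently and in proportion to relative abundance, so that the read count for a taxon is approximately Poisson distributed. It ignores contamination, misclassification, and extraction bias, and it does not replace probit analysis or spiked controls. The function names are hypothetical.

```python
from math import exp


def naive_lod(total_reads):
    """Relative abundance that corresponds to one read out of total_reads."""
    return 1 / total_reads


def detection_probability(relative_abundance, total_reads, min_reads=1):
    """Probability of observing at least min_reads reads from a taxon.

    Assumption (illustrative only): the read count is Poisson distributed
    with mean relative_abundance * total_reads.
    """
    expected_reads = relative_abundance * total_reads
    # P(X >= k) = 1 - sum_{i < k} exp(-lambda) * lambda^i / i!
    term = exp(-expected_reads)
    cumulative = 0.0
    for i in range(min_reads):
        cumulative += term
        term *= expected_reads / (i + 1)
    return 1 - cumulative


if __name__ == "__main__":
    depth = 10**6
    lod = naive_lod(depth)  # 1e-06, ie 0.0001%
    print(lod)
    print(detection_probability(lod, depth))               # at least one read
    print(detection_probability(lod, depth, min_reads=2))  # at least two reads
```

Under these assumptions, a taxon present exactly at the naive LOD yields at least one read in only about 63% of runs (1 − e⁻¹), and at least two reads in about 26% of runs. Any read-count threshold applied to guard against contamination therefore raises the effective LOD, and the threshold should be reported alongside the sequencing depth.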

Discussion
Attempt or acknowledge the need for functional or phenotypic validation
Genotypic data do not always correlate with clinical phenotype; for example, when the underlying mechanisms involve inducible resistance, gene expression and regulation, or post-translational modifications. In studies investigating mixed microbial communities it may not always be possible to determine which taxon a particular gene belongs to. 88,89 This is also relevant in the establishment of causality. Efforts should be made to undertake phenotypic and functional validation to assess the inferred results. If this is not possible, or beyond the scope of the study, the limitations of inferring results solely from genotypic data should be acknowledged and discussed, including known caveats and restrictions on making key assumptions.

Consider the need for species or strain resolution
Different strains or lineages within a species can differ widely in their phenotypic characteristics. For example, sequencing with strain-level resolution enabled identification of specific strains of Escherichia coli associated with necrotising enterocolitis in preterm newborns 90 and lineages of Salmonella enterica associated with varying clinical phenotypes. 91 Therefore, profiling microbial communities with subspecies resolution can be useful, although de novo assembly of metagenomic reads remains a methodological challenge.
The strain and species resolution capacity of the assay used should be clearly stated with consideration for how the resolution applies to the study in question. In particular, microbial community profiling using 16S rRNA gene sequencing cannot identify individual species within some genera and should never be used to identify to the strain level. As recommended in STROME-ID, a definition or reference to published definitions of a strain should be provided. 23

Report any ethical considerations with specific implications for metagenomics
Metagenomics produces a vast amount of host and pathogen data, which are untargeted and sometimes not of immediate interest. 92 Molecular methods to deplete human genomic material exist; however, they remain imperfect. It might be sufficient to detail in a protocol that the host data will be removed, and not analysed, although this approach could lead to bias in microbial reads caused by the in silico host-depletion method, because host genomes can contain viable viral genomes and non-viable genetic material derived from or shared with microorganisms. In these cases, the method used to identify and exclude host reads (eg, through mapping of all reads to the host reference genome) should be reported, including the choice of mapping algorithm and programme parameters.
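As an illustration of why the mapping parameters matter, the following minimal sketch separates host from non-host reads in SAM-format alignments to a host reference genome. It relies on two features of the SAM format: the FLAG bit 0x4 marks an unmapped segment, and the fifth column holds the mapping quality (MAPQ). The function name `split_host_reads`, the `min_mapq` threshold, and the example records are hypothetical, and a read name is treated as host if any of its records maps at or above the threshold, which is a conservative choice for paired reads. Real pipelines typically use dedicated tools for this step.

```python
UNMAPPED = 0x4  # SAM FLAG bit: segment unmapped


def split_host_reads(sam_lines, min_mapq=0):
    """Return (non_host_names, host_names) from SAM alignments to a host genome.

    A read name is classed as host if any of its records is mapped with
    MAPQ >= min_mapq (illustrative rule; report whichever rule is used).
    """
    seen, host = set(), set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, mapq = fields[0], int(fields[1]), int(fields[4])
        seen.add(name)
        if not (flag & UNMAPPED) and mapq >= min_mapq:
            host.add(name)
    return seen - host, host


# Hypothetical records: r1 maps confidently, r2 is unmapped, r3 maps with low MAPQ.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF",
    "r3\t0\tchr1\t200\t3\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF",
]

if __name__ == "__main__":
    print(split_host_reads(example_sam))               # r3 removed as host
    print(split_host_reads(example_sam, min_mapq=10))  # r3 retained as non-host
```

In this example the low-confidence alignment r3 is discarded as host with no quality threshold but retained for microbial analysis with a threshold of 10. The same raw data can therefore yield different microbial read sets depending on the parameters chosen.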

Even if data analysis is restricted to non-human reads, it could still unveil potentially sensitive information, 93 such as a new diagnosis of HIV. It has also been shown that more than 80% of individuals can be identified from populations of hundreds using their gut microbiome profile. 94 These issues pose real concerns, particularly with the increasing requirement for data to be made publicly available. For all these reasons, specific ethical implications relating to metagenomics data and corresponding approvals should be stated, and appropriate ethical approval should be obtained.

Conclusions
Metagenomics has already made a significant impact on pathogen detection and characterisation, and we probably still underestimate its full potential. Increasing use of metagenomics has been accompanied by recognition of complex issues at every stage in the pipeline (ie, sample collection, sequencing, and analysis). Standards for reporting are therefore needed to ensure clarity, consistency, and robustness of research. The guidance given in this paper constitutes a set of recommendations and we recognise that research studies need to be pragmatic and use available resources. Nonetheless, reporting known and potential limitations should minimise misrepresentation. It is inevitable that the field of metagenomics will continue to advance steadily and these guidelines will need to be updated.

Contributors
TB and NF conceived the idea and, together with CO, coordinated the Review. DOS and JH designed figure 1 and LvD and FB designed figure 2. All authors were involved in the study design, literature review, writing the manuscript, and editing successive drafts.

Search strategy and selection criteria
In 2018, a STROBE-metagenomics working group was established. Members were identified through notable researchers in the field and formed a geographically diverse group of epidemiologists, statisticians, bioinformaticians, neurologists, virologists, microbiologists, and specialists in public health and infectious diseases. Participants met to agree the structure and content of the statement, and the proposal was registered with the Equator Network. 24 Specific issues to be covered were identified (panel 2). A systematic approach was taken to gather evidence to support the recommendations. Literature searches were done in PubMed using Medical Subject Headings terms and the keywords "(?sequenc* OR metagenom* OR Illumina OR RNA-seq OR RNASeq OR (Roche 454) OR (Ion torrent) OR (Proton / PGM) OR MiSeq OR HiSeq OR NextSeq OR MinION OR Nanopore OR PacBio) AND (infectio* OR microorganism OR microorganisms OR pathogen OR pathogens OR bacteria* OR virus OR viral OR fungus OR fungi OR parasite OR parasites OR parasitic)", and were supplemented by searching the references of articles and by expert opinion from within the group. Articles were limited to those in English language published between January, 2000, and June, 2019. Areas that were adequately addressed in existing STROBE 22 and STROME-ID 23 statements were not covered. Iterative versions of the guidelines and manuscript were circulated to develop a consensus. The STROBE-metagenomics extension has been developed to complement the STROBE and STROME-ID statements, with the new recommendations organised alongside the existing table. The guidelines discussed therefore cover only the new proposals for reporting.

Panel 2: Key issues to be addressed in publications applying metagenomics
• Specimen collection, handling, preservation, and storage
• Nucleic acid extraction
• Sequencing instrumentation and processing, including library preparation
• Bioinformatic analysis method, including workflow, database composition, and parameterisation
• Quality assurance measures, including internal quality control, such as the use of adequate internal and external controls
• Limits of detection, including analytical sensitivity, and specificity for clinical testing
• Power and sample size calculations
• Use of orthogonal methods to confirm sequencing results
• Criteria to confirm the role of pathogen(s) in disease aetiology
• Turnaround time
• Cost
• Ethical considerations
• Specific issues related to applications, such as in the diagnosis of CNS infections, and investigation of antimicrobial resistance