Two pandemics of respiratory distress diseases associated with zoonotic introductions of the species Severe acute respiratory syndrome-related coronavirus in the human population during the 21st century raised unprecedented interest in coronavirus research and assigned it unseen urgency. The two viruses responsible for the outbreaks, SARS-CoV and SARS-CoV-2, respectively, are in the spotlight, and SARS-CoV-2 is the focus of the current fast-paced research. Its foundation was laid down by studies of many corona- and related viruses that collectively form the vast order Nidovirales. Comparative genomics of nidoviruses played a key role in this advancement over more than 30 years. It facilitated the transfer of knowledge from characterized to newly identified viruses, including SARS-CoV and SARS-CoV-2, and contributed to the dissection of the nidovirus proteome and the identification of patterns of variation between different taxonomic groups, from species to families. This review revisits selected cases of protein conservation and variation that define nidoviruses, illustrates the remarkable plasticity of the proteome during nidovirus adaptation, and asks questions at the interface of the proteome and processes that are vital for nidovirus reproduction and could inform the ongoing research of SARS-CoV-2. (c) 2020 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Motivation: To facilitate accurate estimation of statistical significance of sequence similarity in profile-profile searches, queries should ideally correspond to protein domains. For multidomain proteins, using domains as queries depends on delineation of domain borders, which may be unknown. Thus, full-length proteins are commonly used as queries, which complicates establishing homology for similarities close to cutoff levels of statistical significance. Results: In this article, we describe an iterative approach, called LAMPA, LArge Multidomain Protein Annotator, that resolves the above conundrum by gradual expansion of hit coverage of multidomain proteins through re-evaluating statistical significance of hit similarity using ever smaller queries defined at each iteration. LAMPA employs TMHMM and HHsearch for recognition of transmembrane regions and homology, respectively. We used the Pfam database for annotating 2985 multidomain proteins (polyproteins) composed of >1000 amino acid residues, which dominate proteomes of RNA viruses. Under strict cutoffs, LAMPA outperformed HHsearch-mediated runs using intact polyproteins as queries by three measures: number of and coverage by identified homologous regions, and number of hit Pfam profiles. Compared to HHsearch, LAMPA identified 507 extra homologous regions in 14.4% of polyproteins. This Pfam-based annotation of RNA virus polyproteins by LAMPA was also superior to RefSeq expert annotation by two measures, region number and annotated length, for 69.3% of RNA virus polyprotein entries. We rationalized the obtained results based on the dependence of HHsearch hit statistical significance for local alignment similarity scores on the lengths and diversities of query-target pairs in computational experiments.
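The iterative idea described in this abstract (re-searching with ever smaller queries taken from regions not yet covered by hits) can be sketched roughly as follows. This is a minimal illustration and not the LAMPA implementation: the `search` callback stands in for a profile search such as HHsearch, the `min_len` threshold and all function names are assumptions made here, and the transmembrane-region step (TMHMM) is not modeled.

```python
from typing import Callable, List, Tuple

Region = Tuple[int, int]                  # half-open [start, end) residue range
SearchFn = Callable[[str], List[Region]]  # hypothetical: significant hits in query coordinates


def uncovered(length: int, hits: List[Region], min_len: int) -> List[Region]:
    """Return stretches of at least min_len residues not covered by any hit."""
    gaps, pos = [], 0
    for start, end in sorted(hits):
        if start - pos >= min_len:
            gaps.append((pos, start))
        pos = max(pos, end)
    if length - pos >= min_len:
        gaps.append((pos, length))
    return gaps


def iterative_annotate(protein: str, search: SearchFn, min_len: int = 50) -> List[Region]:
    """Annotate a protein by re-searching uncovered regions as smaller queries."""
    annotated: List[Region] = []
    queue: List[Region] = [(0, len(protein))]
    while queue:
        q_start, q_end = queue.pop()
        q_len = q_end - q_start
        # Keep only well-formed, non-empty hits so every new query is strictly shorter.
        hits = [(s, e) for s, e in search(protein[q_start:q_end]) if 0 <= s < e <= q_len]
        if not hits:
            continue  # nothing significant in this query; stop splitting it
        annotated.extend((q_start + s, q_start + e) for s, e in hits)
        # Each uncovered stretch becomes a new, smaller query for the next iteration.
        for g_start, g_end in uncovered(q_len, hits, min_len):
            queue.append((q_start + g_start, q_start + g_end))
    return sorted(annotated)
```

The loop terminates because a query is split only when it contains at least one non-empty hit, so every derived query is shorter than its parent.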
The present outbreak of a coronavirus-associated acute respiratory disease called coronavirus disease 19 (COVID-19) is the third documented spillover of an animal coronavirus to humans in only two decades that has resulted in a major epidemic. The Coronaviridae Study Group (CSG) of the International Committee on Taxonomy of Viruses, which is responsible for developing the classification of viruses and taxon nomenclature of the family Coronaviridae, has assessed the placement of the human pathogen, tentatively named 2019-nCoV, within the Coronaviridae. Based on phylogeny, taxonomy and established practice, the CSG recognizes this virus as forming a sister clade to the prototype human and bat severe acute respiratory syndrome coronaviruses (SARS-CoVs) of the species Severe acute respiratory syndrome-related coronavirus, and designates it as SARS-CoV-2. In order to facilitate communication, the CSG proposes to use the following naming convention for individual isolates: SARS-CoV-2/host/location/isolate/date. While the full spectrum of clinical manifestations associated with SARS-CoV-2 infections in humans remains to be determined, the independent zoonotic transmission of SARS-CoV and SARS-CoV-2 highlights the need for studying viruses at the species level to complement research focused on individual pathogenic viruses of immediate significance. This will improve our understanding of virus-host interactions in an ever-changing environment and enhance our preparedness for future outbreaks.
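The isolate naming convention quoted in this abstract, SARS-CoV-2/host/location/isolate/date, is a fixed-order, slash-separated record. A minimal sketch of building and parsing such names is shown below; the function names and the validation rule (no empty fields, no slashes inside a field) are assumptions made here for illustration and are not part of the CSG proposal.

```python
FIELDS = ("virus", "host", "location", "isolate", "date")


def format_isolate_name(host: str, location: str, isolate: str, date: str,
                        virus: str = "SARS-CoV-2") -> str:
    """Join the fields in the order virus/host/location/isolate/date."""
    values = (virus, host, location, isolate, date)
    for name, value in zip(FIELDS, values):
        # Assumed rule: a field must be non-empty and must not contain the separator.
        if not value or "/" in value:
            raise ValueError(f"invalid {name} field: {value!r}")
    return "/".join(values)


def parse_isolate_name(name: str) -> dict:
    """Split a name back into its five fields, keyed by field name."""
    parts = name.split("/")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))
```

With placeholder values, `format_isolate_name("human", "CityA", "X1", "2020")` yields a name that `parse_isolate_name` splits back into the same five fields.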