Search results

(1 - 16 of 16)

Liu, X. 2022

To explore drug space smarter: artificial intelligence in drug design for G protein-coupled receptors

Doctoral Thesis

open access

Over several decades, a variety of computational methods for drug discovery have been proposed and applied in practice. With the accumulation of data and the development of machine learning methods, computational drug design methods have gradually shifted to a new paradigm, i.e. deep learning methods have attracted particular interest in drug design. In this study, a new deep learning-based method (DrugEx) was proposed to design de novo drug-like molecules. Candidate molecules designed by DrugEx were shown to have a larger chemical diversity and to better cover the chemical space of known ligands. In order to address the issue of polypharmacology, the DrugEx algorithm was updated with multi-objective optimization towards multiple targets. The results of its application demonstrated the generation of compounds with a diverse predicted selectivity profile toward multiple targets, offering the potential of high efficacy and lower toxicity. In order to improve its generality, DrugEx was further updated to have the capability of designing molecules based on given scaffolds. We extended the architecture of the Transformer to deal with each molecule as a graph. As a proof of concept, 100% of the generated molecules were valid, and most of them had a predicted high affinity towards A2AAR with the given scaffolds. Moreover, GenUI was developed as a visualization software platform that makes it possible to integrate molecular generators within a feature-rich graphical user interface to facilitate collaboration in the disparate communities interested in computer-aided drug discovery. These studies highlight the overwhelming power of AI methods in drug discovery.

Ortiz Zacarías, N.V.; Bemelmans, M.P.; Handel, T.M.; Visser, K.E. de; Heitman, L.H. 2021

Anticancer opportunities at every stage of chemokine function

Article / Letter to editor

open access

The chemokine system, comprising 48 chemokines and 23 receptors, is critically involved in several hallmarks of cancer. Yet, despite extensive efforts from the pharmaceutical sector, only two drugs aimed at this system are currently approved for clinical use against cancer. To date, numerous pharmacological approaches have been developed to successfully intervene at different stages of chemokine function: (i) chemokine availability; (ii) chemokine-glycosaminoglycan binding; and (iii) chemokine receptor binding. Many of these strategies have been tested in preclinical cancer models, and some have advanced to clinical trials as potential anticancer therapies. Here we will review the strategies and growing pharmacological toolbox for manipulating the chemokine system in cancer, and address novel methods poised for future (pre)clinical testing.

Wang, X.; Jespers, W.; Prieto-Díaz, R.; Majellaro, M.; IJzerman, A.P.; Westen, G.J.P. van; ... ; Heitman, L.H. 2021

Identification of V6.51L as a selectivity hotspot in stereoselective A2B adenosine receptor antagonist recognition

Article / Letter to editor

open access

The four adenosine receptors (ARs) A1AR, A2AAR, A2BAR, and A3AR are G protein-coupled receptors (GPCRs) for which an exceptional amount of experimental and structural data is available. Still, limited success has been achieved in getting new chemical modulators on the market. As such, there is a clear interest in the design of novel selective chemical entities for this family of receptors. In this work, we investigate the selective recognition of ISAM-140, a recently reported A2BAR reference antagonist. A combination of semipreparative chiral HPLC, circular dichroism and X-ray crystallography was used to separate and unequivocally assign the configuration of each enantiomer. Subsequent affinity evaluation for both A2A and A2B receptors demonstrated the stereospecific and selective recognition of (S)-ISAM-140 by the A2BAR. Molecular modeling suggested that the structural determinant of this selectivity profile is residue V250^6.51 in A2BAR, which is a leucine in all other ARs including the closely related A2AAR. This was confirmed by radioligand binding assays and rigorous free energy perturbation (FEP) calculations performed on the L249V^6.51 mutant A2AAR receptor. Taken together, this study provides further insights into the binding mode of these A2BAR antagonists, paving the way for future ligand optimization.

Yang, X.; Dilweg, M.A.; Osemwengie, D.; Burggraaff, L.; Es, D. van der; Heitman, L.H.; IJzerman, A.P. 2020

Design and pharmacological profile of a novel covalent partial agonist for the adenosine A1 receptor

Article / Letter to editor

open access

Partial agonists for G protein-coupled receptors (GPCRs) provide opportunities for novel pharmacotherapies with enhanced on-target safety compared to full agonists. For the human adenosine A1 receptor (hA1AR) this has led to the discovery of capadenoson, which has been in phase IIa clinical trials for heart failure. Accordingly, the design and profiling of novel hA1AR partial agonists has become an important research focus. In this study, we report on LUF7746, a capadenoson derivative bearing an electrophilic fluorosulfonyl moiety, as an irreversibly binding hA1AR modulator. Meanwhile, a nonreactive ligand bearing a methylsulfonyl moiety, LUF7747, was designed as a control probe in our study. In a radioligand binding assay, LUF7746's apparent affinity increased to the nanomolar range with longer pre-incubation time, suggesting an increasing level of covalent binding over time. Moreover, compared to the reference full agonist CPA, LUF7746 was a partial agonist in a hA1AR-mediated G protein activation assay and resistant to blockade with an antagonist/inverse agonist. An in silico structure-based docking study combined with site-directed mutagenesis of the hA1AR demonstrated that amino acid Y271^7.36 was the primary anchor point for the covalent interaction. Additionally, a label-free whole-cell assay was set up to identify LUF7746's irreversible activation of an A1 receptor-mediated cell morphological response. These results led us to conclude that LUF7746 is a novel covalent hA1AR partial agonist and a valuable chemical probe for further mapping the receptor activation process. It may also serve as a prototype for a therapeutic approach in which a covalent partial agonist may cause fewer on-target side effects, conferring enhanced safety compared to a full agonist.

Massink, A.; Amelia, T.; Karamychev, A.; IJzerman, A.P. 2019

Allosteric modulation of G protein-coupled receptors by amiloride and its derivatives. Perspectives for drug discovery?

Article / Letter to editor

open access

The function of G protein-coupled receptors (GPCRs) can be modulated by compounds that bind to other sites than the endogenous orthosteric binding site, so-called allosteric sites. Structure elucidation of a number of GPCRs has revealed the presence of a sodium ion bound in a conserved allosteric site. The small molecule amiloride and analogs thereof have been proposed to bind in this same sodium ion site. Hence, this review seeks to summarize and reflect on the current knowledge of allosteric effects by amiloride and its analogs on GPCRs. Amiloride is known to modulate adenosine, adrenergic, dopamine, chemokine, muscarinic, serotonin, gonadotropin-releasing hormone, GABA(B), and taste receptors. Amiloride analogs with lipophilic substituents tend to be more potent modulators than amiloride itself. Adenosine, alpha-adrenergic and dopamine receptors are most strongly modulated by amiloride analogs. In addition, for a few GPCRs, more than one binding site for amiloride has been postulated. Interestingly, the nature of the allosteric effect of amiloride and derivatives varies considerably between GPCRs, with both negative and positive allosteric modulation occurring. Since the sodium ion binding site is strongly conserved among class A GPCRs, it is to be expected that amiloride also binds to class A GPCRs not evaluated yet. Investigating this typical amiloride-GPCR interaction further may yield general insight into the allosteric mechanisms of GPCR ligand binding and function, and possibly provide new opportunities for drug discovery.

Ortiz Zacarías, N.V.; Lenselink, E.B.; IJzerman, A.P.; Handel, T.M.; Heitman, L.H. 2018

Intracellular Receptor Modulation: Novel Approach to Target GPCRs

Article / Letter to editor

open access

Recent crystal structures of multiple G protein-coupled receptors (GPCRs) have revealed a highly conserved intracellular pocket that can be used to modulate these receptors from the inside. This novel intracellular site partially overlaps with the G protein and β-arrestin binding site, providing a new manner of pharmacological intervention. Here we provide an update of the architecture and function of the intracellular region of GPCRs, until now portrayed as the signaling domain. We review the available evidence on the presence of intracellular binding sites among chemokine receptors and other class A GPCRs, as well as different strategies to target them, including small molecules, pepducins, and nanobodies. Finally, the potential advantages of intracellular (allosteric) ligands over orthosteric ligands are also discussed.

Yang, X.; Dong, G.; Michiels, T.J.M.; Lenselink, E.B.; Heitman, L.H.; Louvel, J.A.; IJzerman, A.P. 2017

A covalent antagonist for the human adenosine A_2A receptor

Article / Letter to editor

open access

The structure of the human A(2A) adenosine receptor has been elucidated by X-ray crystallography with a high affinity non-xanthine antagonist, ZM241385, bound to it. This template molecule served as a starting point for the incorporation of reactive moieties that cause the ligand to covalently bind to the receptor. In particular, we incorporated a fluorosulfonyl moiety onto ZM241385, which yielded LUF7445 (4-((3-((7-amino-2-(furan-2-yl)-[1,2,4]triazolo[1,5-a][1,3,5]triazin-5-yl)amino)propyl)carbamoyl)benzene sulfonyl fluoride). In a radioligand binding assay, LUF7445 acted as a potent antagonist, with an apparent affinity for the hA(2A) receptor in the nanomolar range. Its apparent affinity increased with longer incubation time, suggesting an increasing level of covalent binding over time. An in silico A(2A)-structure-based docking model was used to study the binding mode of LUF7445. This led us to perform site-directed mutagenesis of the A(2A) receptor to probe and validate the target lysine amino acid K153 for covalent binding. Meanwhile, a functional assay combined with wash-out experiments was set up to investigate the efficacy of covalent binding of LUF7445. All these experiments led us to conclude LUF7445 is a valuable molecular tool for further investigating covalent interactions at this receptor. It may also serve as a prototype for a therapeutic approach in which a covalent antagonist may be needed to counteract prolonged and persistent presence of the endogenous ligand adenosine.

Yang, X.; Dong, G.; Michiels, T.J.M.; Lenselink, E.B.; Heitman, L.H.; Louvel, J.A.; IJzerman, A.P. 2017

A covalent antagonist for the human adenosine A2A receptor

Article / Letter to editor

open access

Xue Yang, Guo Dong, Thomas J.M. Michiels, Eelke B. Lenselink, Laura Heitman, Julien Louvel, Ad P. IJzerman. Abstract: The structure of the human A2A adenosine receptor has been elucidated by X-ray crystallography with a high affinity non-xanthine antagonist, ZM241385, bound to it. This template molecule served as a starting point for the incorporation of reactive moieties that cause the ligand to covalently bind to the receptor. In particular, we incorporated a fluorosulfonyl moiety onto ZM241385, which yielded LUF7445 (4-((3-((7-amino-2-(furan-2-yl)-[1,2,4]triazolo[1,5-a][1,3,5]triazin-5-yl)amino)propyl)carbamoyl)benzene sulfonyl fluoride). In a radioligand binding assay, LUF7445 acted as a potent antagonist, with an apparent affinity for the hA2A receptor in the nanomolar range. Its apparent affinity increased with longer incubation time, suggesting an increasing level of covalent binding over time. An in silico A2A-structure-based docking model was used to study the binding mode of LUF7445. This led us to perform site-directed mutagenesis of the A2A receptor to probe and validate the target lysine amino acid K153 for covalent binding. Meanwhile, a functional assay combined with wash-out experiments was set up to investigate the efficacy of covalent binding of LUF7445. All these experiments led us to conclude LUF7445 is a valuable molecular tool for further investigating covalent interactions at this receptor. It may also serve as a prototype for a therapeutic approach in which a covalent antagonist may be needed to counteract prolonged and persistent presence of the endogenous ligand adenosine.

Yang, X.; Dong, G.; Michiels, T.J.M.; Lenselink, E.B.; Heitman, L.; Louvel, J.A.; IJzerman, A.P. 2017

A covalent antagonist for the human adenosine A(2A) receptor

Article / Letter to editor

open access

Yang, X.; Dong, G.; Michiels, T.J.M.; Lenselink, E.B.; Heitman, L.H.; Louvel, J.A.; IJzerman, A.P. 2017

A covalent antagonist for the human adenosine A2A receptor

Article / Letter to editor

open access

The structure of the human A2A adenosine receptor has been elucidated by X-ray crystallography with a high affinity non-xanthine antagonist, ZM241385, bound to it. This template molecule served as a starting point for the incorporation of reactive moieties that cause the ligand to covalently bind to the receptor. In particular, we incorporated a fluorosulfonyl moiety onto ZM241385, which yielded LUF7445 (4-((3-((7-amino-2-(furan-2-yl)-[1,2,4]triazolo[1,5-a][1,3,5]triazin-5-yl)amino)propyl)carbamoyl)benzene sulfonyl fluoride). In a radioligand binding assay, LUF7445 acted as a potent antagonist, with an apparent affinity for the hA2A receptor in the nanomolar range. Its apparent affinity increased with longer incubation time, suggesting an increasing level of covalent binding over time. An in silico A2A-structure-based docking model was used to study the binding mode of LUF7445. This led us to perform site-directed mutagenesis of the A2A receptor to probe and validate the target lysine amino acid K153 for covalent binding. Meanwhile, a functional assay combined with wash-out experiments was set up to investigate the efficacy of covalent binding of LUF7445. All these experiments led us to conclude LUF7445 is a valuable molecular tool for further investigating covalent interactions at this receptor. It may also serve as a prototype for a therapeutic approach in which a covalent antagonist may be needed to counteract prolonged and persistent presence of the endogenous ligand adenosine. Keywords: A2A adenosine receptor; Adenosine; Covalent antagonist; G protein-coupled receptors; Radioligand binding

Hillger, J.M.; Diehl, C.; Spronsen, E. van; Boomsma, D.I.; Slagboom, P.E.; Heitman, L.H.; IJzerman, A.P. 2016

Getting personal: Endogenous adenosine receptor signaling in lymphoblastoid cell lines

Article / Letter to editor

open access

J.M. Hillger, C. Diehl, E. van Spronsen, D.I. Boomsma, P.E. Slagboom, L.H. Heitman, A.P. IJzerman. Division of Medicinal Chemistry, LACDR, Leiden University, The Netherlands; Department of Biological Psychology, VU University Amsterdam, The Netherlands; Section of Molecular Epidemiology, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, The Netherlands. Abstract: Genetic differences between individuals that affect drug action form a challenge in drug therapy. Many drugs target G protein-coupled receptors (GPCRs), and a number of receptor variants have been noted to impact drug efficacy. This, however, has never been addressed in a systematic way, and, hence, we studied real-life genetic variation of receptor function in personalized cell lines. As a showcase we studied adenosine A2A receptor (A2AR) signaling in lymphoblastoid cell lines (LCLs) derived from a family of four from the Netherlands Twin Register (NTR), using a non-invasive label-free cellular assay. The potency of a partial agonist differed significantly for one individual. Genotype comparison revealed differences in two intron SNPs including rs2236624, which has been associated with caffeine-induced sleep disorders. While further validation is needed to confirm genotype-specific effects, this set-up clearly demonstrated that LCLs are a suitable model system to study genetic influences on A2AR response in particular and GPCR responses in general.

Hillger, J.M.; Diehl, C.; Spronsen, E. van; Boomsma, D.I.; Slagboom, P.E.; Heitman, L.H.; IJzerman, A.P. 2016

Getting personal: Endogenous adenosine receptor signaling in lymphoblastoid cell lines

Article / Letter to editor

metadata only

Zweemer, A.J.M. 2014

The ins and outs of ligand binding to CCR2

Doctoral Thesis

open access

This thesis provides novel insights into the molecular mechanism of action of antagonists for the chemokine receptor CCR2. CCR2 belongs to the protein family of G protein-coupled receptors (GPCRs). It is involved in several inflammatory diseases and therefore many small molecule antagonists targeting this receptor have been developed over the years. Unfortunately, all clinical candidates tested so far appeared to lack efficacy in man, which stresses the need for a better understanding of their mechanism of action. This thesis revealed three separate binding pockets throughout the transmembrane receptor domain via which CCR2 can be pharmacologically modulated. Different routes towards insurmountable antagonism of CCR2 were described, either via noncompetitive or via long residence time antagonists. These results may allow a more rational design of future antagonists, and are equally important to understand the outcomes of studies with existing CCR2 antagonists. In concert with the currently expanding insight into the structure and signalling capacities of GPCRs, the data presented in this thesis allow better fine-tuning of the pharmacological modulation of the chemokine receptor CCR2, and of GPCRs in general.

Hoogendoorn, S. 2014

A chemical biology approach for targeting of ligand-drug conjugates

Doctoral Thesis

open access

Cells express a large array of membrane receptors on their surface that function as a communication channel between the extra- and intracellular environment of the cell. Ligands for these receptors span a wide range of biomolecules, from proteins to carbohydrates to small molecules. Some receptors are continuously recycling between the membrane and the inside of a cell, whereas others are in a steady-state at the membrane and need ligand binding for their activation and subsequent internalization. Synthetic molecules that bind to these membrane receptors can be used to either modulate their function, or to target a reporter group (i.e. a fluorescent dye) and/or a bio-active compound (drug, protein) to cells that express this receptor, ensuring delivery to a specific cell-type. The research described in this thesis combines synthetic and biochemical methodologies to create ligands that interact selectively with membrane receptors of the GPCR and lectin-binding families. Attachment of synthetic probes, proteins or cytostatic molecules to these ligands by a variety of chemical and enzymatic methods ensured their uptake exclusively into cells that expressed the receptor of interest. Visualization of this process was enabled by the incorporation of a fluorescent dye into the final constructs.

Horst, E. van der 2012

Drugs, structures, fragments : substructure-based approaches to GPCR drug discovery and design

Doctoral Thesis

open access

This thesis is all about cheminformatics, and its impact on drug discovery. A number of strategies are discussed that apply computational methods for the analysis and design of G protein-coupled receptor (GPCR) ligands. Frequent substructure mining is applied to find the common structural motifs that are discriminative for predefined classes of GPCR ligands. In addition, this approach is extended to cluster GPCRs to suggest a new classification for this receptor superfamily. Furthermore, substructure analysis is utilised to screen for new adenosine A2A receptor ligands. Finally, an automated de novo design approach is described that is used for the design of new adenosine A1 receptor ligands using a multi-objective evolutionary algorithm.

Ye, K. 2008

Novel algorithms for protein sequence analysis

Doctoral Thesis

open access

Each protein is characterized by its unique sequential order of amino acids, the so-called protein sequence. Biology's paradigm is that this order of amino acids determines the protein's architecture and function. In this thesis, we introduce novel algorithms to analyze protein sequences. Chapter 1 begins with the introduction of amino acids, proteins and protein families. Then fundamental techniques from computer science related to the thesis are briefly described. Making a multiple sequence alignment (MSA) and constructing a phylogenetic tree are traditional means of sequence analysis. Information entropy, feature selection and sequential pattern mining provide alternative ways to analyze protein sequences, and they all originate from computer science.
In Chapter 2, information entropy was used to measure the conservation at a given position of the alignment. From an alignment that is grouped into subfamilies, two types of information entropy values are calculated for each position in the MSA. One is the average entropy for a given position among the subfamilies, the other is the entropy for the same position in the entire multiple sequence alignment. This so-called two-entropies analysis, or TEA for short, yields a scatter plot in which all positions are represented with their two entropy values as x- and y-coordinates. The different locations of the positions (or dots) in the scatter plot are indicative of various conservation patterns and may suggest different biological functions. The globally conserved positions show up at the lower left corner of the graph, which suggests that these positions may be essential for the folding or for the main functions of the protein superfamily. In contrast, the positions conserved neither between subfamilies nor within each individual subfamily appear at the upper right corner. The positions conserved within each subfamily but divergent among subfamilies are in the upper left corner. They may participate in biological functions that divide subfamilies, such as recognition of an endogenous ligand in G protein-coupled receptors. The TEA method requires a definition of protein subfamilies as an input. However, such a definition is a challenging problem by itself, particularly because this definition is crucial for the subsequent prediction of specificity positions.
In Chapter 3, we automated the TEA method described in Chapter 2 by tracing the evolutionary pressure from the root to the branches of the phylogenetic tree. At each level of the tree, a TEA plot is produced to capture the signal of the evolutionary pressure. A consensus TEA-O plot is composed from the whole series of plots to provide a condensed representation. Positions related to functions that evolved early (conserved) or later (specificity) are close to the lower left or upper left corner of the TEA-O plot, respectively. This novel approach allows an unbiased, user-independent analysis of residue relevance in a protein family. We tested the TEA-O method on a synthetic dataset as well as on 'real' data, i.e., LacI and GPCR datasets. The ROC plots for the real data showed that TEA-O works perfectly well on all datasets and much better than other considered methods such as evolutionary trace, SDPpred and TreeDet.
While positions were treated independently from each other in Chapters 2 and 3 in predicting specificity positions, in Chapter 4 multi-RELIEF considers both sequence similarity and distance in 3D structure in the specificity scoring function. The multi-RELIEF method was developed based on RELIEF, a state-of-the-art machine learning technique for feature weighting. It estimates the expected 'local' functional specificity of residues from an alignment divided into multiple classes. Optionally, 3D structure information is exploited by increasing the weight of residues that have high-weight neighbors. Using ROC curves over a large body of experimental reference data, we showed that multi-RELIEF identifies specificity residues for the seven test sets used. In addition, incorporating structural information improved the prediction for specificity of interaction with small molecules. Comparison of multi-RELIEF with four other state-of-the-art algorithms indicates its robustness and best overall performance.
In Chapters 2, 3 and 4, we relied heavily on multiple sequence alignment to identify conserved and specificity positions. As mentioned before, the construction of such an alignment is not self-evident. Following the principle of sequential pattern mining, in Chapter 5 we proposed a new algorithm that directly identifies frequent biologically meaningful patterns from unaligned sequences. Six algorithms were designed and implemented to mine three different pattern types from either one or two datasets using a pattern growth approach. We compared our approach to PRATT2 and TEIRESIAS in efficiency, completeness and the diversity of pattern types. Compared to PRATT2, our approach is faster, capable of processing large datasets and able to identify the so-called type III patterns. Our approach is comparable to TEIRESIAS in the discovery of the so-called type I patterns but has additional functionality such as mining the so-called type II and type III patterns and finding discriminating patterns between two datasets.
From Chapters 2 to 5, we aimed to identify functional residues from either aligned or unaligned protein sequences. In Chapter 6, we introduce an alignment-independent procedure to cluster protein sequences, which may be used to predict protein function. Traditionally, phylogeny reconstruction is based on multiple sequence alignment. The procedure can be computationally intensive and often requires manual adjustment, which may be particularly difficult for a set of deviating sequences. In cheminformatics, constructing a similarity tree of ligands is usually alignment free. Feature spaces are a routine means to convert compounds into binary fingerprints. Then distances among compounds can be obtained and similarity trees are constructed via clustering techniques. We explored building feature spaces for phylogeny reconstruction either using the so-called k-mer method or via sequential pattern mining with additional filtering and combining operations. Satisfactory trees were built from both approaches compared with alignment-based methods. We found that when k equals 3, the phylogenetic tree built from the k-mer fingerprints is as good as one of the alignment-based methods, in which PAM and neighbor joining are used for computing distance and constructing a tree, respectively (NJ-PAM). As for the sequential pattern mining approach, the quality of the phylogenetic tree is better than that of one of the alignment-based methods (NJ-PAM) if we set the support value to 10% and use maximum patterns only as descriptors.
Finally, in Chapter 7, general conclusions about the research described in this thesis are drawn. They are supplemented with an outlook on further research lines. We are convinced that the described algorithms can be useful in, e.g., genomic analyses, and provide further ideas for novel algorithms in this respect.
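
The two-entropies analysis summarized in the abstract above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the use of unnormalized Shannon entropy in bits, the treatment of gap characters as ordinary symbols, and the choice of the average subfamily entropy as x and the whole-alignment entropy as y (inferred from the corner descriptions in the abstract) are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    total = len(column)
    counts = Counter(column)
    # p * log2(1/p) summed over the observed symbols
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def two_entropies(alignment, subfamilies):
    """Return one (x, y) point per alignment column.

    x: entropy averaged over the subfamilies (assumed axis assignment)
    y: entropy of the column in the entire alignment
    alignment   -- list of equal-length aligned sequences
    subfamilies -- one subfamily label per sequence (hypothetical input format)
    """
    groups = {}
    for seq, label in zip(alignment, subfamilies):
        groups.setdefault(label, []).append(seq)

    points = []
    for pos in range(len(alignment[0])):
        global_h = shannon_entropy([seq[pos] for seq in alignment])
        sub_h = [shannon_entropy([seq[pos] for seq in members])
                 for members in groups.values()]
        points.append((sum(sub_h) / len(sub_h), global_h))
    return points


# Toy alignment: column 0 is globally conserved, column 1 is conserved
# within each subfamily but differs between them, column 2 varies everywhere.
toy_points = two_entropies(["AKL", "AKV", "ARL", "ARV"],
                           ["s1", "s1", "s2", "s2"])
```

In this toy input the three columns land in the lower left, upper left and upper right of the plot respectively, matching the three conservation patterns the abstract describes. A real analysis would need a curated alignment and a defensible subfamily definition, which the abstract itself notes is a hard problem.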

Leiden University Scholarly Publications
