This thesis focuses on data found in the field of computational drug discovery. New insight can be obtained by applying machine learning in various ways and in a variety of domains. Two studies delved into the application of proteochemometrics (PCM), a machine learning technique that can be used to find relations in protein-ligand bioactivity data and then predict, using a virtual screen, whether compounds that had never been tested on a particular protein, or set of proteins, are likely to be active on it. With this, sets of compounds that were significant in a variety of ways were suggested for experimental validation. Another study investigated the mutational patterns in cancer, using a large set of mutation data and identifying several motifs in G protein-coupled receptors. The thesis also contains the work done on the Papyrus dataset, a large-scale bioactivity dataset that focuses on standardising data for computational drug discovery and providing an out-of-the-box set that can be used in a variety of settings.
This thesis describes the importance of being able to control the selectivity of potential drug candidates. It explains how computational models are employed to predict and rationalize compound-protein binding (affinity) and, with it, the selectivity of compounds. Moreover, it shows that selectivity can be deliberately tuned to target either a single protein or an entire panel of proteins. The challenges of selectivity modeling are addressed based on case studies in the sodium-dependent glucose co-transporters, G protein-coupled receptors, and kinases.
Burggraaff, L.; Oranje, P.; Gouka, R.; Pijl, P. van der; Geldof, M.; Vlijmen, H.W.T. van; ... ; Westen, G.J.P. van 2019
Over the last decades, several disciplines relevant to medicinal chemistry and preclinical drug discovery have made gigantic leaps; these include chemistry, biology, and the measurement of bioactivity. Better techniques have led to massive amounts of data. Moreover, sources of chemical and bioactivity data have become available in the public domain. Hence there is a need for new techniques for combining and mining these data sources. This thesis focuses on computational methods that combine data from these disciplines and demonstrates that combining these methods leads to better-quality predictions than models using the individual data sources. One of the techniques central to this thesis is proteochemometric modeling, a machine learning approach linking chemical descriptors and protein descriptors to a biologically relevant output variable. This output variable describes the activity of molecules on biological macromolecules, and hence proteochemometric models can make relevant predictions for both unseen molecules and unseen macromolecules (e.g. novel viral mutants). Second, we present a novel technique that is able to combine information from multiple crystal structures in such a way that shared and unique pharmacophoric features can be isolated and visualized. The approaches presented here have been validated prospectively and have been shown to be widely applicable.