This thesis is dedicated to the empirical study of image analysis in high-throughput/high-content (HT/HC) screening. An HT/HC screen often produces amounts of image data that are too large to analyze manually. An automated image analysis solution is therefore a prerequisite for an objective understanding of the raw image data. Compared with general application domains, the efficiency of HT/HC image analysis depends strongly on image quantity and quality. Accordingly, this thesis addresses two major procedures in the image analysis step of an HT/HC screen: image segmentation and object tracking. It focuses on extending generic computer science and machine learning theory into the design of dedicated algorithms for HT/HC image analysis. It also illustrates a practical implementation of the image analysis and data analysis workflow through empirical case studies with different imaging modalities and experimental settings. The theory of data analysis, however, is illustrated only in general terms, without further elaboration. Finally, the thesis briefly addresses supplementary infrastructure for end-user interaction and data visualization.
The tuning of learning algorithm parameters has become increasingly important in recent years. With the fast growth of computational power and available memory, databases have grown dramatically. This makes the tuning of machine learning parameters challenging, since training can become very time-consuming for large datasets. Efficient tuning methods that improve the predictions of the learning algorithms are therefore required. In this thesis we incorporate model-assisted optimization techniques to perform efficient optimization on noisy datasets with very limited budgets. Under this umbrella we also combine learning algorithms with methods for feature construction and selection. We propose to integrate a variety of elements into the learning process and examine several questions. Can tuning be helpful in learning tasks such as time series regression using state-of-the-art machine learning algorithms? Are statistical methods capable of reducing noise effects? Can surrogate models such as Kriging learn a reasonable mapping from the parameter landscape to the quality measures, or are they degraded by disturbing factors? Bringing these parts together, we analyze whether superior learning algorithms can be created, with a special focus on efficient runtimes. Besides the advantages of systematic tuning approaches, we also highlight possible obstacles and issues of tuning. Different tuning methods are compared and the impact of their features is examined. A goal of this work is to give users insight into applying state-of-the-art learning algorithms profitably in practice.
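The model-assisted tuning idea in the abstract above can be illustrated with a small sketch. It is not the thesis's implementation. It assumes a single tuning parameter in [0, 1], a noise-free toy objective (`toy_validation_error`, hypothetical), a Kriging-style surrogate with a fixed squared-exponential kernel, and a lower-confidence-bound rule for picking the next parameter value. All function names and constants are chosen for illustration only.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential correlation between two scalar parameter values."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def kriging_predict(xs, ys, x_new, nugget=1e-6):
    """Surrogate mean and standard deviation at x_new (unit prior variance assumed)."""
    n = len(xs)
    mean_y = sum(ys) / n
    K = [[rbf(xs[i], xs[j]) + (nugget if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k = [rbf(x, x_new) for x in xs]
    alpha = solve(K, [y - mean_y for y in ys])
    v = solve(K, k)
    mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(0.0, 1.0 + nugget - sum(ki * vi for ki, vi in zip(k, v)))
    return mu, math.sqrt(var)

def toy_validation_error(p):
    """Hypothetical stand-in for an expensive train-and-validate run."""
    return (p - 0.3) ** 2

def tune(objective, budget=8, kappa=0.2):
    """Spend a small evaluation budget, guided by the surrogate."""
    xs = [0.0, 0.5, 1.0]                      # small initial design
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]      # candidate parameter values
    while len(xs) < budget:
        candidates = [g for g in grid if g not in xs]

        def lower_confidence_bound(g):
            mu, sd = kriging_predict(xs, ys, g)
            return mu - kappa * sd

        x_next = min(candidates, key=lower_confidence_bound)
        xs.append(x_next)
        ys.append(objective(x_next))          # the only expensive call per step
    best = min(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], len(xs)

if __name__ == "__main__":
    print(tune(toy_validation_error))
```

The point of the design is that the expensive objective is called only `budget` times, while the cheap surrogate is queried on every candidate. With noisy objectives, the nugget term would typically be increased so that the surrogate smooths rather than interpolates the observations.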
Over the last decades, several disciplines relevant to medicinal chemistry and preclinical drug discovery have made gigantic leaps, including chemistry, biology and the measurement of bioactivity. Better techniques have led to massive amounts of data. Moreover, sources of chemical and bioactivity data have become available in the public domain. Hence there is a need for new techniques for combining and mining these data sources. This thesis focuses on computational methods that combine data from these disciplines and demonstrates that their combination leads to better-quality predictions than models using the individual data sources. One of the techniques central to this thesis is proteochemometric modeling, a machine learning approach linking chemical descriptors and protein descriptors to a biologically relevant output variable. This output variable describes the activity of molecules on biological macromolecules, and hence proteochemometric models can make relevant predictions for both unseen molecules and unseen macromolecules (e.g. novel viral mutants). Second, we present a novel technique that combines information from multiple crystal structures in such a way that shared and unique pharmacophoric features can be isolated and visualized. The approaches presented here have been validated prospectively and have been shown to be widely applicable.
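The core idea of proteochemometric modeling described above, one model trained on joint ligand and protein descriptors, can be illustrated with a minimal sketch. Everything in it is hypothetical: the two-dimensional descriptors, the compound and protein names, and the activity values are made-up toy data, and a nearest-neighbour average stands in for whichever learning algorithm the thesis actually uses.

```python
import math

# Hypothetical toy descriptors; real models would use, for example,
# chemical fingerprints and sequence-derived protein descriptors.
ligands = {"lig_A": [0.1, 0.9], "lig_B": [0.8, 0.2]}
proteins = {"wild_type": [1.0, 0.0], "mutant_1": [0.0, 1.0]}

# Made-up (ligand, protein, activity) measurements.
measurements = [
    ("lig_A", "wild_type", 7.5),
    ("lig_A", "mutant_1", 5.0),
    ("lig_B", "wild_type", 6.0),
    ("lig_B", "mutant_1", 6.5),
]

def pair_features(lig_desc, prot_desc):
    """Concatenate ligand and protein descriptors into one feature vector."""
    return list(lig_desc) + list(prot_desc)

# Each training row describes a ligand-protein pair, not a ligand alone.
training = [
    (pair_features(ligands[lig], proteins[prot]), activity)
    for lig, prot, activity in measurements
]

def predict_activity(lig_desc, prot_desc, k=1):
    """Average the activities of the k nearest training pairs."""
    query = pair_features(lig_desc, prot_desc)
    ranked = sorted(training, key=lambda row: math.dist(query, row[0]))
    nearest = ranked[:k]
    return sum(activity for _, activity in nearest) / len(nearest)

if __name__ == "__main__":
    # An unseen protein variant whose descriptors resemble mutant_1.
    novel_mutant = [0.1, 0.9]
    print(predict_activity(ligands["lig_A"], novel_mutant))
```

Because the protein descriptors are part of the input, the same model can be queried for a protein that never appeared in training, which is the property the abstract highlights for novel mutants. A model trained on ligand descriptors alone could not distinguish between targets.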