In deep reinforcement learning, searching and learning techniques are two important components. They can be used independently and in combination to deal with different problems in AI. These... Show moreIn deep reinforcement learning, searching and learning techniques are two important components. They can be used independently and in combination to deal with different problems in AI. These results have inspired research into artificial general intelligence (AGI).We study table based classic Q-learning on the General Game Playing (GGP) system, showing that classic Q-learning works on GGP, although convergence is slow, and it is computationally expensive to learn complex games.This dissertation uses an AlphaZero-like self-play framework to explore AGI on small games. By tuning different hyper-parameters, the role, effects and contributions of searching and learning are studied. A further experiment shows that search techniques can contribute as experts to generate better training examples to speed up the start phase of training.In order to extend the AlphaZero-likeself-play approach to single player complex games, the Morpion Solitaire game is implemented by combining Ranked Reward method. Our first AlphaZero-based approach is able to achieve a near human best record.Overall, in this thesis, both searching and learning techniques are studied (by themselves and in combination) in GGP and AlphaZero-like self-play systems. We do so for the purpose of making steps towards artificial general intelligence, towards systems that exhibit intelligent behavior in more than one domain. Show less
Introducing new algorithmic ideas is a key part of the continuous improvement of existing optimization algorithms. However, when introducing a new component into an existing algorithm, assessing... Show moreIntroducing new algorithmic ideas is a key part of the continuous improvement of existing optimization algorithms. However, when introducing a new component into an existing algorithm, assessing its potential benefits is a challenging task. Often, the component is added to a default implementation of the underlying algorithm and compared against a limited set of other variants. This assessment ignores any potential interplay with other algorithmic ideas that share the same base algorithm, which is critical in understanding the exact contributions being made. We explore a more extensive procedure, which uses hyperparameter tuning as a means of assessing the benefits of new algorithmic components. This allows for a more robust analysis by not only focusing on the impact on performance, but also by investigating how this performance is achieved. We implement our suggestion in the context of the Modular CMA-ES framework, which was redesigned and extended to include some new modules and several new options for existing modules, mostly focused on the step-size adaptation method. Our analysis highlights the differences between these new modules, and identifies the situations in which they have the largest contribution. Show less
Structural Bias (SB) is an important type of algorithmic deficiency within iterative optimisation heuristics. However, methods for detecting structural bias have not yet fully matured, and recent... Show moreStructural Bias (SB) is an important type of algorithmic deficiency within iterative optimisation heuristics. However, methods for detecting structural bias have not yet fully matured, and recent studies have uncovered many interesting questions. One of these is the question of how structural bias can be related to anisotropy. Intuitively, an algorithm that is not isotropic would be considered structurally biased. However, there have been cases where algorithms appear to only show SB in some dimensions. As such, we investigate whether these algorithms actually exhibit anisotropy, and how this impacts the detection of SB. We find that anisotropy is very rare, and even in cases where it is present, there are clear tests for SB which do not rely on any assumptions of isotropy, so we can safely expand the suite of SB tests to encompass these kinds of deficiencies not found by the original tests.We propose several additional testing procedures for SB detection and aim to motivate further research into the creation of a robust portfolio of tests. This is crucial since no single test will be able to work effectively with all types of SB we identify. Show less
A new acquisition function is proposed for solving robust optimization problems via Bayesian Optimization. The proposed acquisition function reflects the need for the robust instead of the nominal... Show moreA new acquisition function is proposed for solving robust optimization problems via Bayesian Optimization. The proposed acquisition function reflects the need for the robust instead of the nominal optimum, and is based on the intuition of utilizing the higher moments of the improvement. The efficacy of Bayesian Optimization based on this acquisition function is demonstrated on four test problems, each affected by three different levels of noise. Our findings suggest the promising nature of the proposed acquisition function as it yields a better robust optimal value of the function in 6/12 test scenarios when compared with the baseline. Show less
The imbalanced classification problem is very relevant in both academic and industrial applications. The task of finding the best machine learning model to use for a specific imbalanced dataset is... Show moreThe imbalanced classification problem is very relevant in both academic and industrial applications. The task of finding the best machine learning model to use for a specific imbalanced dataset is complicated due to a large number of existing algorithms, each with its own hyperparameters. The Combined Algorithm Selection and Hyperparameter optimization (CASH) has been introduced to tackle both aspects at the same time. However, CASH has not been studied in detail in the class imbalance domain, where the best combination of resampling technique and classification algorithm is searched for, together with their optimized hyperparameters. Thus, we target the CASH problem for imbalanced classification. We experiment with a search space of 5 classification algorithms, 21 resampling approaches and 64 relevant hyperparameters in total. Moreover, we investigate performance of 2 well-known optimization approaches: Random search and Tree Parzen Estimators approach which is a kind of Bayesian optimization. For comparison, we also perform grid search on all combinations of resampling techniques and classification algorithms with their default hyperparameters. Our experimental results show that a Bayesian optimization approach outperforms the other approaches for CASH in this application domain. Show less
This paper investigates whether optimisation methods with the population made up of one solution can suffer from structural bias just like their multisolution variants. Following recent results... Show moreThis paper investigates whether optimisation methods with the population made up of one solution can suffer from structural bias just like their multisolution variants. Following recent results highlighting the importance of choice of strategy for handling solutions generated outside the domain, a selection of single solution methods are considered in conjunction with several such strategies. Obtained results are tested for the presence of structural bias by means of a traditional approach from literature and a newly proposed here statistical approach. These two tests are demonstrated to be not fully consistent. All tested methods are found to be structurally biased with at least one of the tested strategies. Confirming results for multisolution methods, it is such strategy that is shown to control the emergence of structural bias in single solution methods. Some of the tested methods exhibit a kind of structural bias that has not been observed before. Show less
Combinatorial optimization is an important application targeted by quantum computing. However, near-term hardware constraints make quantum algorithms unlikely to be competitive when compared to... Show moreCombinatorial optimization is an important application targeted by quantum computing. However, near-term hardware constraints make quantum algorithms unlikely to be competitive when compared to high-performing classical heuristics on large practical problems. One option to achieve advantages with near-term devices is to use them in combination with classical heuristics. In particular, we propose using quantum methods to sample from classically intractable distributions -- which is the most probable approach to attain a true provable quantum separation in the near-term -- which are used to solve optimization problems faster. We numerically study this enhancement by an adaptation of Tabu Search using the Quantum Approximate Optimization Algorithm (QAOA) as a neighborhood sampler. We show that QAOA provides a flexible tool for exploration-exploitation in such hybrid settings and can provide evidence that it can help in solving problems faster by saving many tabu iterations and achieving better solutions. Show less
When faced with a specific optimization problem, choosing which algorithm to use is always a tough task. Not only is there a vast variety of algorithms to select from, but these algorithms often... Show moreWhen faced with a specific optimization problem, choosing which algorithm to use is always a tough task. Not only is there a vast variety of algorithms to select from, but these algorithms often are controlled by many hyperparameters, which need to be tuned in order to achieve the best performance possible. Usually, this problem is separated into two parts: algorithm selection and algorithm configuration. With the significant advances made in Machine Learning, however, these problems can be integrated into a combined algorithm selection and hyperparameter optimization task, commonly known as the CASH problem. In this work we compare sequential and integrated algorithm selection and configuration approaches for the case of selecting and tuning the best out of 4608 variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) tested on the Black Box Optimization Benchmark (BBOB) suite. We first show that the ranking of the modular CMA-ES variants depends to a large extent on the quality of the hyperparameters. This implies that even a sequential approach based on complete enumeration of the algorithm space will likely result in sub-optimal solutions. In fact, we show that the integrated approach manages to provide competitive results at a much smaller computational cost. We also compare two different mixed-integer algorithm configuration techniques, called irace and Mixed-Integer Parallel Efficient Global Optimization (MIP-EGO). While we show that the two methods differ significantly in their treatment of the exploration-exploitation balance, their overall performances are very similar. Show less
After the recent groundbreaking results of AlphaGo and AlphaZero, we have seen strong interests in deep reinforcement learning and artificial general intelligence (AGI) in game playing. However,... Show moreAfter the recent groundbreaking results of AlphaGo and AlphaZero, we have seen strong interests in deep reinforcement learning and artificial general intelligence (AGI) in game playing. However, deep learning is resource-intensive and the theory is not yet well developed. For small games, simple classical table-based Q-learning might still be the algorithm of choice. General Game Playing (GGP) provides a good testbed for reinforcement learning to research AGI. Q-learning is one of the canonical reinforcement learning methods, and has been used by (Banerjee & Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex), to allow comparison to Banerjee et al. We find that Q-learning converges to a high win rate in GGP. For the ϵ" role="presentation" style="display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; overflow-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border-width: 0px; border-style: initial; position: relative;">ϵ-greedy strategy, we propose a first enhancement, the dynamic ϵ" role="presentation" style="display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; overflow-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border-width: 0px; border-style: initial; position: relative;">ϵ algorithm. In addition, inspired by (Gelly & Silver, ICML 2007) we combine online search (Monte Carlo Search) to enhance offline learning, and propose QM-learning for GGP. Both enhancements improve the performance of classical Q-learning. In this work, GGP allows us to show, if augmented by appropriate enhancements, that classical table-based Q-learning can perform well in small games. Show less
Kriging or Gaussian Process Regression is applied in many fields as a non-linear regression model as well as a surrogate model in the field of evolutionary computation. However, the computational... Show moreKriging or Gaussian Process Regression is applied in many fields as a non-linear regression model as well as a surrogate model in the field of evolutionary computation. However, the computational and space complexity of Kriging, that is cubic and quadratic in the number of data points respectively, becomes a major bottleneck with more and more data available nowadays. In this paper, we propose a general methodology for the complexity reduction, called cluster Kriging, where the whole data set is partitioned into smaller clusters and multiple Kriging models are built on top of them. In addition, four Kriging approximation algorithms are proposed as candidate algorithms within the new framework. Each of these algorithms can be applied to much larger data sets while maintaining the advantages and power of Kriging. The proposed algorithms are explained in detail and compared empirically against a broad set of existing state-of-the-art Kriging approximation methods on a well-defined testing framework. According to the empirical study, the proposed algorithms consistently outperform the existing algorithms. Moreover, some practical suggestions are provided for using the proposed algorithms. Show less
Although the class-imbalance classification problem has caught a huge amount of attention, hyperparameter optimisation has not been studied in detail in this field. Both classification algorithms... Show moreAlthough the class-imbalance classification problem has caught a huge amount of attention, hyperparameter optimisation has not been studied in detail in this field. Both classification algorithms and resampling techniques involve some hyperparameters that can be tuned. This paper sets up several experiments and draws the conclusion that, compared to using default hyperparameters, applying hyperparameter optimisation for both classification algorithms and resampling approaches can produce the best results for classifying the imbalanced datasets. Moreover, this paper shows that data complexity, especially the overlap between classes, has a big impact on the potential improvement that can be achieved through hyperparameter optimisation. Results of our experiments also indicate that using resampling techniques cannot improve the performance for some complex datasets, which further emphasizes the importance of analyzing data complexity before dealing with imbalanced datasets. Show less