This paper surveys the field of deep multiagent reinforcement learning (RL). The combination of deep neural networks with RL has gained increased traction in recent years and is slowly shifting the focus from single-agent to multiagent environments. Dealing with multiple agents is inherently more complex as (a) the future rewards depend on multiple players' joint actions and (b) the computational complexity increases. We present the most common multiagent problem representations and their main challenges, and identify five research areas that address one or more of these challenges: centralised training and decentralised execution, opponent modelling, communication, efficient coordination, and reward shaping. We find that many computational studies rely on unrealistic assumptions or are not generalisable to other settings; they struggle to overcome the curse of dimensionality or nonstationarity. Approaches from psychology and sociology capture promising relevant behaviours, such as communication and coordination, to help agents achieve better performance in multiagent settings. We suggest that, for multiagent RL to be successful, future research should address these challenges with an interdisciplinary approach to open up new possibilities in multiagent RL.
Automated Machine Learning (AutoML) frameworks are designed to select the optimal combination of operators and hyperparameters. Classical AutoML-based Bayesian Optimization (BO) approaches often integrate all operator search spaces into a single search space. However, a disadvantage of this history-based strategy is that it can be less robust when initialized randomly than optimizing each operator algorithm combination independently. To overcome this issue, a novel contesting procedure algorithm, Divide And Conquer Optimization (DACOpt), is proposed to make AutoML more robust. DACOpt partitions the AutoML search space into a reasonable number of sub-spaces based on algorithm similarity and budget constraints. Furthermore, throughout the optimization process, DACOpt allocates resources to each sub-space to ensure that (1) all areas of the search space are covered and (2) more resources are assigned to the most promising sub-space. Two extensive sets of experiments on 117 benchmark datasets demonstrate that DACOpt achieves significantly better results in 36% of AutoML benchmark datasets: 5% when compared to TPOT, 8% compared to AutoSklearn, 15% compared to H2O, and 18% compared to ATM.
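To make the allocation idea in the abstract above concrete, here is a minimal sketch, not the DACOpt algorithm itself. It swaps in a plain successive-halving schedule: every sub-space receives some evaluations in the first round (coverage), and only the better half survives into each later round (more budget for promising sub-spaces). Random search stands in for Bayesian optimization, and the function name `allocate_budget`, the sub-space names, and the toy loss are all hypothetical.

```python
import random


def allocate_budget(subspaces, evaluate, rounds=3, samples_per_round=4, seed=0):
    """Successive-halving-style budget allocation over named sub-spaces.

    subspaces: dict mapping a name to a callable(rng) that draws one configuration
    evaluate:  callable(name, config) returning a loss (lower is better)
    Returns (best loss per sub-space, evaluations spent per sub-space).
    """
    rng = random.Random(seed)
    best = {name: float("inf") for name in subspaces}
    spent = {name: 0 for name in subspaces}
    active = list(subspaces)
    for _round in range(rounds):
        for name in active:
            for _sample in range(samples_per_round):
                config = subspaces[name](rng)  # random search stands in for BO
                best[name] = min(best[name], evaluate(name, config))
                spent[name] += 1
        # Keep the better half of the sub-spaces (at least one) for the next round.
        active = sorted(active, key=lambda n: best[n])[: max(1, len(active) // 2)]
    return best, spent


# Toy problem (assumption): each sub-space has one hyperparameter in [0, 1]
# and a fixed offset that makes some sub-spaces strictly better than others.
offsets = {"svm": 0.3, "forest": 0.0, "knn": 0.5, "linear": 0.8}
subspaces = {name: (lambda rng: rng.uniform(0.0, 1.0)) for name in offsets}


def toy_loss(name, x):
    return offsets[name] + (x - 0.5) ** 2


best, spent = allocate_budget(subspaces, toy_loss)
print(spent)
```

In this toy run every sub-space is evaluated at least four times, while the strongest one ends up with three times that budget.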
As combinatorial optimization is one of the main quantum computing applications, many methods based on parameterized quantum circuits are being developed. In general, a set of parameters is tuned to optimize a cost function computed from the quantum circuit output. One of these algorithms, the Quantum Approximate Optimization Algorithm (QAOA), stands out as a promising approach to tackling combinatorial problems. However, finding the appropriate parameters is a difficult task. Although QAOA exhibits concentration properties, they can depend on instance characteristics that may not be easy to identify, but may nonetheless offer useful information to find good parameters. In this work, we study unsupervised Machine Learning approaches for setting these parameters without optimization. We perform clustering with the angle values but also with instance encodings (using instance features or the output of a variational graph autoencoder), and compare different approaches. These angle-finding strategies can be used to reduce calls to quantum circuits when leveraging QAOA as a subroutine. We showcase them within Recursive-QAOA up to depth 3 where the number of QAOA parameters used per iteration is limited to 3, achieving a median approximation ratio of 0.94 for MaxCut over 200 Erdős-Rényi graphs. We obtain similar performances to the case where we extensively optimize the angles, hence saving numerous circuit calls.
Automated machine learning (AutoML) aims to automatically produce the best machine learning pipeline, i.e., a sequence of operators and their optimized hyperparameter settings, to maximize the performance of an arbitrary machine learning problem. Typically, AutoML-based Bayesian optimization (BO) approaches convert the AutoML optimization problem into a Hyperparameter Optimization (HPO) problem, where the choice of algorithms is modeled as an additional categorical hyperparameter. In this way, algorithms and their local hyperparameters are treated at the same level. Consequently, this approach makes the resulting initial sampling less robust. In this study, we describe a first attempt to formulate the AutoML optimization problem in its native form instead of transforming it into an HPO problem. To take advantage of this paradigm, we propose a novel initial sampling approach to maximize the coverage of the AutoML search space to help BO construct a robust surrogate model. We experiment with 2 independent scenarios of AutoML with 2 operators and 6 operators over 117 benchmark datasets. Results of our experiments demonstrate that the performance of BO is significantly improved by using our sampling approach.
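The coverage argument in the abstract above can be illustrated with a small sketch. The abstract does not describe the proposed sampling procedure, so the code below is only a generic stratified scheme under stated assumptions: algorithms are visited round-robin so that each one receives an equal share of the initial design, and only that algorithm's own hyperparameters are sampled. The function name `stratified_initial_sample` and the example search space are hypothetical.

```python
import random
from collections import Counter


def stratified_initial_sample(search_space, n_samples, seed=0):
    """Draw an initial design that covers every algorithm equally.

    search_space: dict mapping an algorithm name to a dict of
                  {hyperparameter name: (low, high)} continuous ranges.
    Returns a list of (algorithm, configuration) pairs.
    """
    rng = random.Random(seed)
    algorithms = list(search_space)
    samples = []
    for i in range(n_samples):
        # Round-robin over algorithms instead of drawing the algorithm
        # as one more categorical hyperparameter.
        algo = algorithms[i % len(algorithms)]
        config = {
            name: rng.uniform(low, high)
            for name, (low, high) in search_space[algo].items()
        }
        samples.append((algo, config))
    return samples


# Hypothetical search space with three algorithms and their local hyperparameters.
search_space = {
    "svm": {"C": (0.1, 10.0), "gamma": (0.001, 1.0)},
    "random_forest": {"max_features": (0.1, 1.0)},
    "knn": {"weight_power": (0.0, 2.0)},
}

design = stratified_initial_sample(search_space, n_samples=9)
counts = Counter(algo for algo, _ in design)
print(counts)
```

With purely random sampling of the algorithm choice, a small initial design can miss an algorithm entirely; the round-robin assignment rules that out by construction.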
Combinatorial optimization is an important application targeted by quantum computing. However, near-term hardware constraints make quantum algorithms unlikely to be competitive when compared to high-performing classical heuristics on large practical problems. One option to achieve advantages with near-term devices is to use them in combination with classical heuristics. In particular, we propose using quantum methods to sample from classically intractable distributions, which is the most probable approach to attain a true provable quantum separation in the near term, and using these samples to solve optimization problems faster. We numerically study this enhancement by an adaptation of Tabu Search using the Quantum Approximate Optimization Algorithm (QAOA) as a neighborhood sampler. We show that QAOA provides a flexible tool for exploration-exploitation in such hybrid settings and provide evidence that it can help in solving problems faster by saving many tabu iterations and achieving better solutions.
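The hybrid structure described in the abstract above, a tabu search whose neighborhood comes from a sampler, can be sketched as follows. This is not the authors' implementation: the sampler here is a classical random stand-in where the paper uses QAOA, the problem is a small MaxCut instance chosen for illustration, and every name (`cut_value`, `random_subset_sampler`, `tabu_search`) is hypothetical.

```python
import random
from collections import deque


def cut_value(edges, assignment):
    """Number of edges whose endpoints lie on different sides of the cut."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def random_subset_sampler(assignment, tabu, rng, k=3):
    """Classical stand-in for a quantum sampler: propose up to k non-tabu vertices to flip."""
    candidates = [v for v in range(len(assignment)) if v not in tabu]
    return rng.sample(candidates, min(k, len(candidates)))


def tabu_search(edges, n_vertices, sampler, iterations=50, tenure=2, seed=0):
    """Tabu search for MaxCut with a pluggable neighborhood sampler."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_vertices)]
    best, best_value = current[:], cut_value(edges, current)
    tabu = deque(maxlen=tenure)  # recently flipped vertices are temporarily forbidden

    def value_after_flip(vertex):
        trial = current[:]
        trial[vertex] ^= 1
        return cut_value(edges, trial)

    for _ in range(iterations):
        moves = sampler(current, tabu, rng)
        if not moves:
            continue
        # Take the best proposed flip, even if it worsens the current cut.
        vertex = max(moves, key=value_after_flip)
        current[vertex] ^= 1
        tabu.append(vertex)
        value = cut_value(edges, current)
        if value > best_value:
            best, best_value = current[:], value
    return best, best_value


# Small illustrative graph: a 6-cycle with one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
best, best_value = tabu_search(edges, n_vertices=6, sampler=random_subset_sampler)
print(best, best_value)
```

Because the sampler is passed in as a function, a quantum-backed sampler with the same signature could replace `random_subset_sampler` without touching the search loop.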