The paper introduces Hyperband, a novel bandit-based approach to hyperparameter optimization. Hyperparameter optimization is crucial to the performance of machine learning algorithms, yet it is challenging because of the large number of hyperparameters and the difficulty of selecting a good configuration. While Bayesian optimization methods have been successful at configuration selection, they are not designed to speed up the evaluation of individual configurations. Hyperband addresses this gap by formulating hyperparameter optimization as a pure-exploration non-stochastic infinite-armed bandit problem in which resources are allocated adaptively to randomly sampled configurations. Using a principled early-stopping strategy to allocate resources, Hyperband can evaluate orders of magnitude more configurations than black-box procedures such as Bayesian optimization. The paper provides a theoretical analysis demonstrating that Hyperband adapts to unknown convergence rates and to the behavior of validation losses. Empirical results show that Hyperband can provide over an order-of-magnitude speedup over popular Bayesian optimization methods on a range of deep-learning and kernel-based learning problems. The paper also discusses applying Hyperband to different types of resources, such as training time, dataset subsampling, and feature subsampling, and offers guidelines for practical deployment.
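
To make the resource-allocation scheme concrete, here is a minimal Python sketch of Hyperband's structure: an outer loop over "brackets" that sweeps the trade-off between the number of configurations and the resource given to each, and an inner SuccessiveHalving loop that evaluates configurations, ranks them by validation loss, and keeps the top 1/η. This loosely follows the paper's pseudocode, but the helper names `get_config` and `run_then_return_val_loss` are illustrative placeholders for user-supplied functions, and the defaults R = 81, η = 3 are one common setting rather than required values.

```python
import math
import random

def hyperband(get_config, run_then_return_val_loss, max_resource=81, eta=3):
    """Sketch of Hyperband: outer loop over brackets + SuccessiveHalving.

    get_config() -> a randomly sampled hyperparameter configuration.
    run_then_return_val_loss(config, resource) -> validation loss after
        training `config` with `resource` units (epochs, samples, ...).
    """
    s_max = int(math.floor(math.log(max_resource, eta) + 1e-9))  # guard float error
    B = (s_max + 1) * max_resource        # approximate budget per bracket
    best_config, best_loss = None, float("inf")

    # Each bracket s trades off many configs / little resource (large s)
    # against few configs / full resource (s = 0).
    for s in reversed(range(s_max + 1)):
        n = int(math.ceil(B / max_resource * eta**s / (s + 1)))  # initial #configs
        r = max_resource / eta**s                                # initial resource
        configs = [get_config() for _ in range(n)]

        # SuccessiveHalving: evaluate, rank by loss, keep the top 1/eta.
        for i in range(s + 1):
            n_i = n // eta**i                 # configs remaining this round
            r_i = r * eta**i                  # resource per config this round
            losses = [run_then_return_val_loss(c, r_i) for c in configs]
            ranked = sorted(range(len(configs)), key=lambda j: losses[j])
            if losses[ranked[0]] < best_loss:  # track best loss seen so far
                best_loss = losses[ranked[0]]
                best_config = configs[ranked[0]]
            configs = [configs[j] for j in ranked[:max(n_i // eta, 1)]]

    return best_config, best_loss
```

As a toy usage example (again, purely illustrative), one can minimize a noisy quadratic where more resource yields a less noisy loss estimate, mimicking how longer training gives a more reliable validation score:

```python
sample = lambda: {"x": random.uniform(0, 1)}
loss = lambda cfg, resource: (cfg["x"] - 0.3) ** 2 + random.gauss(0, 0.1 / resource)

best, val = hyperband(sample, loss)
print(best, val)
```

Note how the bracket structure hedges against unknown convergence behavior: the large-s brackets stop poor configurations very early, while the s = 0 bracket degenerates to plain random search at full resource, so Hyperband never does much worse than random search while often doing far better.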