The CMA Evolution Strategy: A Tutorial


10 Mar 2023 | Nikolaus Hansen
The CMA Evolution Strategy (CMA-ES) is a stochastic optimization method for continuous domains, particularly effective for non-linear, non-convex objective functions. This tutorial introduces the CMA-ES, focusing on its key concepts and implementation details.

**Preliminaries:**

- **Eigendecomposition of a Positive Definite Matrix:** A symmetric, positive definite matrix \( \boldsymbol{C} \) has an orthonormal basis of eigenvectors with positive eigenvalues, \( \boldsymbol{C} = \boldsymbol{B} \boldsymbol{D}^2 \boldsymbol{B}^\top \). The decomposition is unique up to ordering and signs of the eigenvectors, and both the inverse \( \boldsymbol{C}^{-1} \) and the square root \( \boldsymbol{C}^{1/2} = \boldsymbol{B} \boldsymbol{D} \boldsymbol{B}^\top \) can be computed from these eigenvalues and eigenvectors.
- **Multivariate Normal Distribution:** A multivariate normal distribution is characterized by its mean \( \boldsymbol{m} \) and covariance matrix \( \boldsymbol{C} \). The covariance matrix has a geometric interpretation as an ellipsoid: the principal axes point along the eigenvectors of \( \boldsymbol{C} \), and the squared axis lengths are the corresponding eigenvalues.
- **Randomized Black Box Optimization:** In black box optimization, the goal is to minimize an objective function \( f \) using only function evaluations, without knowledge of its analytic form. The CMA-ES generates new search points by sampling from a multivariate normal distribution.

**Basic Equation: Sampling:**

- New search points are sampled from a multivariate normal distribution with mean \( \boldsymbol{m}^{(g)} \) and covariance matrix \( \boldsymbol{C}^{(g)} \), scaled by a step size \( \sigma^{(g)} \): \( \boldsymbol{x}_k^{(g+1)} \sim \boldsymbol{m}^{(g)} + \sigma^{(g)} \mathcal{N}\!\left(\boldsymbol{0}, \boldsymbol{C}^{(g)}\right) \).

**Selection and Recombination: Moving the Mean:**

- The new mean \( \boldsymbol{m}^{(g+1)} \) is calculated as a weighted average of the best \( \mu \) individuals from the current population, using recombination weights \( w_i \).

**Adapting the Covariance Matrix:**

- **Estimating the Covariance Matrix From Scratch:** The empirical covariance matrix \( \boldsymbol{C}_{\text{emp}}^{(g+1)} \), computed from all sampled points, is an unbiased estimator of the original covariance matrix \( \boldsymbol{C}^{(g)} \). A better estimate for search purposes is obtained by computing the covariance of only the best \( \mu \) points.
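The sampling and recombination steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the full algorithm: the dimension, population size, logarithmic weights, and the sphere objective are illustrative choices, and the covariance square root is taken via the eigendecomposition described in the preliminaries.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 5        # problem dimension (illustrative)
lam = 10     # population size lambda
mu = 5       # number of selected parents

m = np.zeros(n)   # distribution mean m^(g)
sigma = 0.5       # step size sigma^(g)
C = np.eye(n)     # covariance matrix C^(g)

def sphere(x):
    """Toy objective to minimize (stand-in for the black box f)."""
    return float(np.dot(x, x))

# Eigendecomposition C = B diag(d^2) B^T gives C^(1/2) = B diag(d) B^T
d2, B = np.linalg.eigh(C)
C_sqrt = B @ np.diag(np.sqrt(d2)) @ B.T

# Sampling: x_k = m + sigma * C^(1/2) z_k, with z_k ~ N(0, I)
Z = rng.standard_normal((lam, n))
X = m + sigma * Z @ C_sqrt.T

# Selection: keep the mu best points by objective value
order = np.argsort([sphere(x) for x in X])
X_sel = X[order[:mu]]

# Recombination: weighted average with positive, decreasing weights
w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
w /= w.sum()
m_new = w @ X_sel   # new mean m^(g+1)
```

In a full implementation this loop would repeat over generations, with \( \boldsymbol{C} \) and \( \sigma \) adapted between iterations as described below.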
- **Rank-\(\mu\)-Update:** This method updates the covariance matrix using a weighted sum of outer products of the selected steps from the current generation, and can also exploit information from previous generations. The learning rate \( c_\mu \) is crucial for balancing the influence of past information (the current \( \boldsymbol{C} \)) against recent information (the new outer products).
- **Rank-One-Update:** This method updates the covariance matrix with the outer product of a single vector, the evolution path, which cumulates the selected mean shifts over successive generations. Unlike a sum of squared steps, the evolution path retains the sign and direction of the selected steps, enhancing the adaptation process.

**Discussion:**

- The choice of parameters, such as the learning rates \( c_m \) and \( c_\mu \), is critical for the performance of the CMA-ES. Optimal settings are often problem-dependent but generally follow certain guidelines, such as \( c_\mu \approx \mu_{\text{eff}} / n^2 \).
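The rank-one and rank-\(\mu\) updates can be combined into a single covariance update, as in the standard CMA-ES. The sketch below uses illustrative placeholder values for the learning rates \( c_1 \) and \( c_c \) and for the selected points; only \( c_\mu \approx \mu_{\text{eff}} / n^2 \) follows the guideline quoted above, and the tuned default settings differ.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 5
mu = 5
sigma = 0.5

m_old = np.zeros(n)           # mean before the update
m_new = 0.1 * np.ones(n)      # mean after recombination (illustrative)
C = np.eye(n)                 # current covariance matrix
p_c = np.zeros(n)             # evolution path for C

# Selected steps from the current generation (illustrative stand-ins)
X_sel = rng.standard_normal((mu, n))
w = np.full(mu, 1.0 / mu)     # equal recombination weights for simplicity

mu_eff = 1.0 / np.sum(w**2)   # variance-effective selection mass
c1 = 2.0 / n**2               # rank-one learning rate (placeholder)
cmu = mu_eff / n**2           # rank-mu learning rate (guideline from the text)
cc = 4.0 / n                  # path cumulation factor (placeholder)

# Rank-one part: cumulate the evolution path from the mean shift
p_c = (1 - cc) * p_c + np.sqrt(cc * (2 - cc) * mu_eff) * (m_new - m_old) / sigma

# Rank-mu part: weighted outer products of the selected steps
Y = (X_sel - m_old) / sigma
C_mu = sum(wi * np.outer(y, y) for wi, y in zip(w, Y))

# Combined covariance update
C = (1 - c1 - cmu) * C + c1 * np.outer(p_c, p_c) + cmu * C_mu
```

Because the update shrinks the old \( \boldsymbol{C} \) and adds only positive semidefinite terms, the result stays symmetric positive definite as long as \( c_1 + c_\mu < 1 \).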