The CMA Evolution Strategy: A Tutorial


10 Mar 2023 | Nikolaus Hansen
The CMA Evolution Strategy (CMA-ES) is a stochastic optimization method for continuous domains, particularly effective for non-linear, non-convex objective functions. This tutorial introduces the CMA-ES, focusing on its key concepts and implementation details.

**Preliminaries:**

- **Eigendecomposition of a Positive Definite Matrix:** A symmetric, positive definite matrix \( \boldsymbol{C} \) has an orthonormal basis of eigenvectors with positive eigenvalues, \( \boldsymbol{C} = \boldsymbol{B} \boldsymbol{D}^2 \boldsymbol{B}^\top \). The decomposition is unique up to ordering and signs of the eigenvectors, and both the inverse \( \boldsymbol{C}^{-1} \) and the square root \( \boldsymbol{C}^{1/2} = \boldsymbol{B} \boldsymbol{D} \boldsymbol{B}^\top \) can be computed from these eigenvalues and eigenvectors.
- **Multivariate Normal Distribution:** A multivariate normal distribution is characterized by its mean \( \boldsymbol{m} \) and covariance matrix \( \boldsymbol{C} \). The covariance matrix has a geometric interpretation as an ellipsoid: the principal axes point along the eigenvectors of \( \boldsymbol{C} \), and the squared axis lengths are the corresponding eigenvalues.
- **Randomized Black Box Optimization:** In black box optimization, the goal is to minimize an objective function \( f \) using only function evaluations, without knowledge of its analytic form. The CMA-ES generates new search points by sampling from a multivariate normal distribution.

**Basic Equation: Sampling:**

- New search points are sampled from a multivariate normal distribution with mean \( \boldsymbol{m}^{(g)} \) and covariance matrix \( \boldsymbol{C}^{(g)} \), scaled by a step size \( \sigma^{(g)} \): \( \boldsymbol{x}_k^{(g+1)} \sim \boldsymbol{m}^{(g)} + \sigma^{(g)} \mathcal{N}\!\left(\boldsymbol{0}, \boldsymbol{C}^{(g)}\right) \).

**Selection and Recombination: Moving the Mean:**

- The new mean \( \boldsymbol{m}^{(g+1)} \) is calculated as a weighted average of the best \( \mu \) individuals from the current population, using recombination weights \( w_i \).

**Adapting the Covariance Matrix:**

- **Estimating the Covariance Matrix From Scratch:** The empirical covariance matrix \( \boldsymbol{C}_{\text{emp}}^{(g+1)} \), computed from all sampled points, is an unbiased estimator of the original covariance matrix \( \boldsymbol{C}^{(g)} \). A better estimate for search purposes is obtained by computing the covariance of only the best \( \mu \) points.
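The sampling and recombination steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the full algorithm: the dimension, population size, logarithmic weights, and the sphere objective are illustrative choices, and the covariance square root is taken via the eigendecomposition described in the preliminaries.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 5        # problem dimension (illustrative)
lam = 10     # population size lambda
mu = 5       # number of selected parents

m = np.zeros(n)   # distribution mean m^(g)
sigma = 0.5       # step size sigma^(g)
C = np.eye(n)     # covariance matrix C^(g)

def sphere(x):
    """Toy objective to minimize (stand-in for the black box f)."""
    return float(np.dot(x, x))

# Eigendecomposition C = B diag(d^2) B^T gives C^(1/2) = B diag(d) B^T
d2, B = np.linalg.eigh(C)
C_sqrt = B @ np.diag(np.sqrt(d2)) @ B.T

# Sampling: x_k = m + sigma * C^(1/2) z_k, with z_k ~ N(0, I)
Z = rng.standard_normal((lam, n))
X = m + sigma * Z @ C_sqrt.T

# Selection: keep the mu best points by objective value
order = np.argsort([sphere(x) for x in X])
X_sel = X[order[:mu]]

# Recombination: weighted average with positive, decreasing weights
w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
w /= w.sum()
m_new = w @ X_sel   # new mean m^(g+1)
```

In a full implementation this loop would repeat over generations, with \( \boldsymbol{C} \) and \( \sigma \) adapted between iterations as described below.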
- **Rank-\(\mu\)-Update:** This method updates the covariance matrix using a weighted sum of outer products of the selected steps from the current generation, and can also exploit information from previous generations. The learning rate \( c_\mu \) is crucial for balancing the influence of past information (the current \( \boldsymbol{C} \)) against recent information (the new outer products).
- **Rank-One-Update:** This method updates the covariance matrix with the outer product of a single vector, the evolution path, which cumulates the selected mean shifts over successive generations. Unlike a sum of squared steps, the evolution path retains the sign and direction of the selected steps, enhancing the adaptation process.

**Discussion:**

- The choice of parameters, such as the learning rates \( c_m \) and \( c_\mu \), is critical for the performance of the CMA-ES. Optimal settings are often problem-dependent but generally follow certain guidelines, such as \( c_\mu \approx \mu_{\text{eff}} / n^2 \).
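The rank-one and rank-\(\mu\) updates can be combined into a single covariance update, as in the standard CMA-ES. The sketch below uses illustrative placeholder values for the learning rates \( c_1 \) and \( c_c \) and for the selected points; only \( c_\mu \approx \mu_{\text{eff}} / n^2 \) follows the guideline quoted above, and the tuned default settings differ.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 5
mu = 5
sigma = 0.5

m_old = np.zeros(n)           # mean before the update
m_new = 0.1 * np.ones(n)      # mean after recombination (illustrative)
C = np.eye(n)                 # current covariance matrix
p_c = np.zeros(n)             # evolution path for C

# Selected steps from the current generation (illustrative stand-ins)
X_sel = rng.standard_normal((mu, n))
w = np.full(mu, 1.0 / mu)     # equal recombination weights for simplicity

mu_eff = 1.0 / np.sum(w**2)   # variance-effective selection mass
c1 = 2.0 / n**2               # rank-one learning rate (placeholder)
cmu = mu_eff / n**2           # rank-mu learning rate (guideline from the text)
cc = 4.0 / n                  # path cumulation factor (placeholder)

# Rank-one part: cumulate the evolution path from the mean shift
p_c = (1 - cc) * p_c + np.sqrt(cc * (2 - cc) * mu_eff) * (m_new - m_old) / sigma

# Rank-mu part: weighted outer products of the selected steps
Y = (X_sel - m_old) / sigma
C_mu = sum(wi * np.outer(y, y) for wi, y in zip(w, Y))

# Combined covariance update
C = (1 - c1 - cmu) * C + c1 * np.outer(p_c, p_c) + cmu * C_mu
```

Because the update shrinks the old \( \boldsymbol{C} \) and adds only positive semidefinite terms, the result stays symmetric positive definite as long as \( c_1 + c_\mu < 1 \).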