Hyperparameters and Tuning Strategies for Random Forest

February 27, 2019 | Philipp Probst, Marvin Wright and Anne-Laure Boulesteix
The paper "Hyperparameters and Tuning Strategies for Random Forest" by Philipp Probst, Marvin Wright, and Anne-Laure Boulesteix reviews the impact of hyperparameters on the performance and variable importance measures of the random forest (RF) algorithm. It highlights that while RF often performs well with default hyperparameters, tuning these parameters can significantly improve its performance. The authors discuss various tuning strategies, including model-based optimization (MBO), and introduce the tuneRanger R package, which automates the tuning process. A benchmark study on multiple datasets compares the prediction performance and runtime of tuneRanger with other tuning implementations, showing that tuneRanger generally outperforms other methods, especially for larger datasets. The paper also provides detailed insights into the influence of specific hyperparameters, such as the number of randomly drawn candidate variables (*mtry*), sampling scheme, node size, and the number of trees, on both performance and variable importance.The paper "Hyperparameters and Tuning Strategies for Random Forest" by Philipp Probst, Marvin Wright, and Anne-Laure Boulesteix reviews the impact of hyperparameters on the performance and variable importance measures of the random forest (RF) algorithm. It highlights that while RF often performs well with default hyperparameters, tuning these parameters can significantly improve its performance. The authors discuss various tuning strategies, including model-based optimization (MBO), and introduce the tuneRanger R package, which automates the tuning process. A benchmark study on multiple datasets compares the prediction performance and runtime of tuneRanger with other tuning implementations, showing that tuneRanger generally outperforms other methods, especially for larger datasets. 
The paper also provides detailed insights into the influence of specific hyperparameters, such as the number of randomly drawn candidate variables (*mtry*), sampling scheme, node size, and the number of trees, on both performance and variable importance.
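To make the tuning idea concrete, here is a minimal sketch of searching over the hyperparameters the paper discusses, using scikit-learn's random forest in Python as an illustrative analogue (the paper's own tuneRanger is an R package; the dataset, parameter grid, and search settings below are assumptions for illustration). The scikit-learn names map roughly onto the paper's terms: `max_features` corresponds to *mtry*, `min_samples_leaf` to node size, `n_estimators` to the number of trees, and `max_samples` to the bootstrap sampling fraction.

```python
# Illustrative sketch, NOT the paper's tuneRanger (which is R-based):
# random search over the RF hyperparameters the paper identifies as influential.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Example dataset (assumption: any tabular classification task would do).
X, y = load_breast_cancer(return_X_y=True)

param_distributions = {
    "max_features": [0.1, 0.3, 0.5, 0.7, 0.9],  # fraction of candidate variables per split (mtry)
    "min_samples_leaf": [1, 5, 10, 20],         # minimum terminal node size
    "max_samples": [0.5, 0.7, 0.9, None],       # bootstrap sample fraction (sampling scheme)
}

search = RandomizedSearchCV(
    RandomForestClassifier(n_estimators=200, random_state=0),  # number of trees fixed here
    param_distributions,
    n_iter=10,       # budget of random configurations to evaluate
    cv=3,            # cross-validated performance estimate per configuration
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
print(round(search.best_score_, 3))
```

tuneRanger instead uses model-based optimization, which fits a surrogate model to past evaluations and proposes promising configurations sequentially, typically reaching good settings in fewer evaluations than plain random search.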