13 Jul 2018 | Richard Liaw*, Eric Liang*, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, Ion Stoica
The paper introduces Tune, a unified framework for distributed model selection and training. Tune addresses the computational demands of modern machine learning by providing a narrow-waist interface between training scripts and search algorithms. This interface simplifies the implementation of hyperparameter search algorithms, allows straightforward scaling to large clusters, and improves the reproducibility of search algorithms. The authors demonstrate several state-of-the-art hyperparameter search algorithms implemented in Tune, highlighting its flexibility and ease of use.

Tune is built on the Ray distributed computing framework, whose task and actor abstractions enable efficient scheduling of irregular computations and fine-grained resource management. The paper also lays out the requirements for API generality: handling irregular computations, expressing per-trial resource requirements, and surfacing intermediate trial results to the search algorithm.

Tune exposes two interfaces on either side of this narrow waist: a user API for describing model training (sketched below) and a scheduling API through which researchers implement and improve search algorithms (also sketched below). The implementation is detailed as well, including the integration with Ray and the handling of data input and distributed computation. The paper concludes with a discussion of Tune's design and directions for future work in tuning, analysis, and debugging.
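The function-based user API is easiest to see in code. The sketch below is modeled on the example in the paper: a training function receives a `config` of hyperparameters and a `reporter` handle for intermediate results, and `run_experiments` launches one trial per grid point. Exact module paths and keyword names (e.g., `"resources"` versus the later `"resources_per_trial"`) have shifted across Ray releases, so treat this as illustrative rather than exact:

```python
import time
from ray.tune import grid_search, register_trainable, run_experiments

def my_func(config, reporter):
    # Trials report intermediate results through the reporter callback;
    # this is how Tune's schedulers observe partial progress and can
    # stop or pause a trial early.
    i = 0
    while True:
        reporter(timesteps_total=i, mean_accuracy=i ** config["alpha"])
        i += config["beta"]
        time.sleep(0.01)

register_trainable("my_func", my_func)

run_experiments({
    "my_experiment": {
        "run": "my_func",
        "resources": {"cpu": 1, "gpu": 0},          # per-trial resource request
        "stop": {"mean_accuracy": 100},             # stopping condition
        "config": {
            "alpha": grid_search([0.2, 0.4, 0.6]),  # 3 x 2 = 6 trials
            "beta": grid_search([1, 2]),
        },
    },
})
```

Note how the per-trial resource request and the reporter callback directly address two of the generality requirements above: Ray schedules each trial against the declared resources, and the stream of reported results gives the search algorithm the intermediate signal it needs.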
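On the other side of the narrow waist, the scheduling API lets researchers express search algorithms as callbacks over trial lifecycle events. The sketch below is a hypothetical early-stopping scheduler in the spirit of median stopping; the `TrialScheduler` method names follow Ray Tune's interface, but the class and its logic here are an illustration under that assumption, not the paper's implementation:

```python
from ray.tune.schedulers import FIFOScheduler, TrialScheduler

class MedianStoppingSketch(FIFOScheduler):
    """Illustrative scheduler: stop a trial whose latest mean_accuracy
    falls below the median of the best results of all other trials."""

    def __init__(self):
        super().__init__()
        self._best = {}  # trial -> best mean_accuracy observed so far

    def on_trial_result(self, trial_runner, trial, result):
        acc = result.get("mean_accuracy")
        if acc is None:
            return TrialScheduler.CONTINUE
        self._best[trial] = max(acc, self._best.get(trial, acc))
        others = sorted(v for t, v in self._best.items() if t is not trial)
        if others and acc < others[len(others) // 2]:
            # Free this trial's resources for more promising trials.
            return TrialScheduler.STOP
        return TrialScheduler.CONTINUE
```

Because a scheduling decision is just a return value consumed by Tune's trial runner, the same callback style can also pause trials and reallocate their resources, which is how algorithms such as HyperBand and Population Based Training are expressed in Tune.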