This chapter, by Guy Shani and Asela Gunawardana, covers the evaluation of recommendation systems. It stresses that choosing an appropriate algorithm for a recommendation system means weighing several properties, such as accuracy, robustness, and scalability. The authors discuss three types of experiments: offline experiments, user studies, and large-scale online experiments. For each type, they describe the questions it can answer and the protocol for conducting it. The chapter also reviews a wide range of system properties and the metrics used to evaluate them. Early recommenders were evaluated mostly on their predictive power, but it is now recognized that other factors, such as user experience and system performance, are equally important. The authors emphasize the need to identify the properties relevant to a specific application, and they give guidance on conducting and interpreting experiments so that the conclusions drawn are reliable.