Neural Adaptive Video Streaming with Pensieve
Hongzi Mao
Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, June 2017.
Abstract
Client-side video players use adaptive bitrate (ABR) algorithms to meet user quality of experience (QoE) requirements. These algorithms balance multiple QoE factors, such as maximizing video bitrate and minimizing rebuffering time. Despite the abundance of recently proposed ABR algorithms, state-of-the-art schemes face two practical challenges: (1) throughput prediction is difficult, and inaccurate predictions can lead to degraded performance; (2) existing algorithms use fixed heuristics that have been fine-tuned according to strict assumptions about deployment environments, and such tuning precludes generalization across network conditions and QoE objectives.
To overcome these challenges, we develop Pensieve, a system that generates ABR algorithms entirely using Reinforcement Learning (RL). Pensieve uses RL to train a neural network model that selects bitrates for future video chunks based on observations collected by client video players. Unlike existing approaches, Pensieve does not rely upon pre-programmed models or assumptions about the environment. Instead, it learns to make ABR decisions solely through observations of the resulting performance of past decisions. As a result, Pensieve can automatically learn ABR algorithms that adapt to a wide range of environmental conditions and QoE metrics. We compare Pensieve to state-of-the-art ABR algorithms using trace-driven and real-world experiments spanning a wide variety of network conditions, QoE metrics, and video properties. In all considered scenarios, Pensieve outperforms the best state-of-the-art scheme, with improvements in average QoE of 13.1%–25.0%. Pensieve's policies generalize well, outperforming existing schemes even on networks on which it was not trained.
Thesis Supervisor: Mohammad Alizadeh
Title: Assistant Professor of Electrical Engineering and Computer Science
Acknowledgments
This research was performed under the supervision of Professor Mohammad Alizadeh, and it was published in SIGCOMM 2017. My first year of graduate studies was generously supported by the Andrew (1956) and Erna Viterbi Fellowship. In addition, this work was partially supported by the National Science Foundation.
I would like to thank Mohammad for epitomizing what an ideal advisor, mentor, researcher and friend would be. I am truly blessed to have him open my eyes to the landscape of research problems, guide me towards a uniquely interesting path and support me perpetually through all difficulties. It fascinates me every time I think about all the adventures we are going to take in the upcoming years.
I also want to thank Professor Dina Katabi and Professor Fadel Adib for introducing me to the wizard world of MIT. They taught me the hard