28 Nov 2017 | Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis
The paper introduces *Probability Density Distillation*, a novel method for training a parallel feed-forward network from a trained WaveNet model, achieving high-fidelity speech synthesis more than 20 times faster than real time. The method allows efficient sampling while preserving the quality of the original WaveNet, making it suitable for real-time production settings. The paper details the original autoregressive WaveNet, the parallel WaveNet architecture, and the distillation process. Experimental results show no significant loss in quality compared to the original WaveNet and superior performance over previous benchmarks. The resulting system has been deployed in production in the Google Assistant, serving multiple English and Japanese voices to millions of users.
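As a rough illustration of how the distillation objective fits together, here is a minimal sketch of the loss in PyTorch. It assumes a hypothetical `student` (an inverse autoregressive flow that maps logistic noise to audio in parallel, returning per-timestep log-scales) and a hypothetical `teacher` exposing autoregressive log-likelihoods; the interfaces and shapes are illustrative, not the paper's actual code, and the cross-entropy term uses a naive single-sample estimate where the paper derives a lower-variance per-timestep estimator.

```python
import torch

def distillation_loss(student, teacher, batch_size=8, length=16000):
    # Sample standard logistic noise, the student's input
    # (inverse-CDF transform of uniform samples).
    u = torch.rand(batch_size, length)
    z = torch.log(u) - torch.log1p(-u)

    # Student IAF: x_t = mu_t(z_<t) + s_t(z_<t) * z_t, computed in
    # parallel across time. Assumed to return the audio x and the
    # per-timestep log-scales log s_t, each of shape (batch, length).
    x, log_s = student(z)

    # Entropy term H(P_S): closed form for an IAF driven by standard
    # logistic noise, H = sum_t E[ln s_t] + 2T (the logistic's entropy
    # is 2 nats per step).
    T = x.shape[1]
    entropy = log_s.sum(dim=1).mean() + 2.0 * T

    # Cross-entropy term H(P_S, P_T): score the student's own samples
    # under the autoregressive teacher. This runs in parallel at
    # training time because all of x is already known.
    teacher_logprob = teacher.log_prob(x)          # (batch, length)
    cross_entropy = -teacher_logprob.sum(dim=1).mean()

    # KL(P_S || P_T) = H(P_S, P_T) - H(P_S): the student is pushed
    # toward the teacher's distribution without collapsing to a mode.
    return cross_entropy - entropy
```

Note the role of the entropy term: minimizing the cross-entropy alone would let the student collapse onto the teacher's modes, whereas the KL form rewards the student for keeping its samples diverse while still scoring well under the teacher.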