How to Construct Deep Recurrent Neural Networks


24 Apr 2014 | Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, and Yoshua Bengio
This paper explores ways to extend a recurrent neural network (RNN) into a deep RNN. The authors argue that the concept of depth in an RNN is less clear than in a feedforward network, and they identify three parts of an RNN that can each be made deeper: the input-to-hidden function, the hidden-to-hidden transition, and the hidden-to-output function. Based on this observation, they propose two novel deep RNN architectures that are orthogonal to the earlier approach of stacking recurrent layers, and they introduce a framework based on neural operators to interpret these deep RNNs.

The paper discusses the resulting variants: the deep transition RNN (DT-RNN), which makes the hidden-to-hidden transition deeper; the deep output RNN (DO-RNN), which makes the hidden-to-output function deeper; their combination, the deep output and deep transition RNN (DOT-RNN); and the stacked RNN (sRNN), which stacks several recurrent layers. Because the added depth makes learning more difficult, the authors also introduce shortcut connections that bypass the intermediate layers (see the sketch below).

The proposed deep RNNs are evaluated on polyphonic music prediction and language modeling, where they outperform conventional, shallow RNNs. In particular, the DOT-RNN outperforms both the conventional RNN and the stacked RNN on language modeling, and the proposed deep RNNs benefit from non-saturating activation functions and dropout in much the same way feedforward neural networks do.
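To make the two new kinds of depth concrete, here is a minimal NumPy sketch of a single DOT-RNN time step with an optional shortcut connection. It is an illustration under assumed choices, not the authors' implementation: the weight names, the tanh activations, and the softmax output are assumptions made for the example.

```python
import numpy as np

def dot_rnn_step(x_t, h_prev, p, use_shortcut=True):
    """One time step of a DOT-RNN-style update (illustrative sketch only).

    Parameter names (W_xz, W_hz, W_zh, W_hh, W_ho, W_oy and the biases)
    are hypothetical; tanh and softmax are assumed activation choices.
    """
    # Deep hidden-to-hidden transition: an intermediate layer z_t sits
    # between the previous state h_{t-1} (plus input x_t) and the new state h_t.
    z_t = np.tanh(p["W_xz"] @ x_t + p["W_hz"] @ h_prev + p["b_z"])
    h_t = p["W_zh"] @ z_t + p["b_h"]
    if use_shortcut:
        # Shortcut connection: h_{t-1} also feeds h_t directly, which the
        # paper uses to ease learning through the deeper transition.
        h_t = h_t + p["W_hh"] @ h_prev
    h_t = np.tanh(h_t)
    # Deep hidden-to-output function: an intermediate layer o_t sits
    # between h_t and the prediction y_t (softmax over output classes).
    o_t = np.tanh(p["W_ho"] @ h_t + p["b_o"])
    logits = p["W_oy"] @ o_t + p["b_y"]
    y_t = np.exp(logits - logits.max())
    return h_t, y_t / y_t.sum()


# Tiny usage example with random weights
# (dims: input 4, hidden 8, intermediate layers 16, output 3).
rng = np.random.default_rng(0)
dims = {"W_xz": (16, 4), "W_hz": (16, 8), "W_zh": (8, 16), "W_hh": (8, 8),
        "W_ho": (16, 8), "W_oy": (3, 16), "b_z": (16,), "b_h": (8,),
        "b_o": (16,), "b_y": (3,)}
p = {k: 0.1 * rng.standard_normal(s) for k, s in dims.items()}
h, y = dot_rnn_step(rng.standard_normal(4), np.zeros(8), p)
print(h.shape, y.shape, y.sum())  # (8,) (3,) ~1.0
```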
The paper concludes that RNNs benefit from depth in much the same way feedforward neural networks do, and that the proposed architectures are effective for modeling complex sequences. However, training deep RNNs remains challenging, and further research is needed to improve their performance and efficiency.
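For comparison with the DOT-RNN sketch above, the stacked RNN (sRNN) mentioned in the summary adds depth differently, by running several recurrent layers on top of one another. The sketch below shows one sRNN time step, again with hypothetical parameter names and assumed tanh activations.

```python
import numpy as np

def srnn_step(x_t, h_prev_layers, params):
    """One time step of a stacked RNN (sRNN): each layer l is a standard
    recurrent layer whose input is the hidden state of layer l-1.

    Illustrative sketch; `params` is a list of per-layer dicts with
    hypothetical keys U (input weights), W (recurrent weights), b (bias).
    """
    h_new = []
    inp = x_t
    for h_prev, p in zip(h_prev_layers, params):
        h = np.tanh(p["U"] @ inp + p["W"] @ h_prev + p["b"])
        h_new.append(h)
        inp = h  # this layer's state becomes the next layer's input
    return h_new


# Usage: two stacked layers of size 8 on a 4-dimensional input.
rng = np.random.default_rng(1)
layer_sizes, x_dim = [8, 8], 4
in_dims = [x_dim] + layer_sizes[:-1]
params = [{"U": 0.1 * rng.standard_normal((d_out, d_in)),
           "W": 0.1 * rng.standard_normal((d_out, d_out)),
           "b": np.zeros(d_out)}
          for d_in, d_out in zip(in_dims, layer_sizes)]
h = srnn_step(rng.standard_normal(x_dim), [np.zeros(s) for s in layer_sizes], params)
print([v.shape for v in h])  # [(8,), (8,)]
```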