This paper evaluates three variants of the Gated Recurrent Unit (GRU) in recurrent neural networks (RNNs), each obtained by reducing the number of parameters in the update and reset gates. The three variant GRU models are evaluated on the MNIST and IMDB datasets and are shown to perform comparably to the original GRU RNN model while reducing computational expense.
GRUs are a type of RNN that uses gating mechanisms to control the flow of information. They have fewer gates than Long Short-Term Memory (LSTM) RNNs, which reduces the number of parameters and the computational cost. The paper explores three new gate variants of the GRU with reduced parameterization: GRU1 computes the update and reset gates from the previous hidden state and the bias, GRU2 from the previous hidden state only, and GRU3 from the bias only. These variants are evaluated on two public datasets, MNIST and IMDB.
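To make the gate variants concrete, the following is a minimal NumPy sketch of a single GRU time step with each variant's gate computation. The variant definitions follow the paper, but the dimensions, initialization, and function names are illustrative assumptions rather than the authors' implementation; note that only the update and reset gates shed parameters, while the candidate-state computation is unchanged.

```python
# Minimal sketch of one GRU time step and the three gate variants (assumed
# shapes and initialization; not the authors' implementation).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, p, variant="gru0"):
    """One recurrent step; `variant` selects which terms the gates keep."""
    def gate(W, U, b):
        if variant == "gru0":   # original GRU: input, recurrent, and bias terms
            return sigmoid(x @ W + h @ U + b)
        if variant == "gru1":   # previous hidden state and bias only
            return sigmoid(h @ U + b)
        if variant == "gru2":   # previous hidden state only
            return sigmoid(h @ U)
        if variant == "gru3":   # bias only
            return sigmoid(b)
        raise ValueError(f"unknown variant: {variant}")

    z = gate(p["Wz"], p["Uz"], p["bz"])      # update gate
    r = gate(p["Wr"], p["Ur"], p["br"])      # reset gate
    h_cand = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"] + p["bh"])  # candidate state
    return (1.0 - z) * h + z * h_cand        # blend previous and candidate state

# Toy dimensions (e.g. one 28-pixel MNIST row per step, 100 hidden units).
n_in, n_hid = 28, 100
rng = np.random.default_rng(0)
shapes = {"Wz": (n_in, n_hid), "Uz": (n_hid, n_hid), "bz": (n_hid,),
          "Wr": (n_in, n_hid), "Ur": (n_hid, n_hid), "br": (n_hid,),
          "Wh": (n_in, n_hid), "Uh": (n_hid, n_hid), "bh": (n_hid,)}
p = {k: 0.1 * rng.standard_normal(s) for k, s in shapes.items()}

x, h = rng.standard_normal(n_in), np.zeros(n_hid)
for v in ("gru0", "gru1", "gru2", "gru3"):
    print(v, gru_step(x, h, p, variant=v).shape)

# Gate parameters per cell (update + reset gates only), showing the reduction:
# gru0: 2*(n_in*n_hid + n_hid*n_hid + n_hid)   gru1: 2*(n_hid*n_hid + n_hid)
# gru2: 2*n_hid*n_hid                          gru3: 2*n_hid
```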
For the MNIST dataset, pixel-wise and row-wise sequences are generated. The pixel-wise sequences have length 784 (one pixel per time step), while the row-wise sequences have length 28 (one 28-pixel row per time step). The GRU variants are evaluated on these sequences, and their performance is compared to that of the original GRU RNN. The results show that GRU1 and GRU2 perform almost as well as the original GRU RNN, while GRU3 falls behind at certain learning rates.
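As an illustration of the two input formats, the snippet below reshapes a single 28×28 image into the pixel-wise and row-wise sequences; the random array is a stand-in for an MNIST digit, and the actual dataset loading is omitted.

```python
# Turning one 28x28 image into the two sequence formats described above.
# The random array is a placeholder for a real MNIST digit.
import numpy as np

image = np.random.rand(28, 28).astype(np.float32)

# Pixel-wise: 784 time steps, each a single scalar pixel value.
pixel_seq = image.reshape(784, 1)

# Row-wise: 28 time steps, each a 28-dimensional row of pixels.
row_seq = image.reshape(28, 28)

print(pixel_seq.shape)  # (784, 1)
print(row_seq.shape)    # (28, 28)
```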
For the IMDB dataset, the GRU variants are evaluated on natural-language sequences (movie reviews labeled for sentiment). The results show that all three GRU variants perform comparably to the original GRU RNN while using fewer parameters. The GRU3 variant, which uses only the bias in its gate signals, learns at a pace similar to the other variants under a constant base learning rate.
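For context, a GRU-based IMDB sentiment classifier is typically assembled along the lines of the Keras sketch below. The vocabulary cap, sequence length, and layer widths are illustrative assumptions rather than the configuration reported in the paper, and the paper's variants would replace the gate computation inside the standard GRU layer.

```python
# Hedged sketch of a GRU sentiment classifier for IMDB-style data (assumed
# hyperparameters; not the paper's exact setup).
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size = 20000   # assumed cap on the IMDB word index
max_len = 200        # assumed padded/truncated review length

model = models.Sequential([
    layers.Input(shape=(max_len,)),         # integer word-index sequences
    layers.Embedding(vocab_size, 128),      # learned word embeddings
    layers.GRU(128),                        # standard GRU; the paper's variants
                                            # would alter its gate computation
    layers.Dense(1, activation="sigmoid"),  # positive/negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()

# Training on the built-in IMDB dataset would look roughly like:
# (x_tr, y_tr), _ = tf.keras.datasets.imdb.load_data(num_words=vocab_size)
# x_tr = tf.keras.utils.pad_sequences(x_tr, maxlen=max_len)
# model.fit(x_tr, y_tr, epochs=3, batch_size=64, validation_split=0.2)
```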
The paper concludes that the GRU variants remove redundancy in the gate signals and can therefore perform comparably to the original GRU RNN. While GRU1 and GRU2 perform on par with the original GRU RNN, GRU3 frequently lags, especially on longer sequences. The paper suggests that further experiments on more diverse datasets are needed to validate the performance of these variants.