What is Vanilla RNNs?


Vanilla RNNs Gradient Flow

For computation efficiency, we simplify calculation with:


Problem with Vanilla RNNs Gradient Flow

Repeated and in backpropagation

When doing backpropagation, we can observe from the computational graph that we’ll need to repeatedly multiply the gradient by and

Problem with

Problem with

Largest singular value > 1 Exploding Gradients Largest singular value < 1 Vanishing Gradients

Solution

We give up Vanilla RNNs and use LSTM instead