
LSTM Cell (Medium)

The Long Short-Term Memory (LSTM) cell uses gating mechanisms to control information flow, mitigating the vanishing-gradient problem of vanilla RNNs. It maintains two states: the hidden state h (short-term memory) and the cell state c (long-term memory).

Given input x, previous hidden state h_prev, previous cell state c_prev, and concatenated weight matrices:

  • W_ih: (4·H, input_size) — input-to-hidden weights
  • W_hh: (4·H, H) — hidden-to-hidden weights
  • b_ih, b_hh: (4·H,) — biases

Compute the gate pre-activations gates = W_ih @ x + b_ih + W_hh @ h_prev + b_hh (shape (4·H,)), slice them in order [i, f, g, o] (each of size H), then update the states:

i = sigmoid(gates[:H])       # input gate
f = sigmoid(gates[H:2*H])    # forget gate
g = tanh(gates[2*H:3*H])     # cell gate
o = sigmoid(gates[3*H:])     # output gate

c_next = f * c_prev + i * g
h_next = o * tanh(c_next)

Signature: def lstm_cell(x, h_prev, c_prev, W_ih, W_hh, b_ih, b_hh)

Returns: (h_next, c_next) — both shape (H,)
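
A minimal numpy sketch of the cell is below. It assumes a plain logistic sigmoid helper and the single-example 1-D shapes and [i, f, g, o] gate order stated above; it is a reference sketch, not the only accepted solution.

import numpy as np

def sigmoid(z):
    # plain logistic; adequate for a reference sketch
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev, W_ih, W_hh, b_ih, b_hh):
    H = h_prev.shape[0]
    # gate pre-activations, shape (4*H,)
    gates = W_ih @ x + b_ih + W_hh @ h_prev + b_hh
    i = sigmoid(gates[:H])        # input gate
    f = sigmoid(gates[H:2*H])     # forget gate
    g = np.tanh(gates[2*H:3*H])   # cell gate (candidate values)
    o = sigmoid(gates[3*H:])      # output gate
    c_next = f * c_prev + i * g   # blend old cell state with new candidate
    h_next = o * np.tanh(c_next)  # expose a gated view of the cell state
    return h_next, c_next

As a sanity check, torch.nn.LSTMCell uses the same (4·H, …) weight layout and the same [i, f, g, o] chunk order, so running it with identical weights should reproduce these outputs to floating-point tolerance.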

Test Results

  • basic forward pass
  • forget gate passes cell state through
  • non-zero hidden state (Premium)
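
As a rough illustration of the second check (this is a hypothetical sketch, not the grader's actual test): with zero weights and biases chosen so the forget gate saturates near 1 and the input gate near 0, c_next should come back essentially equal to c_prev. It reuses lstm_cell from the sketch above.

import numpy as np

def test_forget_gate_passes_cell_state():
    rng = np.random.default_rng(0)
    input_size, H = 3, 4
    x = rng.normal(size=input_size)
    h_prev = np.zeros(H)
    c_prev = rng.normal(size=H)
    # zero weights so only the biases drive the gates
    W_ih = np.zeros((4 * H, input_size))
    W_hh = np.zeros((4 * H, H))
    b_ih = np.zeros(4 * H)
    b_hh = np.zeros(4 * H)
    b_ih[:H] = -50.0       # input gate  -> sigmoid(-50) ~ 0
    b_ih[H:2 * H] = 50.0   # forget gate -> sigmoid(+50) ~ 1
    h_next, c_next = lstm_cell(x, h_prev, c_prev, W_ih, W_hh, b_ih, b_hh)
    # cell gate g = tanh(0) = 0, so c_next = 1 * c_prev + 0 * 0 = c_prev
    np.testing.assert_allclose(c_next, c_prev, atol=1e-6)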