TorchedUp

43. LSTM Cell

Medium

The Long Short-Term Memory (LSTM) cell uses gating mechanisms to control information flow, which mitigates the vanishing gradient problem of vanilla RNNs. It maintains two states: the hidden state h (short-term memory) and the cell state c (long-term memory).

Given input x of shape (input_size,), previous hidden state h_prev and previous cell state c_prev, each of shape (H,), and concatenated weight matrices:

  • W_ih: (4·H, input_size) — input-to-hidden weights
  • W_hh: (4·H, H) — hidden-to-hidden weights
  • b_ih, b_hh: (4·H,) — biases
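To make the "concatenated" layout concrete, here is a minimal sketch. It assumes each gate has its own (H, input_size) and (H, H) block and that the blocks are stacked along the first axis in the order [i, f, g, o] used by this problem. The helper `stack_gate_blocks`, the variable names, and the sizes are illustrative and not part of the problem statement.

```python
import numpy as np

def stack_gate_blocks(blocks):
    """Stack four per-gate arrays (order i, f, g, o) along axis 0."""
    return np.concatenate(blocks, axis=0)

H, input_size = 3, 5  # illustrative sizes
rng = np.random.default_rng(0)

# One block per gate, in the assumed order [i, f, g, o].
ih_blocks = [rng.standard_normal((H, input_size)) for _ in range(4)]
hh_blocks = [rng.standard_normal((H, H)) for _ in range(4)]

W_ih = stack_gate_blocks(ih_blocks)  # shape (4*H, input_size)
W_hh = stack_gate_blocks(hh_blocks)  # shape (4*H, H)
b_ih = np.zeros(4 * H)
b_hh = np.zeros(4 * H)

x = rng.standard_normal(input_size)
h_prev = rng.standard_normal(H)

# One stacked matrix-vector product per weight matrix yields the
# pre-activations of all four gates at once, shape (4*H,).
gates = W_ih @ x + b_ih + W_hh @ h_prev + b_hh
```

Under this layout, rows H to 2·H of `gates` match what the forget-gate blocks alone would produce, which is why a single slice per gate is enough.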

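Compute the gate pre-activations gates = W_ih·x + b_ih + W_hh·h_prev + b_hh (shape (4·H,)), then slice them in the order [i, f, g, o] (each of size H) and apply the activations: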

i = sigmoid(gates[:H])        # input gate
f = sigmoid(gates[H:2*H])     # forget gate
g = tanh(gates[2*H:3*H])      # cell gate (candidate values)
o = sigmoid(gates[3*H:])      # output gate
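The slicing step can be sketched in NumPy as follows. The helper names `sigmoid` and `split_gates` are hypothetical, chosen for illustration. The all-zero input is chosen so the result is easy to check by hand.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

def split_gates(gates, H):
    """Slice a (4*H,) pre-activation vector into activated i, f, g, o."""
    i = sigmoid(gates[:H])           # input gate
    f = sigmoid(gates[H:2 * H])      # forget gate
    g = np.tanh(gates[2 * H:3 * H])  # cell gate (candidate values)
    o = sigmoid(gates[3 * H:])       # output gate
    return i, f, g, o

# With H = 2 and all pre-activations zero:
# sigmoid(0) = 0.5 and tanh(0) = 0, so i, f, o are 0.5 and g is 0.
gates = np.zeros(8)
i, f, g, o = split_gates(gates, H=2)
```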

Signature: def lstm_cell(x, h_prev, c_prev, W_ih, W_hh, b_ih, b_hh)

Returns: (h_next, c_next) — both shape (H,)

Math

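$$
\begin{aligned}
\text{gates} &= W_{ih}\,x + b_{ih} + W_{hh}\,h_{t-1} + b_{hh} \\
i, f, g, o &= \text{split}(\text{gates}, 4) \\
c_t &= \sigma(f) \odot c_{t-1} + \sigma(i) \odot \tanh(g) \\
h_t &= \sigma(o) \odot \tanh(c_t)
\end{aligned}
$$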
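One way these equations might be translated into NumPy is sketched below. It is not the site's reference solution. It follows the Math notation, where i, f, g, o are the raw slices and the nonlinearities are applied in the state updates. It also assumes all inputs are 1-D or 2-D arrays with the shapes listed in the problem. The `sigmoid` helper is an illustrative addition.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev, W_ih, W_hh, b_ih, b_hh):
    # Pre-activations for all four gates, shape (4*H,).
    gates = W_ih @ x + b_ih + W_hh @ h_prev + b_hh
    # Four equal slices in the order [i, f, g, o], each of shape (H,).
    i, f, g, o = np.split(gates, 4)
    # Forget part of the old cell state, then add the gated candidate.
    c_next = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    # The hidden state is the output-gated, squashed cell state.
    h_next = sigmoid(o) * np.tanh(c_next)
    return h_next, c_next

# Sanity check with zero weights and biases: every gate pre-activation
# is 0, so sigmoid gives 0.5 and tanh(g) gives 0, and c_next = 0.5 * c_prev.
H, input_size = 2, 3
x = np.ones(input_size)
h_prev = np.zeros(H)
c_prev = np.array([1.0, -2.0])
h_next, c_next = lstm_cell(
    x, h_prev, c_prev,
    np.zeros((4 * H, input_size)), np.zeros((4 * H, H)),
    np.zeros(4 * H), np.zeros(4 * H),
)
```

The plain `1 / (1 + exp(-z))` form can emit overflow warnings for very negative inputs. A numerically stable sigmoid may be preferable if the hidden tests use large magnitudes.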


NumPy

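import numpy as np

def lstm_cell(x, h_prev, c_prev, W_ih, W_hh, b_ih, b_hh):
    # Starter stub: compute the gates and return (h_next, c_next).
    pass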
