TorchedUp › Problems (Premium) › Bidirectional RNN (Medium)

Bidirectional RNN

Bidirectional RNNs process sequences in both directions — forward (left-to-right) and backward (right-to-left) — then concatenate the two hidden states at each position. This gives every output access to both past and future context: the same bidirectional-context idea that ELMo realized with stacked bi-LSTMs and that BERT-style encoders later achieved with self-attention.

Each direction uses a standard RNN step:

h_t = tanh(W @ concat([h_{t-1}, x_t]) + b)

The forward direction processes x_0, x_1, ..., x_{T-1} and the backward direction processes x_{T-1}, ..., x_1, x_0, each from its own initial state. The backward outputs are then reversed back into input order, so that position t concatenates h_fwd_t with h_bwd_t.
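A single direction is just the recurrence above applied in order; the helper name `rnn_direction` below is mine, a minimal numpy sketch of one direction (pass `xs` for forward, `xs[::-1]` for backward):

```python
import numpy as np

def rnn_direction(xs, h0, W, b):
    # One direction of the RNN: h_t = tanh(W @ [h_{t-1}; x_t] + b).
    # xs: (T, input_size), h0: (hidden_size,),
    # W: (hidden_size, hidden_size + input_size), b: (hidden_size,)
    h, hs = h0, []
    for x in xs:  # timesteps in whatever order the caller chose
        h = np.tanh(W @ np.concatenate([h, x]) + b)
        hs.append(h)
    return np.stack(hs)  # (T, hidden_size)
```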

Signature: def bidirectional_rnn(xs, h0_fwd, h0_bwd, W_fwd, W_bwd, b_fwd, b_bwd)

  • xs: (T, input_size)
  • h0_fwd, h0_bwd: (hidden_size,) — initial states
  • W_fwd, W_bwd: (hidden_size, hidden_size + input_size) — combined [W_h | W_x]
  • b_fwd, b_bwd: (hidden_size,)
  • Returns: (T, 2·hidden_size) — concat([h_fwd_t, h_bwd_t]) at each timestep
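A minimal numpy sketch matching the signature and shapes above (assuming the tanh step from the problem statement — not a reference solution):

```python
import numpy as np

def bidirectional_rnn(xs, h0_fwd, h0_bwd, W_fwd, W_bwd, b_fwd, b_bwd):
    # h_t = tanh(W @ [h_{t-1}; x_t] + b), run once per direction.
    def run(xs_dir, h0, W, b):
        h, outs = h0, []
        for x in xs_dir:
            h = np.tanh(W @ np.concatenate([h, x]) + b)
            outs.append(h)
        return np.stack(outs)  # (T, hidden_size)

    hs_fwd = run(xs, h0_fwd, W_fwd, b_fwd)
    # The backward pass consumes the sequence reversed, then its outputs
    # are re-reversed so that row t aligns with input x_t.
    hs_bwd = run(xs[::-1], h0_bwd, W_bwd, b_bwd)[::-1]
    return np.concatenate([hs_fwd, hs_bwd], axis=1)  # (T, 2*hidden_size)
```

Note the re-reversal of the backward outputs: without it, row t would pair h_fwd_t with h_bwd_{T-1-t}, which is the most common bug in this problem.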

Math

Python (numpy)

Test Results

  • seed 42, T=4, I=2, H=3, zero init
  • seed 0, T=3, I=3, H=2, non-zero init and biases
  • T=1 single token: fwd and bwd see the same input (Premium)