The implementation in the editor is supposed to be a single-layer Transformer encoder block, but it has 5 bugs. Find and fix all of them.
Signature: def transformer_encoder_buggy(x, Wq, Wk, Wv, Wo, W1, b1, W2, b2, ln1_g, ln1_b, ln2_g, ln2_b, n_heads=2)
The block is expected to perform: multi-head scaled dot-product self-attention with output projection Wo, a residual connection and LayerNorm, a position-wise feed-forward network (Linear → ReLU → Linear), and a second residual + LayerNorm. The output shape is (S, d_model).
Treat the existing code as the contract template — keep the same overall structure, but make every step numerically correct and faithful to a standard pre-/post-LayerNorm encoder block.
Asked at
import numpy as np
def transformer_encoder_buggy(...):
pass
Premium problem
Free accounts include problems #1–20. Upgrade to unlock the editor, hidden test cases, and reference solutions for every problem.
Already premium?