Backprop: 2-Layer MLP

Implement the backward pass for a 2-layer MLP with ReLU activation.

Architecture: z1 = x@W1+b1; a1 = relu(z1); z2 = a1@W2+b2

Signature: def backprop_mlp(x, W1, b1, W2, b2, dL_dz2) -> tuple

Return (dW1, db1, dW2, db2).

Math

Asked at

Python (numpy)0/3 runs today

Test Results

○simple 2->2->1

○relu masks gradient

○batch size 2🔒 Premium

○gradient matches central-difference numerical estimate