TorchedUp
LearnBetaProblemsSystem DesignSoonPremium
TorchedUp
LearnBetaProblemsSystem DesignSoonPremium
←

13. Backprop: 2-Layer MLP

Hard

Implement the backward pass for a 2-layer MLP with ReLU activation.

Architecture: z1 = x@W1+b1; a1 = relu(z1); z2 = a1@W2+b2

Signature: def backprop_mlp(x, W1, b1, W2, b2, dL_dz2) -> tuple

Return (dW1, db1, dW2, db2).

Math

δ(l)=(W(l+1))⊤δ(l+1)⊙σ′(z(l))

Asked at

Python 30/10 runs today

Output

Anything you print() in your code will show up here after you click Run.

Test Results

○simple 2->2->1
○relu masks gradient
○batch size 2🔒 Premium
○gradient matches central-difference numerical estimate
○batched 3D input (B=2, T=3, in=2 -> hidden=2 -> out=1)🔒 Premium