When you've written a custom CUDA kernel or a Triton op, autograd can't differentiate it for you: you write the backward pass by hand. This series walks through the backward derivation for every standard layer and verifies your work with finite-difference gradcheck. No shortcuts: you derive, you implement, you debug.
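As a taste of the verification loop the series uses, here is a minimal scalar sketch: a hand-derived gradient for SiLU (x · sigmoid(x)) checked against a central finite difference. The op, step size, and tolerance are illustrative choices, not taken from the problems themselves.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def silu(x):
    # forward: f(x) = x * sigmoid(x)
    return x * sigmoid(x)

def silu_grad(x):
    # hand-derived backward: f'(x) = s + x * s * (1 - s), where s = sigmoid(x)
    s = sigmoid(x)
    return s + x * s * (1.0 - s)

def fd_grad(f, x, eps=1e-5):
    # central finite difference: (f(x + eps) - f(x - eps)) / (2 * eps)
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

# the gradcheck idea: analytic and numeric gradients must agree pointwise
for x in [-2.0, -0.5, 0.0, 1.3, 3.7]:
    analytic = silu_grad(x)
    numeric = fd_grad(silu, x)
    assert abs(analytic - numeric) < 1e-6, (x, analytic, numeric)
```

The same idea scales to tensors: perturb one input element at a time, compare against the analytic Jacobian-vector product, and run in float64 so truncation error doesn't mask real bugs.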
12 problems · suggested order