Backpropagation Series

When you've written a custom CUDA kernel or a Triton op, autograd can't differentiate it for you: you write the backward pass by hand. This series walks through the backward derivation for every standard layer and verifies your work via finite-difference gradcheck. No shortcuts: you derive, you implement, you debug.
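
For concreteness, here is a minimal sketch of that workflow, using sigmoid (problem #202) as the example. The class name is illustrative, not the site's starter code: you implement forward and backward inside a `torch.autograd.Function`, then let `torch.autograd.gradcheck` compare your handwritten gradient against finite differences.

```python
import torch

class HandwrittenSigmoid(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        y = torch.sigmoid(x)
        ctx.save_for_backward(y)  # the backward pass only needs the output
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (y,) = ctx.saved_tensors
        # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)), so the chain rule gives:
        return grad_out * y * (1 - y)

# gradcheck perturbs each input element, builds a numerical Jacobian, and
# compares it against the one implied by backward(); float64 inputs keep
# the finite-difference error within the default tolerances.
x = torch.randn(4, 5, dtype=torch.float64, requires_grad=True)
assert torch.autograd.gradcheck(HandwrittenSigmoid.apply, (x,))
```

The same loop of derive, implement, gradcheck applies to every problem below; only the derivation gets harder.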

12 problems · suggested order

  1. #200 · Backprop: Linear (matmul + bias) · easy
  2. #201 · Backprop: ReLU · easy
  3. #202 · Backprop: Sigmoid · easy
  4. #203 · Backprop: Tanh · easy
  5. #204 · Backprop: Softmax · medium
  6. #205 · Backprop: Softmax + Cross-Entropy (fused) · medium
  7. #208 · Backprop: Embedding lookup · medium
  8. #211 · Backprop: RoPE rotation · medium
  9. #206 · Backprop: LayerNorm · hard
  10. #207 · Backprop: BatchNorm (train mode) · hard
  11. #209 · Backprop: Conv2d (via im2col) · hard
  12. #210 · Backprop: Attention head · hard
Tracks are curated by hand. The order above is the suggested learning progression; feel free to skip around if you already know a topic. A worked sample of one derivation follows below.
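
As a taste of the kind of closed form you will derive, consider problem #205: for mean-reduced cross-entropy on softmax outputs with integer class targets, the gradient with respect to the logits collapses to (softmax(z) - one_hot(target)) / batch_size. The sketch below checks that result against autograd; `fused_ce_backward` is a hypothetical helper name, not the problem's starter code.

```python
import torch
import torch.nn.functional as F

def fused_ce_backward(logits, target):
    # Closed-form dL/dz for mean-reduced softmax + cross-entropy:
    # dL/dz = (softmax(z) - one_hot(target)) / batch_size
    probs = torch.softmax(logits.detach(), dim=-1)
    probs[torch.arange(len(target)), target] -= 1.0
    return probs / len(target)

logits = torch.randn(8, 10, dtype=torch.float64, requires_grad=True)
target = torch.randint(0, 10, (8,))

F.cross_entropy(logits, target).backward()  # autograd's answer ('mean' reduction)
assert torch.allclose(logits.grad, fused_ce_backward(logits, target))
```

Showing why the softmax Jacobian composed with the log-loss collapses this cleanly is exactly the point of the problem.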
