Hand-derive the gradient of L = sum(Conv2d(x)) w.r.t. the input x. No padding, stride 1.
Forward:
x has shape (C_in, H, W) (single image, no batch dim).
K has shape (C_out, C_in, kH, kW).
y has shape (C_out, H - kH + 1, W - kW + 1) and y[c_out, i, j] = sum_{c_in, di, dj} K[c_out, c_in, di, dj] * x[c_in, i+di, j+dj].
Implement:
conv2d_forward(x, K) -> y
conv2d_backward(x, K) -> dL/dx of shape (C_in, H, W)
The backward of a valid cross-correlation w.r.t. the input is a full convolution of the upstream gradient with the filter (flipped in spatial dims). With L = sum(y) so dL/dy = ones(C_out, H_out, W_out),
dL/dx[c_in, i, j] = sum over (c_out, di, dj) of K[c_out, c_in, di, dj]
for every (di, dj) such that the output position (i - di, j - dj) is in range.
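The chain-rule step behind this formula can be written out as follows. This is a sketch that uses o and c as shorthand for c_out and c_in, and (p, q) for an output position; these names are introduced here for brevity and are not part of the problem statement. H_out = H - kH + 1 and W_out = W - kW + 1.

```latex
\begin{aligned}
\frac{\partial y[o,p,q]}{\partial x[c,i,j]}
  &= \begin{cases}
       K[o,c,\,i-p,\,j-q] & \text{if } 0 \le i-p < kH \text{ and } 0 \le j-q < kW,\\
       0 & \text{otherwise,}
     \end{cases}\\[4pt]
\frac{\partial L}{\partial x[c,i,j]}
  &= \sum_{o,p,q} \frac{\partial L}{\partial y[o,p,q]}\,
     \frac{\partial y[o,p,q]}{\partial x[c,i,j]}
   = \sum_{o} \sum_{\substack{0 \le di < kH,\ 0 \le dj < kW\\ 0 \le i-di < H_{\text{out}}\\ 0 \le j-dj < W_{\text{out}}}}
     \frac{\partial L}{\partial y[o,\,i-di,\,j-dj]}\; K[o,c,di,dj].
\end{aligned}
```

The second line substitutes di = i - p and dj = j - q. Setting every dL/dy entry to 1 leaves the sum of K[c_out, c_in, di, dj] over the in-range (di, dj).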
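For an interior pixel, one with kH - 1 <= i <= H - kH and kW - 1 <= j <= W - kW, every (di, dj) contributes, so dL/dx[c_in, i, j] = sum_{c_out} sum_{di, dj} K[c_out, c_in, di, dj], a constant independent of (i, j). Pixels closer to an edge sum over only the (di, dj) whose output position (i - di, j - dj) is in range, so their gradient is a partial sum of the same terms.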
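One possible NumPy sketch of both functions is shown below. It is not the site's reference solution. It assumes float inputs and loops over the kernel taps (di, dj) rather than over pixels. The helper name k_sum is introduced here for illustration.

```python
import numpy as np


def conv2d_forward(x, K):
    """Valid cross-correlation, stride 1. x: (C_in, H, W), K: (C_out, C_in, kH, kW)."""
    C_out, C_in, kH, kW = K.shape
    _, H, W = x.shape
    H_out, W_out = H - kH + 1, W - kW + 1
    y = np.zeros((C_out, H_out, W_out), dtype=np.result_type(x, K))
    for di in range(kH):
        for dj in range(kW):
            # Input patch that lines up with kernel tap (di, dj) at every output position.
            patch = x[:, di:di + H_out, dj:dj + W_out]  # (C_in, H_out, W_out)
            y += np.einsum("oc,cij->oij", K[:, :, di, dj], patch)
    return y


def conv2d_backward(x, K):
    """dL/dx for L = sum(conv2d_forward(x, K)); same shape as x."""
    C_out, C_in, kH, kW = K.shape
    _, H, W = x.shape
    H_out, W_out = H - kH + 1, W - kW + 1
    # With dL/dy = ones, the sum over c_out can be taken once up front.
    k_sum = K.sum(axis=0)  # (C_in, kH, kW)
    dx = np.zeros((C_in, H, W), dtype=np.result_type(x, K))
    for di in range(kH):
        for dj in range(kW):
            # Tap (di, dj) reaches inputs with 0 <= i - di < H_out and 0 <= j - dj < W_out.
            dx[:, di:di + H_out, dj:dj + W_out] += k_sum[:, di, dj][:, None, None]
    return dx
```

Because L is linear in x, dL/dx[c_in, i, j] equals L evaluated on an input that is 1 at (c_in, i, j) and 0 elsewhere. That gives a convenient exact check of conv2d_backward against conv2d_forward.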
import numpy as np

def conv2d_forward(x, K):
    # x: (C_in, H, W), K: (C_out, C_in, kH, kW); return y of shape (C_out, H - kH + 1, W - kW + 1)
    pass