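Implement depthwise separable convolution as used in MobileNet. This factorizes a standard convolution into two cheaper operations:
Depthwise: one kH×kW spatial filter applied to each input channel independently.
Pointwise: a 1×1 convolution that mixes channels, mapping C_in to C_out.
This reduces computation by a factor of k²·C_out / (k² + C_out) compared to a full convolution, which is roughly k² (kernel size squared) when C_out is much larger than k².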
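The k² estimate can be sanity-checked by counting multiply-accumulates for both variants. The sketch below is illustrative only: the helper names conv_macs and separable_macs and the layer sizes are assumptions made for this example, not part of the exercise.

```python
def conv_macs(h_out, w_out, k, c_in, c_out):
    """Multiply-accumulates for a standard k x k convolution."""
    return h_out * w_out * k * k * c_in * c_out


def separable_macs(h_out, w_out, k, c_in, c_out):
    """Multiply-accumulates for depthwise (k x k) plus pointwise (1 x 1)."""
    depthwise = h_out * w_out * k * k * c_in
    pointwise = h_out * w_out * c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 32x32 output, 3x3 kernel, 64 -> 128 channels.
ratio = conv_macs(32, 32, 3, 64, 128) / separable_macs(32, 32, 3, 64, 128)
print(round(ratio, 2))  # 8.41, close to k**2 = 9
```

The ratio depends only on k and C_out, so the saving approaches k² as the number of output channels grows.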
Signature: def depthwise_separable_conv(x, dw_kernel, pw_kernel)
Inputs:
x: (H, W, C_in)
dw_kernel: (kH, kW, C_in), one spatial filter per input channel (no C_out axis)
pw_kernel: (C_out, C_in), 1×1 pointwise weights
Returns: (H_out, W_out, C_out) where H_out = H - kH + 1 and W_out = W - kW + 1 (stride 1, no padding)
Step 1 (depthwise): For each channel c and output position (i, j):
dw_out[i, j, c] = sum over (kh, kw) of x[i+kh, j+kw, c] * dw_kernel[kh, kw, c]
Step 2 (pointwise): Flatten the spatial positions into rows and matrix-multiply by the transpose of pw_kernel:
out = dw_out.reshape(-1, C_in) @ pw_kernel.T → reshape to (H_out, W_out, C_out)
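The two steps above can be sketched in NumPy as follows. This is one possible implementation, assuming stride 1, no padding, and no kernel flip (cross-correlation, as in the Step 1 formula). Casting the inputs to float and looping over kernel offsets rather than output positions are choices made for this sketch, not requirements of the exercise.

```python
import numpy as np


def depthwise_separable_conv(x, dw_kernel, pw_kernel):
    """Depthwise separable convolution with stride 1 and no padding.

    x: (H, W, C_in), dw_kernel: (kH, kW, C_in), pw_kernel: (C_out, C_in).
    Returns an array of shape (H - kH + 1, W - kW + 1, C_out).
    """
    # Assumption: work in floating point regardless of the input dtype.
    x = np.asarray(x, dtype=float)
    dw_kernel = np.asarray(dw_kernel, dtype=float)
    pw_kernel = np.asarray(pw_kernel, dtype=float)

    H, W, C_in = x.shape
    kH, kW, _ = dw_kernel.shape
    C_out = pw_kernel.shape[0]
    H_out, W_out = H - kH + 1, W - kW + 1

    # Step 1 (depthwise): accumulate one shifted window per kernel offset.
    # dw_kernel[kh, kw, :] has shape (C_in,) and broadcasts over positions,
    # so each channel is filtered only by its own spatial kernel.
    dw_out = np.zeros((H_out, W_out, C_in))
    for kh in range(kH):
        for kw in range(kW):
            dw_out += x[kh:kh + H_out, kw:kw + W_out, :] * dw_kernel[kh, kw, :]

    # Step 2 (pointwise): mix channels with a 1x1 convolution, written as
    # a matrix product over the flattened spatial positions.
    out = dw_out.reshape(-1, C_in) @ pw_kernel.T
    return out.reshape(H_out, W_out, C_out)


# Usage: all-ones inputs make the result easy to verify by hand.
x = np.ones((5, 5, 2))
dw_kernel = np.ones((3, 3, 2))
pw_kernel = np.ones((4, 2))
out = depthwise_separable_conv(x, dw_kernel, pw_kernel)
print(out.shape)     # (3, 3, 4)
print(out[0, 0, 0])  # 18.0 = 9 taps per channel * 2 input channels
```

Looping over the kH × kW kernel offsets keeps the Python-level loop short, since each iteration processes every output position and channel at once.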