TorchedUp
Difficulty: Medium

Depthwise Separable Convolution

Implement depthwise separable convolution as used in MobileNet. This factorizes a standard convolution into two cheaper operations:

  1. Depthwise convolution: apply one filter per input channel (no cross-channel mixing). Each channel is convolved independently.
  2. Pointwise convolution: a 1×1 convolution that mixes channels, projecting from C_in to C_out.

For a k×k kernel, the separable form costs about 1/C_out + 1/k² of a full convolution, so the savings approach a factor of k² (kernel size squared) when C_out is much larger than k².
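To make the cost comparison concrete, here is a minimal sketch that counts multiply-accumulate operations for both forms. The helper name conv_macs and the layer sizes are hypothetical, chosen only for illustration; biases and memory traffic are ignored.

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """Return (standard, separable) multiply-accumulate counts for a k x k kernel."""
    positions = h_out * w_out
    standard = positions * c_in * c_out * k * k
    depthwise = positions * c_in * k * k   # one k x k filter per input channel
    pointwise = positions * c_in * c_out   # 1x1 convolution that mixes channels
    return standard, depthwise + pointwise


# Hypothetical layer sizes, not taken from the problem statement.
standard, separable = conv_macs(h_out=56, w_out=56, c_in=64, c_out=128, k=3)
ratio = standard / separable  # equals 1 / (1/c_out + 1/k**2), about 8.4 here
print(standard, separable, round(ratio, 2))
```

With 128 output channels the saving is about 8.4x rather than the full 9x, because the pointwise step still costs C_in * C_out per output position.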

Signature: def depthwise_separable_conv(x, dw_kernel, pw_kernel)

  • x: (H, W, C_in)
  • dw_kernel: (kH, kW, C_in) — one spatial filter per input channel (no C_out axis)
  • pw_kernel: (C_out, C_in) — 1×1 pointwise weights
  • Returns: (H_out, W_out, C_out), where H_out = H - kH + 1 and W_out = W - kW + 1 (stride 1, no padding)

Step 1 — depthwise: For each channel c and output position (i, j):

dw_out[i, j, c] = sum over (kh, kw) of x[i+kh, j+kw, c] * dw_kernel[kh, kw, c]
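The depthwise formula above could be sketched in numpy as follows. This is one possible approach, not necessarily the reference solution: it loops over the kernel offsets (kh, kw) and accumulates shifted slices of x, so every output position and channel is handled at once. The helper name depthwise_conv is an assumption.

```python
import numpy as np


def depthwise_conv(x, dw_kernel):
    """Depthwise step: x is (H, W, C_in), dw_kernel is (kH, kW, C_in)."""
    x = np.asarray(x)
    dw_kernel = np.asarray(dw_kernel)
    H, W, C_in = x.shape
    kH, kW, _ = dw_kernel.shape
    H_out, W_out = H - kH + 1, W - kW + 1

    dw_out = np.zeros((H_out, W_out, C_in), dtype=np.result_type(x, dw_kernel))
    for kh in range(kH):
        for kw in range(kW):
            # x[i+kh, j+kw, c] * dw_kernel[kh, kw, c] for all (i, j, c) at once;
            # the kernel row of shape (C_in,) broadcasts over the spatial axes.
            dw_out += x[kh:kh + H_out, kw:kw + W_out, :] * dw_kernel[kh, kw, :]
    return dw_out


x = np.arange(32, dtype=float).reshape(4, 4, 2)
dw_out = depthwise_conv(x, np.ones((3, 3, 2)))
print(dw_out.shape)  # (2, 2, 2)
```

Looping over the kH * kW kernel offsets instead of over output positions keeps the Python-level loop short, since kernels are usually much smaller than the input.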

Step 2 — pointwise: Matrix-multiply spatial positions by pw_kernel:

out = dw_out.reshape(-1, C_in) @ pw_kernel.T  →  reshape to (H_out, W_out, C_out)
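The pointwise step on its own might look like the sketch below. The helper name pointwise_conv and the small example arrays are assumptions made for illustration.

```python
import numpy as np


def pointwise_conv(dw_out, pw_kernel):
    """Pointwise step: dw_out is (H_out, W_out, C_in), pw_kernel is (C_out, C_in)."""
    H_out, W_out, C_in = dw_out.shape
    C_out = pw_kernel.shape[0]
    flat = dw_out.reshape(-1, C_in)   # one row per spatial position
    mixed = flat @ pw_kernel.T        # (H_out * W_out, C_out)
    return mixed.reshape(H_out, W_out, C_out)


# Illustrative values: two input channels mapped to three output channels.
dw_out = np.arange(8, dtype=float).reshape(2, 2, 2)
pw_kernel = np.array([[1.0, 0.0],   # copies channel 0
                      [0.0, 1.0],   # copies channel 1
                      [1.0, 1.0]])  # sums both channels
out = pointwise_conv(dw_out, pw_kernel)
print(out[1, 1])  # [ 6.  7. 13.]
```

The same result can be written as np.einsum('ijc,oc->ijo', dw_out, pw_kernel), which avoids the explicit reshapes.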

Test Results

  • delta dw kernel + identity pw: extracts center values (should equal the center of each 3x3 window)
  • all-ones 3x3 dw kernel sums the entire 3x3 patch
  • random 4x4x2 input, 3x3 dw kernel, 3 output channels (seed=42)
  • two-channel input, 1x1 dw kernel, pointwise mixes channels
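The first two cases can be reproduced locally with a sketch like the one below. It is an assumption-laden illustration, not the site's hidden test code: the 5x5x2 random input and the identity pointwise kernel in the all-ones case are choices made here, and the implementation uses sliding_window_view (NumPy 1.20 or later) with einsum as one compact way to write both steps.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def depthwise_separable_conv(x, dw_kernel, pw_kernel):
    kH, kW, _ = dw_kernel.shape
    # windows has shape (H_out, W_out, C_in, kH, kW)
    windows = sliding_window_view(x, (kH, kW), axis=(0, 1))
    # Depthwise: weight each window by its channel's filter and sum over (kH, kW).
    dw_out = np.einsum('ijchw,hwc->ijc', windows, dw_kernel)
    # Pointwise: mix channels with the (C_out, C_in) weights.
    return np.einsum('ijc,oc->ijo', dw_out, pw_kernel)


rng = np.random.default_rng(0)          # arbitrary seed, not the site's
x = rng.standard_normal((5, 5, 2))      # assumed input size
identity_pw = np.eye(2)

# Delta kernel: 1 at the center of each 3x3 filter, 0 elsewhere.
delta = np.zeros((3, 3, 2))
delta[1, 1, :] = 1.0
out_delta = depthwise_separable_conv(x, delta, identity_pw)
assert np.allclose(out_delta, x[1:-1, 1:-1, :])

# All-ones kernel: each output is the sum of its 3x3 patch, per channel.
out_ones = depthwise_separable_conv(x, np.ones((3, 3, 2)), identity_pw)
assert np.allclose(out_ones[0, 0], x[0:3, 0:3].sum(axis=(0, 1)))
```

Checking against slices and patch sums of x avoids hard-coding expected numbers, so the same checks work for any input.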