Implement max pooling — the standard downsampling operation in CNNs. A sliding window takes the maximum value in each non-overlapping region, reducing spatial dimensions while retaining the strongest activations.
Forward pass: For each pooling window of size pool_size × pool_size, output the maximum value across all spatial positions (per channel). Also track which position held the max (the mask) — needed for the backward pass.
Backward pass: Gradients flow only to the position that was the max in the forward pass; all other positions get 0. The mask encodes this routing.
Signature: def maxpool2d(x, pool_size=2, stride=2)
x: (H, W, C) — input feature map, channels-lastpool_size: int — pooling window size (default 2)stride: int — step between windows (default 2)(output, mask) where:
output: (H_out, W_out, C) — max values, where H_out = (H - pool_size) // stride + 1mask: (H, W, C) boolean — True at the position that was max in each windowThe test harness checks the pooled output values. Your mask must correctly mark exactly one True per window per channel.
Math
Asked at
Test Results