Demonstrate batch normalization in training mode using torch.nn.BatchNorm1d.
Signature: def batchnorm_train_forward(x)
x: training batch of shape (N, C), given as a nested list.
Steps (a sketch implementation follows the math below):
1. Create torch.nn.BatchNorm1d(num_features=C) with default weight=1, bias=0.
2. Call bn.train() to set training mode.
3. Pass x through the BatchNorm layer.
4. Return the output via .detach().tolist().
BatchNorm in train mode normalizes each feature across the batch:
y = (x - batch_mean) / sqrt(batch_var + eps)
where batch_mean and batch_var are computed per-feature over the batch dimension. With default weight=1 and bias=0, the output has approximately zero mean and unit variance per feature.
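Putting the steps together, here is a minimal sketch of batchnorm_train_forward, followed by a check against the formula above. The sample batch is made up for illustration; 1e-5 is BatchNorm1d's default eps.

```python
import torch

def batchnorm_train_forward(x):
    t = torch.tensor(x, dtype=torch.float32)             # x: nested list of shape (N, C)
    bn = torch.nn.BatchNorm1d(num_features=t.shape[1])   # affine defaults: weight=1, bias=0
    bn.train()                                           # training mode: normalize with batch stats
    return bn(t).detach().tolist()

# Check against y = (x - batch_mean) / sqrt(batch_var + eps)
x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]  # made-up batch, N=4, C=2
t = torch.tensor(x)
mean = t.mean(dim=0)
var = t.var(dim=0, unbiased=False)                   # train-mode normalization uses biased variance
manual = (t - mean) / torch.sqrt(var + 1e-5)         # 1e-5 is BatchNorm1d's default eps
print(torch.allclose(torch.tensor(batchnorm_train_forward(x)), manual, atol=1e-6))  # True
```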
Key insight: In eval mode, BatchNorm uses running statistics accumulated during training rather than the current batch's statistics; this is why train vs eval mode matters.
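A short sketch of that difference, using a made-up batch: after a single training-mode pass, the running statistics have only moved slightly from their initial values (mean 0, variance 1, default momentum 0.1), so eval-mode output on the same data differs from train-mode output.

```python
import torch

torch.manual_seed(0)
bn = torch.nn.BatchNorm1d(num_features=2)
batch = torch.randn(8, 2) * 5 + 3    # deliberately far from zero mean / unit variance

bn.train()
y_train = bn(batch)                  # normalized with this batch's own statistics; also
                                     # updates running_mean/running_var (momentum=0.1)
bn.eval()
y_eval = bn(batch)                   # normalized with the accumulated running statistics

print(y_train.mean(dim=0))           # ~0 per feature
print(y_eval.mean(dim=0))            # far from 0: running stats are still near their init values
```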