Demonstrate batch normalization in training mode using torch.nn.BatchNorm1d.
Signature: def batchnorm_train_forward(x)
x: training batch of shape (N, C), given as a nested list.
Steps (a sketch implementation follows the math below):
1. Create torch.nn.BatchNorm1d(num_features=C) with default weight=1, bias=0.
2. Call bn.train() to set training mode.
3. Pass x through the BatchNorm layer.
4. Return the output via .detach().tolist().
BatchNorm in train mode normalizes each feature across the batch:
y = (x - batch_mean) / sqrt(batch_var + eps)
where batch_mean and batch_var are computed per-feature over the batch dimension. With default weight=1 and bias=0, the output has approximately zero mean and unit variance per feature.
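Putting the steps together, here is a minimal sketch of batchnorm_train_forward, followed by a check against the formula above. The sample batch is made up for illustration; 1e-5 is BatchNorm1d's default eps.

```python
import torch

def batchnorm_train_forward(x):
    t = torch.tensor(x, dtype=torch.float32)             # x: nested list of shape (N, C)
    bn = torch.nn.BatchNorm1d(num_features=t.shape[1])   # affine defaults: weight=1, bias=0
    bn.train()                                           # training mode: normalize with batch stats
    return bn(t).detach().tolist()

# Check against y = (x - batch_mean) / sqrt(batch_var + eps)
x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]  # made-up batch, N=4, C=2
t = torch.tensor(x)
mean = t.mean(dim=0)
var = t.var(dim=0, unbiased=False)                   # train-mode normalization uses biased variance
manual = (t - mean) / torch.sqrt(var + 1e-5)         # 1e-5 is BatchNorm1d's default eps
print(torch.allclose(torch.tensor(batchnorm_train_forward(x)), manual, atol=1e-6))  # True
```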
Key insight: In eval mode, BatchNorm uses running statistics accumulated during training rather than the current batch's statistics; this is why train vs eval mode matters.
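A short sketch of that difference, using a made-up batch: after a single training-mode pass, the running statistics have only moved slightly from their initial values (mean 0, variance 1, default momentum 0.1), so eval-mode output on the same data differs from train-mode output.

```python
import torch

torch.manual_seed(0)
bn = torch.nn.BatchNorm1d(num_features=2)
batch = torch.randn(8, 2) * 5 + 3    # deliberately far from zero mean / unit variance

bn.train()
y_train = bn(batch)                  # normalized with this batch's own statistics; also
                                     # updates running_mean/running_var (momentum=0.1)
bn.eval()
y_eval = bn(batch)                   # normalized with the accumulated running statistics

print(y_train.mean(dim=0))           # ~0 per feature
print(y_eval.mean(dim=0))            # far from 0: running stats are still near their init values
```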