Implement the layer normalization forward pass in PyTorch using primitive tensor ops only.
Signature: def layer_norm(x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor, eps: float = 1e-5) -> torch.Tensor
The rule: you may NOT call nn.LayerNorm, F.layer_norm, or any built-in normalization layer. Roll the math yourself with .mean(), .var(), and friends.
Normalize over the last (feature) axis for each sample independently, then apply the learnable affine gamma * x_hat + beta.
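Written out, the computation described above can be sketched as follows. The symbol N (the number of features on the last axis) and the index i are introduced here for illustration, and placing eps inside the square root is an assumption that follows the formula in PyTorch's LayerNorm documentation; the problem statement itself does not say where eps goes.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}}, \qquad
y_i = \gamma_i \, \hat{x}_i + \beta_i
```

The 1/N in the variance is the population form, so with N = 4 features the squared deviations are divided by 4, not 3.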
PyTorch idioms vs NumPy:
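keepdim=True (no 's'). NumPy spells it keepdims=True. If the flag is left out, the default keepdim=False drops the reduced axis and the statistics no longer broadcast correctly against x. A misspelled keyword is not silently ignored: an unrecognized argument raises a TypeError, and recent PyTorch releases accept the NumPy spelling keepdims as an alias on many reductions, though keepdim is the documented name.
dim=-1, not axis=-1.
x.var(dim=-1, keepdim=True, unbiased=False) matches NumPy's np.var (population variance, divide by N). The default unbiased=True divides by N-1 and gives slightly different numbers; use unbiased=False to match LayerNorm's standard convention.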
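The idioms above can be combined into one possible sketch of the forward pass. This is an illustration under stated assumptions, not the reference solution: eps is added inside the square root (as in PyTorch's documented LayerNorm formula), gamma and beta are assumed to have shape (features,), and the sample input values are arbitrary.

```python
import torch


def layer_norm(x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor,
               eps: float = 1e-5) -> torch.Tensor:
    # Per-sample statistics over the last (feature) axis; keepdim=True keeps
    # a trailing size-1 axis so they broadcast against x.
    mean = x.mean(dim=-1, keepdim=True)
    # unbiased=False gives the population variance (divide by N).
    var = x.var(dim=-1, keepdim=True, unbiased=False)
    # Normalize; eps inside the square root is an assumed convention.
    x_hat = (x - mean) / torch.sqrt(var + eps)
    # Learnable affine transform; gamma and beta assumed shape (features,).
    return gamma * x_hat + beta


# Illustrative input: 2 samples, 4 features (arbitrary values).
x = torch.tensor([[1.0, 2.0, 3.0, 4.0],
                  [2.0, 4.0, 6.0, 8.0]])
gamma = torch.ones(4)
beta = torch.zeros(4)
out = layer_norm(x, gamma, beta)

# With gamma=1 and beta=0, each row should have mean ~0 and
# population variance ~1 (slightly below 1 because of eps).
row_mean = out.mean(dim=-1)
row_var = out.var(dim=-1, unbiased=False)

# The default unbiased=True divides by N-1, so for N=4 it is larger
# than the population variance by a factor of N/(N-1) = 4/3.
var_biased = x.var(dim=-1, unbiased=False)
var_unbiased = x.var(dim=-1)
```

Comparing var_biased with var_unbiased on a small input like this is a quick way to see why the unbiased flag matters when the result is checked against a reference implementation.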
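import torch

def layer_norm(x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Your implementation goes here.
    pass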