Implement gradient verification using central finite differences. This is the single most important debugging technique in ML — when you implement a custom backward pass, you should always compare it to a numerical estimate to confirm correctness.
Signature: def gradient_check(f, x, analytic_grad, h=1e-5, tol=1e-7) -> bool
Method: For each component i, perturb only that component:
f_plus  = f(x + h·eᵢ)   (perturb up)
f_minus = f(x − h·eᵢ)   (perturb down)
numerical_grad[i] = (f_plus − f_minus) / (2h)
Return True iff every component of analytic_grad is within tol of the corresponding numerical estimate.
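A minimal sketch of this procedure, assuming x is a NumPy array and f returns a scalar (everything beyond the given signature is illustrative):

```python
import numpy as np

def gradient_check(f, x, analytic_grad, h=1e-5, tol=1e-7) -> bool:
    """Compare analytic_grad against a central-difference estimate of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    analytic_grad = np.asarray(analytic_grad, dtype=float)
    numerical_grad = np.zeros_like(x)

    # Visit every index so arrays of any shape are handled.
    it = np.nditer(x, flags=["multi_index"])
    while not it.finished:
        idx = it.multi_index
        # Perturb copies so the caller's x is never mutated.
        x_plus = x.copy()
        x_plus[idx] += h
        x_minus = x.copy()
        x_minus[idx] -= h
        numerical_grad[idx] = (f(x_plus) - f(x_minus)) / (2 * h)
        it.iternext()

    # Pass only if every component agrees within tol.
    return bool(np.all(np.abs(analytic_grad - numerical_grad) <= tol))
```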
Why central differences? Forward differences (f(x+h) − f(x))/h have O(h) error; central differences have O(h²) error. The extra function evaluation is worth it.
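A brief Taylor-expansion sketch of those error orders, written for a scalar argument:

```latex
f(x \pm h) = f(x) \pm h\,f'(x) + \tfrac{h^{2}}{2} f''(x) \pm \tfrac{h^{3}}{6} f'''(x) + O(h^{4})

\frac{f(x+h) - f(x)}{h} = f'(x) + \tfrac{h}{2} f''(x) + O(h^{2})
\qquad \text{(forward difference, error } O(h)\text{)}

\frac{f(x+h) - f(x-h)}{2h} = f'(x) + \tfrac{h^{2}}{6} f'''(x) + O(h^{4})
\qquad \text{(central difference, error } O(h^{2})\text{)}
```

The even-order terms cancel in the symmetric difference, which is why the leading error drops from O(h) to O(h²).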
Pitfall: Don't accidentally mutate x in-place when perturbing — copy it first.
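A quick hypothetical check of the sketch above, using f(x) = Σᵢ xᵢ², whose analytic gradient is 2x:

```python
import numpy as np

def f(x):
    return float(np.sum(x ** 2))  # scalar-valued test function

x = np.array([1.0, -2.0, 3.0])
analytic = 2 * x                               # correct gradient of the sum of squares

print(gradient_check(f, x, analytic))          # expected: True
print(gradient_check(f, x, analytic + 1e-3))   # expected: False, gradient deliberately wrong
print(x)                                       # unchanged, since only copies were perturbed
```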