Implement gradient verification using central finite differences. This is the single most important debugging technique in ML — when you implement a custom backward pass, you should always compare it to a numerical estimate to confirm correctness.
Signature: def gradient_check(f, x, analytic_grad, h=1e-5, tol=1e-7) -> bool
Method: For each component i, perturb only that component:
f_plus  = f(x + h·eᵢ)   (perturb up)
f_minus = f(x − h·eᵢ)   (perturb down)
numerical_grad[i] = (f_plus − f_minus) / (2h)
Return True iff every component of analytic_grad is within tol of the corresponding numerical estimate.
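A minimal sketch of this procedure, assuming x is a NumPy array and f returns a scalar (everything beyond the given signature is illustrative):

```python
import numpy as np

def gradient_check(f, x, analytic_grad, h=1e-5, tol=1e-7) -> bool:
    """Compare analytic_grad against a central-difference estimate of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    analytic_grad = np.asarray(analytic_grad, dtype=float)
    numerical_grad = np.zeros_like(x)

    # Visit every index so arrays of any shape are handled.
    it = np.nditer(x, flags=["multi_index"])
    while not it.finished:
        idx = it.multi_index
        # Perturb copies so the caller's x is never mutated.
        x_plus = x.copy()
        x_plus[idx] += h
        x_minus = x.copy()
        x_minus[idx] -= h
        numerical_grad[idx] = (f(x_plus) - f(x_minus)) / (2 * h)
        it.iternext()

    # Pass only if every component agrees within tol.
    return bool(np.all(np.abs(analytic_grad - numerical_grad) <= tol))
```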
Why central differences? Forward differences (f(x+h) − f(x))/h have O(h) error; central differences have O(h²) error. The extra function evaluation is worth it.
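A brief Taylor-expansion sketch of those error orders, written for a scalar argument:

```latex
f(x \pm h) = f(x) \pm h\,f'(x) + \tfrac{h^{2}}{2} f''(x) \pm \tfrac{h^{3}}{6} f'''(x) + O(h^{4})

\frac{f(x+h) - f(x)}{h} = f'(x) + \tfrac{h}{2} f''(x) + O(h^{2})
\qquad \text{(forward difference, error } O(h)\text{)}

\frac{f(x+h) - f(x-h)}{2h} = f'(x) + \tfrac{h^{2}}{6} f'''(x) + O(h^{4})
\qquad \text{(central difference, error } O(h^{2})\text{)}
```

The even-order terms cancel in the symmetric difference, which is why the leading error drops from O(h) to O(h²).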
Pitfall: Don't accidentally mutate x in-place when perturbing — copy it first.
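A quick hypothetical check of the sketch above, using f(x) = Σᵢ xᵢ², whose analytic gradient is 2x:

```python
import numpy as np

def f(x):
    return float(np.sum(x ** 2))  # scalar-valued test function

x = np.array([1.0, -2.0, 3.0])
analytic = 2 * x                               # correct gradient of the sum of squares

print(gradient_check(f, x, analytic))          # expected: True
print(gradient_check(f, x, analytic + 1e-3))   # expected: False, gradient deliberately wrong
print(x)                                       # unchanged, since only copies were perturbed
```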