TorchedUp

9. Adam Optimizer Step

Medium

Implement a single Adam optimizer parameter update.

Signature: def adam_step(theta: np.ndarray, grad: np.ndarray, m: np.ndarray, v: np.ndarray, t: int, lr: float = 0.01, beta1: float = 0.9, beta2: float = 0.999, eps: float = 1e-8) -> np.ndarray

Return the updated theta_new after one step.

  • Update biased first moment: m = beta1*m + (1-beta1)*grad
  • Update biased second moment: v = beta2*v + (1-beta2)*grad^2
  • Bias correction: m_hat = m/(1-beta1^t), v_hat = v/(1-beta2^t)
  • Parameter update: theta -= lr * m_hat / (sqrt(v_hat) + eps)
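The four steps above can be sketched in NumPy as follows. This is a minimal sketch, not a reference solution. It assumes `t` is a 1-based step count (at `t = 0` the bias correction would divide by zero) and that the caller's `theta`, `m`, and `v` arrays should not be mutated. The problem statement does not say whether `m` and `v` are expected to be updated in place, so that choice is an assumption here.

```python
import numpy as np


def adam_step(theta: np.ndarray, grad: np.ndarray, m: np.ndarray, v: np.ndarray,
              t: int, lr: float = 0.01, beta1: float = 0.9,
              beta2: float = 0.999, eps: float = 1e-8) -> np.ndarray:
    # Biased first and second moment estimates (element-wise).
    # Assumption: new arrays are created; the inputs m and v are left untouched.
    m_new = beta1 * m + (1 - beta1) * grad
    v_new = beta2 * v + (1 - beta2) * grad ** 2

    # Bias correction; t is assumed to start at 1.
    m_hat = m_new / (1 - beta1 ** t)
    v_hat = v_new / (1 - beta2 ** t)

    # Parameter update, returned as a new array.
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)


# Illustrative first step with zero-initialized moments (values are made up).
theta = np.array([1.0, -2.0])
grad = np.array([0.5, -0.25])
theta_new = adam_step(theta, grad, np.zeros(2), np.zeros(2), t=1)
print(theta_new)
```

Because every operation is element-wise, the same function should work unchanged for a 2D `(N, D)` parameter matrix, provided `theta`, `grad`, `m`, and `v` share a shape.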

Math

$$m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g^2$$
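For completeness, the bias correction and parameter update from the step list can be written in the same notation. Here $\eta$ stands for `lr` and $\epsilon$ for `eps`; these symbol choices are an assumption, since the page only gives the moment equations. The second block works through the first step ($t = 1$) under the assumption that the moments start at zero, $m_0 = v_0 = 0$.

```latex
\hat{m}_t = \frac{m_t}{1-\beta_1^{t}}, \qquad
\hat{v}_t = \frac{v_t}{1-\beta_2^{t}}, \qquad
\theta_t = \theta_{t-1} - \eta\,\frac{\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon}

% First step, assuming m_0 = v_0 = 0 and t = 1:
m_1 = (1-\beta_1)\,g \;\Rightarrow\; \hat{m}_1 = g, \qquad
v_1 = (1-\beta_2)\,g^2 \;\Rightarrow\; \hat{v}_1 = g^2

\theta_1 = \theta_0 - \eta\,\frac{g}{|g|+\epsilon}
\;\approx\; \theta_0 - \eta\,\operatorname{sign}(g)
\quad \text{when } |g| \gg \epsilon
```

Under these assumptions, each parameter moves by roughly `lr` in the direction opposite its gradient on the first step, regardless of the gradient's magnitude.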


Test Results

  • first step
  • multi-param first step
  • second step with momentum
  • 2D (N, D) parameter matrix: first step