TorchedUp

150. LoRA Backward

Hard

The LoRA branch routes x through a down-projection A and up-projection B with a constant alpha/r scale (see the math reference). Compute the gradients of the loss w.r.t. A and B (the only trainable matrices — W is frozen).

Signature: def lora_backward(x: np.ndarray, A: np.ndarray, B: np.ndarray, dL_dy: np.ndarray, alpha: float, r: int) -> tuple

Shapes:

  • x: (batch, in)
  • A: (r, in)
  • B: (out, r)
  • dL_dy: (batch, out) — upstream gradient w.r.t. the LoRA output

Returns: (dA, dB) with shapes (r, in) and (out, r) (each gradient has the same shape as its parameter).
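The problem defers the forward pass to the math reference, which is not reproduced here. As a minimal sketch, assuming the usual LoRA branch y = (alpha/r) · (x Aᵀ) Bᵀ, which is the form consistent with the shapes listed above, the forward computation might look like this. The helper name lora_forward and the sample sizes are illustrative and not part of the problem statement.

```python
import numpy as np


def lora_forward(x, A, B, alpha, r):
    """Hypothetical helper: assumed LoRA branch output, y = (alpha / r) * (x A^T) B^T."""
    h = x @ A.T                     # down-projection, shape (batch, r)
    return (alpha / r) * (h @ B.T)  # up-projection and scaling, shape (batch, out)


# Illustrative sizes: batch=4, in=6, r=2, out=3.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6))
A = rng.standard_normal((2, 6))
B = rng.standard_normal((3, 2))

y = lora_forward(x, A, B, alpha=16.0, r=2)
print(y.shape)  # (4, 3)
```

The intermediate h has shape (batch, r), which is why dL_dy must have shape (batch, out) to match y.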

Math

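$$\frac{\partial L}{\partial B} = \frac{\alpha}{r}\left(\frac{\partial L}{\partial y}\right)^{\top} h, \qquad \frac{\partial L}{\partial A} = \frac{\alpha}{r}\left(\frac{\partial L}{\partial y}\, B\right)^{\top} x$$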
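The two formulas above translate almost line for line into NumPy. The sketch below is one possible implementation, not the site's reference solution. It assumes h = x Aᵀ (the down-projected activation, shape (batch, r)), since h is not defined in this excerpt. The finite-difference check and its helper numeric_grad are illustrative additions; they use the surrogate loss L = sum(y * dL_dy), whose gradient with respect to y is exactly dL_dy.

```python
import numpy as np


def lora_backward(x, A, B, dL_dy, alpha, r):
    """Sketch of the LoRA gradients, assuming h = x @ A.T."""
    scale = alpha / r
    h = x @ A.T                        # (batch, r)
    dB = scale * (dL_dy.T @ h)         # (out, batch) @ (batch, r) -> (out, r)
    dA = scale * ((dL_dy @ B).T @ x)   # (r, batch) @ (batch, in) -> (r, in)
    return dA, dB


def numeric_grad(loss_fn, P, eps=1e-6):
    """Hypothetical helper: central-difference gradient of loss_fn w.r.t. P (perturbed in place)."""
    g = np.zeros_like(P)
    for idx in np.ndindex(P.shape):
        old = P[idx]
        P[idx] = old + eps
        plus = loss_fn()
        P[idx] = old - eps
        minus = loss_fn()
        P[idx] = old                   # restore the original entry
        g[idx] = (plus - minus) / (2 * eps)
    return g


# Illustrative sizes: batch=4, in=6, r=2, out=3.
rng = np.random.default_rng(0)
alpha, r = 16.0, 2
x = rng.standard_normal((4, 6))
A = rng.standard_normal((r, 6))
B = rng.standard_normal((3, r))
dL_dy = rng.standard_normal((4, 3))


def loss():
    # Surrogate loss whose gradient w.r.t. y equals dL_dy.
    return np.sum((alpha / r) * (x @ A.T @ B.T) * dL_dy)


dA, dB = lora_backward(x, A, B, dL_dy, alpha, r)
dA_num = numeric_grad(loss, A)
dB_num = numeric_grad(loss, B)
print(np.allclose(dA, dA_num, atol=1e-5), np.allclose(dB, dB_num, atol=1e-5))
```

Because the assumed output is linear in A and linear in B separately, the central difference is exact up to rounding, which makes it a convenient sanity check for the shapes and the transpose placement.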

NumPy

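import numpy as np


def lora_backward(x: np.ndarray, A: np.ndarray, B: np.ndarray, dL_dy: np.ndarray, alpha: float, r: int) -> tuple:
    # Starter stub: should return (dA, dB) with shapes (r, in) and (out, r).
    pass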
