The LoRA branch routes x through a down-projection A and up-projection B with a constant alpha/r scale (see the math reference). Compute the gradients of the loss w.r.t. A and B (the only trainable matrices — W is frozen).
Signature: def lora_backward(x: np.ndarray, A: np.ndarray, B: np.ndarray, dL_dy: np.ndarray, alpha: float, r: int) -> tuple
Shapes:
x: (batch, in)A: (r, in)B: (out, r)dL_dy: (batch, out) — upstream gradient w.r.t. the LoRA outputReturns: (dA, dB) with shapes (r, in) and (out, r) (each gradient has the same shape as its parameter).
Math
Asked at
import numpy as np
def lora_backward(...):
pass
Premium problem
Free accounts include problems #1–20. Upgrade to unlock the editor, hidden test cases, and reference solutions for every problem.
Already premium?