Implement the LoRA (Low-Rank Adaptation) forward pass. Instead of training the full weight matrix W, LoRA freezes W and learns a low-rank update B @ A (where A and B are skinny matrices of rank r).
Signature: def lora_forward(x: np.ndarray, W: np.ndarray, A: np.ndarray, B: np.ndarray, alpha: float, r: int) -> np.ndarray
Shapes:
x: (batch, in) — input activations
W: (out, in) — frozen base weight
A: (r, in) — down-projection
B: (out, r) — up-projection
Returns: x @ W.T + (alpha / r) * (x @ A.T @ B.T) — shape (batch, out).
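A minimal reference sketch of the forward pass, assuming the shapes above (the function name and parameter order follow the signature given; the comments are illustrative, not part of the spec):

```python
import numpy as np

def lora_forward(x: np.ndarray, W: np.ndarray, A: np.ndarray, B: np.ndarray,
                 alpha: float, r: int) -> np.ndarray:
    # Frozen base projection: (batch, in) @ (in, out) -> (batch, out)
    base = x @ W.T
    # Low-rank path: project down to rank r, then back up to out
    # (batch, in) @ (in, r) -> (batch, r), then (batch, r) @ (r, out) -> (batch, out)
    lora = (x @ A.T) @ B.T
    # Scale the low-rank update by alpha / r and add it to the frozen output
    return base + (alpha / r) * lora
```

Note the parenthesization: computing (x @ A.T) first keeps the intermediate at width r, so the full (out, in) update matrix B @ A is never materialized.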