Apply the repetition penalty to a logits vector. For each token id that has appeared in prev_tokens:
- if logits[t] > 0: logits[t] /= penalty
- if logits[t] <= 0: logits[t] *= penalty

Either branch lowers the score of already-seen tokens (when penalty > 1): positive logits shrink toward zero, non-positive logits become more negative.
Signature: def apply_repetition_penalty(logits: np.ndarray, prev_tokens: list, penalty: float) -> np.ndarray
Return a new array; do not modify the input. Each unique token in prev_tokens is penalized exactly once (duplicate ids do not stack).
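A minimal sketch of an implementation satisfying the spec above (copy the input, penalize each unique token id once):

```python
import numpy as np

def apply_repetition_penalty(logits: np.ndarray, prev_tokens: list, penalty: float) -> np.ndarray:
    # Work on a float copy so the caller's array is never modified.
    out = logits.astype(float).copy()
    # set() ensures duplicate ids in prev_tokens are penalized only once.
    for t in set(prev_tokens):
        if out[t] > 0:
            out[t] /= penalty  # positive logit: shrink toward zero
        else:
            out[t] *= penalty  # non-positive logit: push further down
    return out
```

For example, with `logits = [2.0, -1.0, 0.5]`, `prev_tokens = [0, 1, 1]`, and `penalty = 2.0`, the result is `[1.0, -2.0, 0.5]`: token 0 is halved, token 1 is doubled downward (once, despite appearing twice), and token 2 is untouched.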