Top-p (nucleus) sampling keeps the smallest set of highest-probability tokens whose cumulative probability reaches or exceeds the threshold p, renormalizes their probabilities, and samples from them. Unlike top-k, which always keeps a fixed number of tokens, top-p adapts: it keeps fewer tokens when the model is confident and more when it is uncertain.
Steps:
Signature: def top_p_sampling(logits, p)
logits: array of shape (vocab_size,), unnormalized scores
p: float, cumulative probability threshold (0 < p ≤ 1)
Returns: array of shape (vocab_size,), renormalized probability distribution
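A minimal sketch of one way to fill in this signature, assuming NumPy and assuming the function returns the filtered, renormalized distribution rather than a sampled token id. It also assumes a token whose cumulative probability lands exactly on p closes the nucleus (so p = 1 keeps the whole vocabulary). These choices are illustrative, not taken from a reference solution.

```python
import numpy as np

def top_p_sampling(logits, p):
    """Return the top-p (nucleus) filtered, renormalized distribution."""
    logits = np.asarray(logits, dtype=np.float64)

    # Numerically stable softmax: subtract the max before exponentiating.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Token indices sorted from most to least probable.
    order = np.argsort(-probs, kind="stable")
    cumulative = np.cumsum(probs[order])

    # Number of tokens needed for the cumulative mass to reach p.
    # Clamp to the vocabulary size in case rounding leaves the total just below p.
    cutoff = min(int(np.searchsorted(cumulative, p, side="left")) + 1, probs.size)

    # Zero out everything outside the nucleus and renormalize.
    filtered = np.zeros_like(probs)
    keep = order[:cutoff]
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

# Example: with probabilities 0.5, 0.3, 0.15, 0.05 and p = 0.7,
# the two most likely tokens form the nucleus.
dist = top_p_sampling(np.log([0.5, 0.3, 0.15, 0.05]), 0.7)

# Drawing a token id from the returned distribution (seed chosen arbitrarily).
rng = np.random.default_rng(0)
token_id = rng.choice(dist.size, p=dist)
```

Returning the distribution keeps the function deterministic and easy to check; the random draw is left to the caller, as in the last two lines.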