Top-k sampling restricts token selection to the k most probable tokens, then samples from their renormalized distribution. This prevents sampling from very unlikely tokens while maintaining diversity (unlike greedy decoding).
Steps:
1. Find the k largest logits.
2. Exclude all other tokens (equivalently, set their logits to -inf).
3. Apply softmax over the top-k logits to renormalize them into a probability distribution.
4. Return the full (vocab_size,) distribution, with zeros at non-top-k positions.
Signature: def top_k_sampling(logits, k)
Parameters:
- logits: (vocab_size,) — unnormalized scores
- k: int — number of top tokens to keep

Returns: (vocab_size,) — renormalized probability distribution (zeros for non-top-k)

Math:
p_i = exp(logit_i) / Σ_{j ∈ top-k} exp(logit_j) if i is in the top-k, else p_i = 0
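A minimal NumPy sketch matching the signature above (the numerical-stability shift and the use of `argpartition` are implementation choices, not requirements of the problem):

```python
import numpy as np

def top_k_sampling(logits, k):
    """Return a (vocab_size,) distribution renormalized over the top-k logits."""
    logits = np.asarray(logits, dtype=float)
    # Indices of the k largest logits (argpartition avoids a full sort).
    top_idx = np.argpartition(logits, -k)[-k:]
    # Numerically stable softmax over only the top-k logits.
    top_logits = logits[top_idx]
    exp = np.exp(top_logits - top_logits.max())
    # Zeros everywhere except the top-k positions.
    probs = np.zeros_like(logits)
    probs[top_idx] = exp / exp.sum()
    return probs
```

To actually draw a token, sample from the returned distribution, e.g. `np.random.default_rng().choice(len(probs), p=probs)`. For `logits = [1.0, 2.0, 3.0, 4.0]` and `k = 2`, only indices 2 and 3 get mass (≈ 0.269 and ≈ 0.731).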