Given n total samples drawn from a model, c of which are correct, compute the unbiased estimator of pass@k: the probability that at least one of k randomly chosen samples is correct.
Signature: def pass_at_k(n: int, c: int, k: int) -> float
This is the standard estimator from the HumanEval / Codex paper: 1 minus the probability that a random size-k subset of the n samples contains no correct sample. See the math reference below for the closed form. Implement it in a way that stays numerically stable for large n (binomial coefficients overflow fixed-width numeric types if computed directly), and handle the degenerate case where there are fewer than k incorrect samples.
Math
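For reference, the estimator described above is commonly written as follows, with n, c, and k as in the signature. The product form on the right is an equivalent rewriting that avoids evaluating either binomial coefficient directly.

```latex
\[
\text{pass@}k
  \;=\; 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}}
  \;=\; 1 - \prod_{i=n-c+1}^{n} \left(1 - \frac{k}{i}\right)
\]
```

When n - c < k, the binomial coefficient in the numerator is zero (no size-k subset can consist entirely of incorrect samples), so pass@k = 1.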
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    # TODO: return the unbiased pass@k estimate
    pass
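One possible way to fill in the stub is sketched below. It relies on the identity C(n - c, k) / C(n, k) = prod over i from n - c + 1 to n of (1 - k / i), so no binomial coefficient is ever computed. Two choices here are assumptions rather than part of the problem statement: the sketch uses a plain Python loop instead of NumPy, and it raises ValueError for inputs outside 0 <= c <= n and 1 <= k <= n.

```python
def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k)."""
    # Assumption: inputs outside these ranges are rejected; the problem
    # statement does not say how they should be handled.
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("expected 0 <= c <= n and 1 <= k <= n")

    # Degenerate case: fewer than k incorrect samples, so every size-k
    # subset must contain at least one correct sample.
    if n - c < k:
        return 1.0

    # C(n - c, k) / C(n, k) == prod_{i = n - c + 1}^{n} (1 - k / i).
    # Each factor lies in (0, 1), so the running product cannot overflow.
    prob_all_incorrect = 1.0
    for i in range(n - c + 1, n + 1):
        prob_all_incorrect *= 1.0 - k / i

    return 1.0 - prob_all_incorrect
```

The loop runs c times, so the cost depends on the number of correct samples rather than on k. With c = 0 the loop body never executes and the estimate is 0.0.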