BM25 Scoring

Difficulty: Medium

Implement the BM25 ranking function used widely in lexical retrieval (e.g., Elasticsearch, Lucene).

Signature: def bm25_score(query_terms: list, doc_terms: list, idf: dict, avg_doc_len: float, k1: float = 1.5, b: float = 0.75) -> float

For each query term that appears in the document:

  • tf = number of times the term appears in the document
  • contribution = idf[term] * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * len(doc_terms) / avg_doc_len))

Sum contributions across query terms. Skip query terms missing from idf.

Constraints: Use only the Python standard library (math and collections are allowed); no third-party packages.
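
The steps above can be sketched as follows. This is one possible solution under stated assumptions, not a reference implementation: it assumes avg_doc_len is greater than zero, and it assumes a query term repeated in query_terms contributes once per occurrence, since the problem statement does not say otherwise.

```python
from collections import Counter


def bm25_score(query_terms: list, doc_terms: list, idf: dict,
               avg_doc_len: float, k1: float = 1.5, b: float = 0.75) -> float:
    # Count term frequencies once instead of rescanning the document per term.
    tf_counts = Counter(doc_terms)

    # Length normalization is the same for every term in this document.
    # Assumption: avg_doc_len > 0 (not guarded here).
    length_norm = 1 - b + b * len(doc_terms) / avg_doc_len

    score = 0.0
    for term in query_terms:
        tf = tf_counts.get(term, 0)
        # Skip terms absent from the document or missing from idf.
        if tf == 0 or term not in idf:
            continue
        score += idf[term] * (tf * (k1 + 1)) / (tf + k1 * length_norm)
    return score


doc = ["cat", "sat", "cat", "mat"]
# "dog" is skipped: it is in neither the document nor idf.
print(bm25_score(["cat", "dog"], doc, {"cat": 2.0}, avg_doc_len=4.0))  # ≈ 2.857
```

In the usage line the document length equals avg_doc_len, so the normalization factor is 1 and the score reduces to 2.0 * (2 * 2.5) / (2 + 1.5). A document longer than average with the same tf makes the denominator larger and therefore scores lower.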

Test cases:

  • single query term present
  • missing terms skipped
  • long doc penalty