TorchedUp

160. BM25 Scoring

Medium

Implement the BM25 ranking function used widely in lexical retrieval (e.g., Elasticsearch, Lucene).

Signature: def bm25_score(query_terms: list, doc_terms: list, idf: dict, avg_doc_len: float, k1: float = 1.5, b: float = 0.75) -> float

For each query term that appears in the document:

  • tf = number of times the term appears in the document
  • contribution = idf[term] * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * len(doc_terms) / avg_doc_len))

Sum contributions across query terms. Skip query terms missing from idf.
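The steps above can be sketched as follows. This is one possible implementation under stated assumptions, not the site's reference solution: it assumes avg_doc_len is positive, and it assumes a query term that is repeated in query_terms contributes once per repetition, which the problem statement does not specify.

```python
from collections import Counter


def bm25_score(query_terms, doc_terms, idf, avg_doc_len, k1=1.5, b=0.75):
    # Term frequencies for the document, computed once.
    tf_counts = Counter(doc_terms)
    # The length-normalization part of the denominator is the same for every term.
    norm = k1 * (1 - b + b * len(doc_terms) / avg_doc_len)

    score = 0.0
    for term in query_terms:
        tf = tf_counts.get(term, 0)
        # Skip terms absent from the document or missing from idf.
        if tf == 0 or term not in idf:
            continue
        score += idf[term] * (tf * (k1 + 1)) / (tf + norm)
    return score


# Illustrative data (made up for this example).
doc = ["the", "cat", "sat", "on", "the", "mat"]
idf = {"cat": 1.2, "the": 0.1}
print(round(bm25_score(["cat", "the", "dog"], doc, idf, avg_doc_len=6.0), 4))  # 1.3429
```

Here the document length equals avg_doc_len, so a term with tf = 1 contributes exactly its idf ("cat" adds 1.2), "the" with tf = 2 adds 0.1 * 5 / 3.5, and "dog" is skipped because it is in neither the document nor idf.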

Constraints: Use only Python builtins; the standard-library modules math and collections are also allowed.

Math

$$
\mathrm{BM25}(q, d) = \sum_{t \in q} \mathrm{idf}(t) \cdot \frac{f(t, d)\,(k_1 + 1)}{f(t, d) + k_1 \left(1 - b + b\,\frac{|d|}{\mathrm{avgdl}}\right)}
$$
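To get a feel for the formula, it can help to evaluate the fraction on its own, without the idf weight. The sketch below uses a hypothetical helper, tf_factor, which is not part of the problem; the document lengths are made-up values chosen for illustration.

```python
def tf_factor(tf, doc_len, avg_doc_len, k1=1.5, b=0.75):
    """Per-term BM25 factor without the idf weight (hypothetical helper)."""
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return tf * (k1 + 1) / (tf + k1 * length_norm)


# Saturation: in an average-length document each extra occurrence adds less,
# and the factor stays below k1 + 1 = 2.5.
for tf in (1, 2, 4, 8):
    print(tf, round(tf_factor(tf, doc_len=100, avg_doc_len=100), 4))
# 1 1.0
# 2 1.4286
# 4 1.8182
# 8 2.1053

# Length normalization: the same tf counts for less in a longer document.
print(round(tf_factor(1, doc_len=200, avg_doc_len=100), 4))  # 0.6897
```

Setting b = 0 removes the dependence on document length entirely, while k1 controls how quickly the factor saturates as tf grows.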

NumPy

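import numpy as np

def bm25_score(query_terms: list, doc_terms: list, idf: dict, avg_doc_len: float, k1: float = 1.5, b: float = 0.75) -> float:
    pass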
