TorchedUp
LearnBetaProblemsSystem DesignSoonPremium

168. Token-level F1

Easy

Compute the token-overlap F1 between a predicted answer and a reference answer. This is the standard metric for span-based question answering, as used in SQuAD.

Signature: def token_f1(prediction: list, reference: list) -> float

  • Use Counter(prediction) & Counter(reference) (multiset intersection) and sum its values to count common tokens with multiplicity.
  • precision = common / len(prediction)
  • recall = common / len(reference)
  • F1 = 2 * p * r / (p + r)
  • Return 0.0 if there are no common tokens (or either input is empty).
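The steps above can be sketched as follows. This is a minimal illustration rather than the site's reference solution. It assumes both inputs are already tokenized lists of hashable tokens and applies no SQuAD-style normalization (lowercasing, punctuation or article stripping).

```python
from collections import Counter


def token_f1(prediction: list, reference: list) -> float:
    # Either input empty: precision or recall is undefined, so return 0.0.
    if not prediction or not reference:
        return 0.0

    # Multiset intersection keeps the minimum count of each shared token.
    overlap = Counter(prediction) & Counter(reference)
    common = sum(overlap.values())

    # No shared tokens: avoid dividing by p + r == 0.
    if common == 0:
        return 0.0

    precision = common / len(prediction)
    recall = common / len(reference)
    return 2 * precision * recall / (precision + recall)
```

Because the intersection respects multiplicity, a token repeated in the prediction only counts as often as it appears in the reference. For example, prediction ["a", "a", "b"] against reference ["a", "c"] has one common token, not two.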

Math

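$$F_1 = \frac{2 \cdot p \cdot r}{p + r}, \qquad p = \frac{|\text{pred} \cap \text{ref}|}{|\text{pred}|}, \qquad r = \frac{|\text{pred} \cap \text{ref}|}{|\text{ref}|}$$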
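As a worked example of the formula, take a hypothetical prediction ["the", "cat", "sat"] and reference ["the", "cat"] (these tokens are illustrative and do not come from the problem's test cases). Two tokens are shared, so:

```latex
\begin{align*}
|\text{pred} \cap \text{ref}| &= 2, \qquad |\text{pred}| = 3, \qquad |\text{ref}| = 2 \\
p &= \frac{2}{3}, \qquad r = \frac{2}{2} = 1 \\
F_1 &= \frac{2 \cdot \frac{2}{3} \cdot 1}{\frac{2}{3} + 1}
     = \frac{4/3}{5/3}
     = \frac{4}{5} = 0.8
\end{align*}
```

The extra predicted token lowers precision but not recall, which pulls F1 below 1.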


NumPy

import numpy as np

def token_f1(prediction: list, reference: list) -> float:
    pass
