TorchedUp

164. Chunking with Overlap

Easy

Split a list of tokens into fixed-size chunks with overlapping windows — a standard preprocessing step for RAG ingestion.

Signature: def chunk_text(tokens: list, chunk_size: int, overlap: int) -> list

  • Each chunk has at most chunk_size tokens.
  • Consecutive chunks share overlap tokens (so the step between chunks is chunk_size - overlap).
  • The final chunk may be shorter if the tokens don't divide evenly.
  • Return a list of token-list chunks.
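The rules above can be sketched in a few lines of plain Python. This is one possible reading, not the site's reference solution: it assumes the constraint overlap < chunk_size already holds, and it assumes every start index below len(tokens) produces a chunk, so the last chunk can be short and can fall entirely inside the previous one.

```python
def chunk_text(tokens: list, chunk_size: int, overlap: int) -> list:
    """Split tokens into windows of at most chunk_size tokens.

    Assumes 0 <= overlap < chunk_size (the stated constraint).
    """
    step = chunk_size - overlap  # distance between consecutive chunk starts
    # Assumption: one chunk per start index 0, step, 2*step, ... below len(tokens).
    # Slicing past the end of a list is safe in Python and just returns fewer items.
    return [tokens[start:start + chunk_size] for start in range(0, len(tokens), step)]


example = chunk_text(list(range(7)), chunk_size=3, overlap=1)
print(example)  # [[0, 1, 2], [2, 3, 4], [4, 5, 6], [6]]
```

A variant that stops as soon as a chunk reaches the end of the input would drop the trailing [6] here. The problem statement does not say which behavior the hidden tests expect.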

Constraint: overlap < chunk_size.
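The constraint matters because the step is chunk_size - overlap: if overlap equals chunk_size the step is zero, and if it is larger the step is negative. A small guard makes that explicit. The helper name check_chunk_params is hypothetical, and the extra checks (positive chunk_size, non-negative overlap) are assumptions that go beyond the stated constraint.

```python
def check_chunk_params(chunk_size: int, overlap: int) -> int:
    """Return the step between chunk starts, or raise if the parameters are unusable."""
    if chunk_size <= 0:
        # Assumption: a chunk must hold at least one token.
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        # Stated constraint (overlap < chunk_size) plus an assumed lower bound of 0.
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    return chunk_size - overlap


# Without a guard the failure modes differ: a zero step makes range() raise
# ValueError, while a negative step silently yields no start indices at all.
print(list(range(0, 10, -1)))  # []
```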

Math

$$\text{step} = \text{chunk\_size} - \text{overlap}, \qquad \text{chunk}_i = \text{tokens}[\, i \cdot \text{step} : i \cdot \text{step} + \text{chunk\_size} \,]$$
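To make the formula concrete, the sketch below lists the slice bounds for each chunk index i. The helper chunk_bounds is hypothetical, and it assumes i runs over every index whose start i * step lies below the token count, which gives ceil(n_tokens / step) chunks. The end bound is clamped to n_tokens only for readability, since Python slicing already tolerates an end past the list.

```python
import math


def chunk_bounds(n_tokens: int, chunk_size: int, overlap: int) -> list:
    """Return (start, end) slice bounds for each chunk, assuming overlap < chunk_size."""
    step = chunk_size - overlap
    # Assumption: one chunk per start index below n_tokens.
    n_chunks = math.ceil(n_tokens / step)
    return [
        (i * step, min(i * step + chunk_size, n_tokens))
        for i in range(n_chunks)
    ]


print(chunk_bounds(10, 4, 2))
# [(0, 4), (2, 6), (4, 8), (6, 10), (8, 10)]
```

With 10 tokens, chunk_size 4 and overlap 2, the step is 2 and the final window (8, 10) holds only two tokens, which is the shorter last chunk the problem statement allows.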

NumPy

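import numpy as np

def chunk_text(tokens: list, chunk_size: int, overlap: int) -> list:
    # Starter stub: the signature matches the problem statement; the body is left to fill in.
    pass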
