TorchedUp
ProblemsPremium
TorchedUp
Chunking with OverlapEasy
ProblemsPremium

Chunking with Overlap

Split a list of tokens into fixed-size chunks with overlapping windows — a standard preprocessing step for RAG ingestion.

Signature: def chunk_text(tokens: list, chunk_size: int, overlap: int) -> list

  • Each chunk has at most chunk_size tokens.
  • Consecutive chunks share overlap tokens (so the step between chunks is chunk_size - overlap).
  • The final chunk may be shorter if the tokens don't divide evenly.
  • Return a list of token-list chunks.

Constraint: overlap < chunk_size.

Math

Asked at

Python (numpy)0/3 runs today

Test Results

○overlap of 1
○no overlap
○larger overlap🔒 Premium
Advertisement