Split a list of tokens into fixed-size chunks with overlapping windows — a standard preprocessing step for RAG ingestion.
Signature: def chunk_text(tokens: list, chunk_size: int, overlap: int) -> list
Each chunk contains chunk_size tokens. Consecutive chunks share overlap tokens (so the step between chunks is chunk_size - overlap). Constraint: overlap < chunk_size.
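One possible way to satisfy the signature is sketched below. The problem statement does not say how to treat the tail of the list, so this sketch makes two assumptions: the final chunk may be shorter than chunk_size, and chunking stops as soon as a chunk reaches the end of the token list. The input validation (raising ValueError) is likewise an assumption, not part of the stated requirements.

```python
def chunk_text(tokens: list, chunk_size: int, overlap: int) -> list:
    # Assumed validation: the statement only requires overlap < chunk_size.
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # distance between consecutive chunk starts
    chunks = []
    for start in range(0, len(tokens), step):
        # Slicing past the end is safe; the last chunk may be shorter (assumption).
        chunks.append(tokens[start:start + chunk_size])
        # Stop once this chunk already covers the end of the list, so no
        # trailing chunk consisting only of overlap tokens is emitted.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    print(chunk_text(list(range(10)), chunk_size=4, overlap=1))
    # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

With chunk_size=4 and overlap=1 the step is 3, so each chunk begins with the last token of the previous one. If the intended behavior is to drop or pad a short final chunk, only the slicing and stop condition would need to change.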