In continuous (in-flight) batching, every decode step we pick the top-priority active sequences to run. Implement the scheduler.
Signature: def schedule_step(active_seqs: list, max_batch: int) -> list
active_seqs is a list of dicts {'id': int, 'remaining': int, 'priority': float}. Return the list of sequence IDs to run this step — top min(max_batch, len(active_seqs)) by descending priority.
Tie-breaking: stable (preserve input order on equal priority).
Example:
[{'id': 1, 'remaining': 5, 'priority': 0.5}, {'id': 2, 'remaining': 3, 'priority': 0.9}, {'id': 3, 'remaining': 10, 'priority': 0.1}], max_batch=2 → [2, 1]Math
Asked at
Test Results