Given the wall-clock time the GPU spends per step and the time it takes one dataloader worker to produce one batch, compute the milliseconds the GPU is idle per step waiting on data.
Signature: def compute_throughput_gap(gpu_step_ms: float, dataloader_batch_ms: float, num_workers: int) -> float
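With num_workers parallel workers, the effective per-batch dataloader time is effective = dataloader_batch_ms / num_workers. The GPU idle time per step is max(0, effective - gpu_step_ms): it is zero whenever the workers deliver a batch at least as fast as the GPU consumes one.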
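The formula above can be sketched directly in Python. This is a minimal illustration under stated assumptions, not the reference solution: the check that num_workers is at least 1 is an assumption added here to avoid dividing by zero, since the problem statement does not say how zero workers should be handled.

```python
def compute_throughput_gap(gpu_step_ms: float, dataloader_batch_ms: float, num_workers: int) -> float:
    # Assumption (not in the problem statement): reject fewer than one worker
    # so the division below is always defined.
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Workers run in parallel, so a new batch is ready every
    # dataloader_batch_ms / num_workers milliseconds on average.
    effective_ms = dataloader_batch_ms / num_workers
    # The GPU waits only when data arrives more slowly than it is consumed.
    return max(0.0, effective_ms - gpu_step_ms)
```

For example, with gpu_step_ms=100, dataloader_batch_ms=400 and num_workers=2, the effective batch time is 200 ms, so the GPU idles for 100 ms per step. Raising num_workers to 4 brings the effective time down to 100 ms and the gap to 0.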
Math
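import numpy as np

def compute_throughput_gap(gpu_step_ms: float, dataloader_batch_ms: float, num_workers: int) -> float:
    # Your implementation goes here.
    pass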