TorchedUp
Dataloader Bottleneck Gap (Medium)

Dataloader Bottleneck

Given the wall-clock time the GPU spends per step and the time it takes one dataloader worker to produce one batch, compute the milliseconds the GPU is idle per step waiting on data.

Signature: def compute_throughput_gap(gpu_step_ms: float, dataloader_batch_ms: float, num_workers: int) -> float

With num_workers parallel workers the effective per-batch dataloader time is dataloader_batch_ms / num_workers. The wasted time per step is max(0, effective - gpu_step_ms).
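The formula above can be sketched directly in Python. This is a minimal sketch, not a reference solution: the check that num_workers is at least 1 is an assumption added here (the problem statement does not say how to treat zero or negative worker counts), and the sample inputs are illustrative values chosen to hit the three named cases, not the site's actual tests.

```python
def compute_throughput_gap(gpu_step_ms: float,
                           dataloader_batch_ms: float,
                           num_workers: int) -> float:
    """Return the milliseconds per step the GPU sits idle waiting on data."""
    # Assumption: at least one worker is required. The problem statement
    # does not define behavior for num_workers < 1.
    if num_workers < 1:
        raise ValueError("num_workers must be >= 1")

    # Workers run in parallel, so a batch becomes available every
    # dataloader_batch_ms / num_workers milliseconds on average.
    effective_ms = dataloader_batch_ms / num_workers

    # The GPU waits only when data arrives more slowly than it is consumed.
    return max(0.0, effective_ms - gpu_step_ms)


# Illustrative inputs (assumed values, one per case):
gpu_bound = compute_throughput_gap(100.0, 200.0, 4)   # effective 50 ms < 100 ms
data_bound = compute_throughput_gap(50.0, 400.0, 4)   # effective 100 ms > 50 ms
balanced = compute_throughput_gap(100.0, 400.0, 4)    # effective 100 ms == 100 ms
print(gpu_bound, data_bound, balanced)  # 0.0 50.0 0.0
```

In the data-bound case the gap of 50 ms shows that each step spends as long waiting as computing; doubling num_workers to 8 would bring the effective batch time down to 50 ms and close the gap.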


Test cases

gpu bound
data bound
just balanced