Compute the per-worker communication volume for two classic AllReduce algorithms.
Signature: def compare_allreduce(n_bytes: int, n_workers: int) -> dict
Return {'ring_volume': int, 'tree_volume': int} — the bytes each worker sends/receives.
Formulas:
2 * n_bytes * (n_workers - 1) // n_workers
Example:
n_bytes=1_000_000, n_workers=4 → {'ring_volume': 1500000, 'tree_volume': 1500000}
Note: Both have the same bandwidth cost asymptotically; tree wins on latency (O(log n) vs O(n) steps).
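A minimal sketch of one possible solution, not a reference implementation. It assumes the single expression listed under Formulas applies to both ring_volume and tree_volume, which is what the example output suggests. The n_workers check is an added assumption; the statement does not say how invalid input should be handled.

```python
def compare_allreduce(n_bytes: int, n_workers: int) -> dict:
    """Per-worker communication volume for ring and tree AllReduce."""
    # Assumption: reject non-positive worker counts to avoid dividing by zero.
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")

    # Expression from the problem statement, evaluated with integer
    # (floor) division so the result stays an int.
    volume = 2 * n_bytes * (n_workers - 1) // n_workers

    # Assumption: the same expression is used for both algorithms,
    # matching the worked example in the statement.
    return {"ring_volume": volume, "tree_volume": volume}


if __name__ == "__main__":
    # Reproduces the example from the statement.
    print(compare_allreduce(1_000_000, 4))
    # {'ring_volume': 1500000, 'tree_volume': 1500000}
```

With a single worker the expression evaluates to 0, since there is nothing to exchange.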