TorchedUp
ProblemsPremium
TorchedUp
Ring vs Tree AllReduceMedium
ProblemsPremium

Ring vs Tree AllReduce

Compute the per-worker communication volume for two classic AllReduce algorithms.

Signature: def compare_allreduce(n_bytes: int, n_workers: int) -> dict

Return {'ring_volume': int, 'tree_volume': int} — the bytes each worker sends/receives.

Formulas:

  • Ring AllReduce: 2 * n_bytes * (n_workers - 1) // n_workers
  • Tree (recursive halving + doubling): same volume, different latency profile

Example:

  • n_bytes=1_000_000, n_workers=4 → {'ring_volume': 1500000, 'tree_volume': 1500000}

Note: Both have the same bandwidth cost asymptotically; tree wins on latency (O(log n) vs O(n) steps).

Math

Asked at

Python (numpy)0/3 runs today

Test Results

○4 workers, 1MB
○8 workers, 1MB
○2 workers🔒 Premium
Advertisement