Classify a kernel as compute-bound or memory-bound using the roofline model.
Signature: def roofline_classify(flops_per_byte: float, machine_balance: float) -> str
Where machine_balance = peak_flops / peak_bandwidth is the FLOPs-per-byte the hardware can sustain.
Return 'compute_bound' if flops_per_byte >= machine_balance, else 'memory_bound'.
Math
Asked at
import numpy as np
def roofline_classify(...):
pass
Premium problem
Free accounts include problems #1–20. Upgrade to unlock the editor, hidden test cases, and reference solutions for every problem.
Already premium?