Given a kernel's arithmetic intensity and a device's peak compute and memory bandwidth, classify the kernel as 'compute_bound' or 'memory_bound'.
Signature: def roofline_classify(intensity: float, peak_tflops: float, peak_bw_gbps: float) -> str
The ridge point of the roofline is peak_tflops * 1000 / peak_bw_gbps FLOPs/byte. Kernels above the ridge are compute-bound; below it, memory-bound.
Example: A100 has peak_tflops=312, peak_bw_gbps=1555 → ridge ≈ 200 FLOPs/byte. Intensity 300 is compute-bound; intensity 10 is memory-bound.
Math
Asked at
Test Results