Given a chain of ops and a boolean per pair indicating whether they can fuse, compute the DRAM bytes used unfused vs fused, and the savings.
Signature: def fusion_bytes_saved(op_chain: list, can_fuse: list, dtype_bytes: int) -> list
op_chain: list of dicts {'in_size': int, 'out_size': int} (element counts).
can_fuse[i]: True if op_chain[i] and op_chain[i+1] can fuse (length len(op_chain) - 1).
Per-op DRAM traffic when run unfused: (in_size + out_size) * dtype_bytes.
Return [unfused_bytes, fused_bytes, savings_bytes] (all ints).
Example: 3-op chain, all fusable, each in_size=100, out_size=100, fp32 (dtype_bytes=4) → unfused = 3*200*4 = 2400, fused = 200*4 = 800, savings = 2400 - 800 = 1600.
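One possible reading of the task, sketched in Python. The statement does not spell out the fused cost rule, so this sketch assumes that a maximal run of consecutive fusable ops reads only the first op's input from DRAM and writes only the last op's output, which matches the 200*4 = 800 figure in the example. The empty-chain result and the group_start variable are choices made for this sketch, not part of the original statement.

```python
def fusion_bytes_saved(op_chain: list, can_fuse: list, dtype_bytes: int) -> list:
    # Assumption: an empty chain moves no data at all.
    if not op_chain:
        return [0, 0, 0]

    # Unfused: every op reads its input and writes its output to DRAM.
    unfused = sum(
        (op['in_size'] + op['out_size']) * dtype_bytes for op in op_chain
    )

    # Fused (assumed rule): each maximal run of fusable ops touches DRAM
    # only for the first op's input and the last op's output.
    fused = 0
    group_start = 0  # index of the first op in the current fused group
    last = len(op_chain) - 1
    for i, op in enumerate(op_chain):
        # A group ends at the last op, or where the next pair cannot fuse.
        if i == last or not can_fuse[i]:
            first_in = op_chain[group_start]['in_size']
            fused += (first_in + op['out_size']) * dtype_bytes
            group_start = i + 1

    return [unfused, fused, unfused - fused]
```

With the example above, fusion_bytes_saved([{'in_size': 100, 'out_size': 100}] * 3, [True, True], 4) gives [2400, 800, 1600]. When no pair can fuse, every op forms its own group and the savings are 0.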