Weight Memory Sweep

You have an array of model sizes and an array of dtype byte-widths (one per model). Return the per-config weight memory in bytes as an int64 array.

Implement: def weight_memory_sweep(n_params, dtype_bytes) where both are 1-D arrays of shape (N,). Return shape (N,) of int64 byte counts.

Formula (per config): bytes[i] = n_params[i] * dtype_bytes[i].

The trap: the obvious-but-slow implementation is a Python for loop:

out = []
for n, b in zip(n_params, dtype_bytes):
    out.append(n * b)
return np.array(out)

That works, but it runs N iterations in the Python interpreter and pays interpreter overhead on every element. The NumPy way is one expression:

return np.asarray(n_params, dtype=np.int64) * np.asarray(dtype_bytes, dtype=np.int64)

This delegates the loop to compiled C, runs in microseconds even for thousands of configs, and is the same shape of code you'll write for every sweep in this track.
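A minimal sketch of the complete function follows, assuming integer byte widths (a fractional width such as 0.5 would be truncated by the int64 cast). The shape check and the example sweep values are illustrative additions, not part of the problem statement.

```python
import numpy as np

def weight_memory_sweep(n_params, dtype_bytes):
    """Per-config weight memory in bytes, as an int64 array of shape (N,)."""
    # Cast to int64 before multiplying so the product is computed in 64 bits.
    n = np.asarray(n_params, dtype=np.int64)
    b = np.asarray(dtype_bytes, dtype=np.int64)
    # Illustrative guard (an assumption, not required by the exercise).
    if n.ndim != 1 or n.shape != b.shape:
        raise ValueError("n_params and dtype_bytes must be 1-D arrays of the same shape")
    return n * b

# Hypothetical sweep: 7B params at 2 bytes, 13B at 4 bytes, 70B at 1 byte.
sizes = [7_000_000_000, 13_000_000_000, 70_000_000_000]
widths = [2, 4, 1]
result = weight_memory_sweep(sizes, widths)
print(result)  # [14000000000 52000000000 70000000000]
```

The multiplication is elementwise, so config i only ever pairs with byte width i, and the output keeps the input shape.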

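Why int64? Parameter counts run into the billions, so products can exceed 2³¹ − 1, the int32 maximum. Cast explicitly, and do it before multiplying: on platforms where NumPy's default integer is 32-bit, a product computed in 32 bits has already wrapped silently by the time it is cast.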
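The overflow can be reproduced on any platform by forcing int32 inputs. This is a small sketch with made-up values (1.5B parameters at 4 bytes each); the variable names are illustrative only.

```python
import numpy as np

# Both inputs fit in int32, but their product (6e9) does not.
n_params = np.array([1_500_000_000], dtype=np.int32)
dtype_bytes = np.array([4], dtype=np.int32)

# Casting after the multiply: the product has already wrapped in 32-bit arithmetic.
late_cast = (n_params * dtype_bytes).astype(np.int64)

# Casting before the multiply: the product is computed in 64 bits.
early_cast = n_params.astype(np.int64) * dtype_bytes

print(late_cast[0] == 6_000_000_000)   # False
print(early_cast[0] == 6_000_000_000)  # True
```

NumPy raises no error for the wrapped array product, which is why the cast has to come first.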

Math

\mathrm{bytes}_{i} = n_{i} \cdot b_{i}, \quad i = 0, \dots, N - 1
