Given an array of model byte-sizes and an array of GPU VRAM byte-sizes, return a 2-D grid where entry [m, n] is the minimum number of GPUs needed to hold model m on GPUs of size n.
Implement: def min_gpus_grid(model_bytes, gpu_bytes) where:
model_bytes is shape (M,): total bytes per model.
gpu_bytes is shape (N,): VRAM per GPU SKU.
Return shape (M, N) of int64. Entry [m, n] = ceil(model_bytes[m] / gpu_bytes[n]).
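To make the specification concrete, here is a minimal loop-based sketch of the same definition. It is only a reference for checking results, not the intended solution. The name min_gpus_grid_loops and the sample sizes are hypothetical, and the sketch assumes every GPU size is positive.

```python
import numpy as np

def min_gpus_grid_loops(model_bytes, gpu_bytes):
    """Loop-based reference: out[m, n] = ceil(model_bytes[m] / gpu_bytes[n])."""
    out = np.empty((len(model_bytes), len(gpu_bytes)), dtype=np.int64)
    for m, model in enumerate(model_bytes):
        for n, gpu in enumerate(gpu_bytes):
            # Integer ceiling division; assumes gpu > 0 and model >= 0.
            out[m, n] = -(-int(model) // int(gpu))
    return out

GiB = 1024**3
models = np.array([14 * GiB, 140 * GiB])  # hypothetical model sizes
gpus = np.array([24 * GiB, 80 * GiB])     # hypothetical VRAM sizes
grid = min_gpus_grid_loops(models, gpus)
print(grid)
# [[1 1]
#  [6 2]]
```

A 140 GiB model needs 6 GPUs of 24 GiB (140 / 24 is about 5.83, rounded up) but only 2 GPUs of 80 GiB.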
The recipe (the all-pairs pattern from problem #222):
return np.ceil(model_bytes[:, None] / gpu_bytes[None, :]).astype(np.int64)
model_bytes[:, None] is shape (M, 1).
gpu_bytes[None, :] is shape (1, N).
(M, 1) / (1, N) broadcasts to (M, N).
np.ceil rounds up but still returns floats; .astype(np.int64) converts the result to integers.
This is the kind of vectorization an SRE might write when sizing a heterogeneous GPU fleet across a model catalog. One expression, one heatmap.
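The shape walkthrough above can be sketched step by step as follows. The intermediate names col, row, and ratio and the small sample values are illustrative only, and the np.asarray calls are an added assumption so that plain lists are also accepted.

```python
import numpy as np

def min_gpus_grid(model_bytes, gpu_bytes):
    model_bytes = np.asarray(model_bytes)  # shape (M,)
    gpu_bytes = np.asarray(gpu_bytes)      # shape (N,)
    col = model_bytes[:, None]             # shape (M, 1)
    row = gpu_bytes[None, :]               # shape (1, N)
    ratio = col / row                      # broadcasts to (M, N), float64
    return np.ceil(ratio).astype(np.int64)

models = np.array([10, 25, 40])  # illustrative sizes, not real byte counts
gpus = np.array([8, 16])
grid = min_gpus_grid(models, gpus)
print(grid.shape)  # (3, 2)
print(grid)
# [[2 1]
#  [4 2]
#  [5 3]]
```

Because the division runs in float64, this sketch is exact only while the byte counts stay well below 2**53, which covers realistic model and VRAM sizes.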
import numpy as np

def min_gpus_grid(model_bytes, gpu_bytes):
    # Starter stub: return an (M, N) int64 grid as described above.
    pass