181 hands-on problems for the math and systems behind modern AI. Python in your browser. No setup.
181 coding problems
42 hard-difficulty
14 topic areas
Implement a softmax that handles large input values without overflow. Hint: subtract max(x) from every element before calling exp.
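One way the hint could play out, as a minimal sketch rather than the site's reference solution. It assumes the input is a flat, non-empty list of floats and uses only the standard library; the function name `softmax` and the sample inputs are illustrative choices. Subtracting the maximum leaves the result mathematically unchanged, because the common factor exp(-max) cancels between numerator and denominator, but it caps the largest exponent at exp(0) = 1.

```python
import math


def softmax(x):
    """Numerically stable softmax over a flat, non-empty list of floats (sketch)."""
    m = max(x)                           # shift so the largest entry becomes 0
    exps = [math.exp(v - m) for v in x]  # every exponent is <= 0, so exp() cannot overflow
    total = sum(exps)                    # at least 1.0, because exp(0) is one of the terms
    return [e / total for e in exps]


# Inputs this large make a naive math.exp(v) raise OverflowError.
probs = softmax([1000.0, 1001.0, 1002.0])

# Shifting every input by the same constant does not change the output.
baseline = softmax([0.0, 1.0, 2.0])
```

Both calls return roughly [0.090, 0.245, 0.665]. A NumPy version would follow the same pattern with `x - x.max()` and `np.exp`.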
Solved this? 169 more problems waiting →
Softmax, attention, backprop, Adam — the building blocks every MLE must know.
RoPE, Flash Attention, GQA, MoE — implement the architectures powering GPT-4, LLaMA, and Gemini.
KV cache, PagedAttention, speculative decoding, continuous batching — vLLM internals as coding problems.
ZeRO, FSDP, Megatron parallelism, pipeline parallelism — what runs at the largest scale.
Walk through the full design of recsys, ranking, search, fraud, and ads pipelines: feature stores, training infra, deployment, and monitoring. Lessons plus interactive case studies.
How to design and scale LLM serving: vLLM internals, paged attention, continuous batching, prefix caching, multi-LoRA, tensor-parallel deployment, and capacity planning.
Runs in your browser
Python 3.12 via Pyodide — no installs, no waiting for a sandbox VM. Write code, click Run, see results in milliseconds.
Interview-accurate
Problems are drawn from real ML engineering interviews at top labs. Math formulations and reference solutions included.
Covers the full stack
From softmax to PagedAttention — numpy fundamentals through production-grade distributed training and serving systems.
Free to start. Premium unlocks solutions, hints, and unlimited runs.
Browse all 181 problems →