TorchedUp

220. NumPy Broadcasting: Bias Add

Easy

In every transformer block you write x + b where x has shape (B, T, D) and b has shape (D,). The fact that this just works is the entire point of NumPy broadcasting — and it is the foundation for every vectorized formula you write later in this track.

Implement: def bias_add(x, b) returning x + b.

Shapes:

  • x: (B, T, D) — batch of T-token sequences with D-dim features
  • b: (D,) — per-feature bias
  • output: (B, T, D)

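Why this works: NumPy aligns shapes from the right and compares them one dimension at a time; two dimensions are compatible when they are equal or when one of them is 1. (B, T, D) and (D,) align as (B, T, D) vs (_, _, D). Missing left dims are treated as size 1 and stretched to B and T. The trailing dim D matches in both, so the operation is legal.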
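The alignment rule can be written out by hand as a rough sketch. The helper name aligned_shape and the example sizes below are hypothetical and chosen only for illustration; NumPy applies its own internal version of this rule.

```python
import numpy as np

def aligned_shape(shape_a, shape_b):
    """Hypothetical helper: apply the right-alignment rule by hand."""
    ndim = max(len(shape_a), len(shape_b))
    # Pad the shorter shape on the left with 1s so both have the same length.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    out = []
    for da, db in zip(a, b):
        # Dimensions are compatible when equal or when one of them is 1.
        if da != db and da != 1 and db != 1:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        out.append(db if da == 1 else da)
    return tuple(out)

B, T, D = 2, 3, 4  # example sizes, chosen arbitrarily
result = aligned_shape((B, T, D), (D,))

# Compare against the shape NumPy itself produces for the same operands.
numpy_shape = (np.zeros((B, T, D)) + np.zeros(D)).shape
```

Both result and numpy_shape come out as (B, T, D), which is the shape the problem asks bias_add to return.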

You can also write x + b[None, None, :] to make the broadcast explicit — b[None, None, :] has shape (1, 1, D), which broadcasts identically.
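A quick check that the implicit and explicit forms agree, using arbitrary example sizes and random data (the variable names are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
B, T, D = 2, 3, 4                      # example sizes, chosen arbitrarily
x = rng.standard_normal((B, T, D))
b = rng.standard_normal(D)

b_explicit = b[None, None, :]          # shape (1, 1, D)
implicit = x + b                       # NumPy pads b's shape on the left
explicit = x + b_explicit              # the same padding, written out by hand
same = np.array_equal(implicit, explicit)
```

Here same is True: the explicit form changes nothing numerically, it only documents which axis the bias belongs to.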

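The trap: x + b where b has shape (B,) does not add a per-batch bias. (B, T, D) aligned with (_, _, B) compares B against the trailing dim D, so NumPy raises a ValueError unless D == B (or B == 1). When D == B the addition runs silently and applies the bias along the feature axis instead of the batch axis. Bias on the wrong axis is one of the most common shape bugs in ML code; getting comfortable with broadcasting alignment is how you avoid it.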
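The sketch below illustrates the trap with arbitrary example sizes. The names b_batch, x_sq, and b_sq are hypothetical, and b_batch[:, None, None] is one way to place a (B,) array on the batch axis if a per-batch bias is what you actually intend.

```python
import numpy as np

B, T, D = 2, 3, 4                        # example sizes with B != D
x = np.zeros((B, T, D))
b_batch = np.arange(B, dtype=float)      # shape (B,), intended as a per-batch bias

# (B, T, D) vs (_, _, B): the trailing dims 4 and 2 are incompatible.
try:
    x + b_batch
    raised = False
except ValueError:
    raised = True

# Placing the bias on the batch axis explicitly: shape (B, 1, 1).
y = x + b_batch[:, None, None]

# The silent case: when B == D, the shapes are compatible and the bias
# lands on the last axis, not the batch axis.
x_sq = np.zeros((3, 5, 3))
b_sq = np.array([10.0, 20.0, 30.0])
y_sq = x_sq + b_sq
```

In the last case y_sq[0, 0, :] is [10, 20, 30] for every batch element, so the values vary along the feature axis even though b_sq had the batch length.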

Math

$y_{b,t,d} = x_{b,t,d} + b_d$
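The elementwise formula can be checked against the broadcast expression with an explicit triple loop. This is a reference sketch for comparison only, not the intended solution; the name bias_add_loops and the test data are assumptions made for the example.

```python
import numpy as np

def bias_add_loops(x, b):
    """Hypothetical reference: add b[d] to every x[i, t, d], one element at a time."""
    B, T, D = x.shape
    y = np.empty(x.shape, dtype=np.result_type(x, b))
    for i in range(B):
        for t in range(T):
            for d in range(D):
                y[i, t, d] = x[i, t, d] + b[d]
    return y

x = np.arange(24, dtype=float).reshape(2, 3, 4)   # example input, shape (2, 3, 4)
b = np.array([0.5, 1.5, 2.5, 3.5])                # example bias, shape (4,)

looped = bias_add_loops(x, b)
vectorized = x + b
match = np.allclose(looped, vectorized)
```

The loop indexes b only by d, which is exactly what aligning (D,) with the last axis of (B, T, D) does in the vectorized form.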

NumPy

import numpy as np

def bias_add(x, b):
    # x: (B, T, D), b: (D,); should return x + b with shape (B, T, D)
    pass
