Given the per-token hidden states from a transformer encoder, produce a single vector per sequence using one of three pooling strategies.
Signature: def pool(hidden: np.ndarray, mode: str) -> np.ndarray
hidden: shape (batch, seq, d)mode is one of:
'cls' → take hidden[:, 0, :] (the leading [CLS] token)'mean' → average across the sequence axis'max' → element-wise max across the sequence axisReturn shape: (batch, d).
Math
Asked at
import numpy as np
def pool(...):
pass
Premium problem
Free accounts include problems #1–20. Upgrade to unlock the editor, hidden test cases, and reference solutions for every problem.
Already premium?