TorchedUp
LearnBetaProblemsSystem DesignSoonPremium
TorchedUp
LearnBetaProblemsSystem DesignSoonPremium
←

93. MoE with Capacity Factor

Hard

Implement MoE routing with capacity factor: each expert can process at most capacity tokens, and overflow tokens are dropped (zero output).

Signature: def moe_capacity_routing(gate_probs, top_k, capacity_factor)

  • gate_probs: (N, n_experts) — softmax gating probabilities
  • top_k: number of experts per token
  • capacity_factor: float, multiplier for ideal load
  • Returns: (N, n_experts) binary assignment matrix (1 = token processed by expert)

Algorithm

Per-expert capacity is the ceiling of capacity_factor · N / n_experts — that's the maximum number of tokens any single expert is allowed to process.

Walk tokens in their input order. For each token, consider its top-k experts in order of descending gate probability (highest-probability expert first). For each candidate expert, if its current load is below capacity, accept the assignment (set the corresponding entry in the output matrix to 1) and increment that expert's load; otherwise skip this expert for this token. Tokens whose preferred experts are all full are silently dropped for those experts (output stays 0).

The processing order matters: the input-order traversal plus highest-prob-first per token is what reproducible test cases expect.

Capacity Factor Tradeoffs

  • capacity_factor = 1.0: each expert processes exactly N/E tokens on average. Tight — many tokens dropped under imbalance.
  • capacity_factor = 1.25: 25% slack. Common in Mixtral/Switch Transformer.
  • capacity_factor = 2.0: generous — rarely drops tokens but wastes compute.

Token dropping introduces bias during training (dropped tokens get zero gradient for that expert) but enables fixed-size batches needed for efficient GPU kernels.

Asked at

NumPy

import numpy as np

 

def moe_capacity_routing(...):

    pass

🔒

Premium problem

Free accounts include problems #1–20. Upgrade to unlock the editor, hidden test cases, and reference solutions for every problem.

Upgrade to PremiumBack to problems

Already premium?