TorchedUp

21. Rotary Position Embedding (RoPE)

Medium

Implement Rotary Position Embedding (RoPE), used in LLaMA, GPT-NeoX, and many other modern LLMs. RoPE encodes position by rotating query and key vectors through position-dependent angles. Unlike additive sinusoidal embeddings, which are added to the token embeddings, the rotation acts directly on the query-key dot product, so the attention score depends on the relative offset between the two positions.

Signature: def rope(x: np.ndarray, positions: np.ndarray) -> np.ndarray

  • x: shape (seq_len, d_model) — query or key vectors
  • positions: shape (seq_len,) — integer position indices
  • Returns: shape (seq_len, d_model) — rotated vectors

For a d-dimensional vector x at position m (d = d_model, which must be even), split the features into consecutive pairs (x_{2i}, x_{2i+1}) for i = 0, ..., d/2 - 1. Pair i has base frequency θ_i = 1 / 10000^(2i/d) and is rotated by the angle m * θ_i:

x_rot[2i]   = x[2i]   * cos(m * θ_i) - x[2i+1] * sin(m * θ_i)
x_rot[2i+1] = x[2i]   * sin(m * θ_i) + x[2i+1] * cos(m * θ_i)
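
The update rule above can be sketched in vectorized NumPy as follows. This is one possible solution under the stated signature, not the site's reference solution; the even-dimension check and the cast to float are assumptions added for safety.

```python
import numpy as np


def rope(x: np.ndarray, positions: np.ndarray) -> np.ndarray:
    """Rotate each (even, odd) feature pair of x by a position-dependent angle."""
    x = np.asarray(x, dtype=float)  # assumption: compute in float64
    seq_len, d = x.shape
    if d % 2 != 0:
        raise ValueError("d_model must be even so features can be paired")

    # theta_i = 1 / 10000^(2i/d) for i = 0 .. d/2 - 1
    i = np.arange(d // 2)
    theta = 1.0 / 10000.0 ** (2.0 * i / d)

    # angles[m, i] = positions[m] * theta_i, shape (seq_len, d/2)
    angles = np.asarray(positions)[:, None] * theta[None, :]
    cos, sin = np.cos(angles), np.sin(angles)

    # Even-indexed and odd-indexed features form the pairs (x[2i], x[2i+1]).
    x_even, x_odd = x[:, 0::2], x[:, 1::2]

    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```

Two quick sanity checks on such an implementation: position 0 should leave x unchanged, and every row should keep its Euclidean norm, since each pair only undergoes a rotation.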

Math

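$$
\theta_i = \frac{1}{10000^{2i/d}}, \quad i = 0, 1, \ldots, d/2 - 1
$$

$$
\begin{pmatrix} x'_{2i} \\ x'_{2i+1} \end{pmatrix}
=
\begin{pmatrix} \cos(m\theta_i) & -\sin(m\theta_i) \\ \sin(m\theta_i) & \cos(m\theta_i) \end{pmatrix}
\begin{pmatrix} x_{2i} \\ x_{2i+1} \end{pmatrix}
$$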
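
Because each pair is multiplied by a 2x2 rotation matrix, the dot product of a rotated query at position m and a rotated key at position n depends only on the offset n - m. The snippet below is a small numerical check of that property for a single pair. The helper name `rotation` and the example vectors are illustrative assumptions, not part of the problem statement.

```python
import numpy as np


def rotation(m: int, theta: float) -> np.ndarray:
    """2x2 rotation by the angle m * theta, matching the matrix form above."""
    a = m * theta
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])


theta = 1.0 / 10000 ** (2 * 1 / 8)  # theta_1 for d = 8 (illustrative choice)
q = np.array([0.3, -1.2])           # one (even, odd) pair of a query
k = np.array([0.7, 0.5])            # the matching pair of a key

# Same offset n - m = 2 at two different absolute positions.
score_a = (rotation(3, theta) @ q) @ (rotation(5, theta) @ k)
score_b = (rotation(10, theta) @ q) @ (rotation(12, theta) @ k)

print(np.isclose(score_a, score_b))
```

The check should succeed because rotation(m).T @ rotation(n) equals rotation(n - m), so the absolute positions cancel out of the score.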

Related problems

  • Rotary Position Embedding (PyTorch), Medium, PyTorch

NumPy

import numpy as np


def rope(x: np.ndarray, positions: np.ndarray) -> np.ndarray:
    # Starter stub: replace with your implementation.
    pass
