The mini-decoder forward pass below worked at one point, but a refactor broke it. The function now fails on the very first forward pass, either with an index error from the embedding layer or, further downstream, with a shape mismatch when the position vectors are added to the token vectors.
Find and fix the bug(s) so the function correctly maps token ids to logits.
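The two failure modes named above can be reproduced in isolation, which may help when reading the traceback. The following is a minimal NumPy sketch, not the broken function itself: the helper name lookup is hypothetical, plain array indexing stands in for an embedding layer, and the sizes are the ones stated in this problem (vocab_size = 8, max_seq_len = 4, d_model = 8).

```python
import numpy as np

vocab_size, max_seq_len, d_model = 8, 4, 8
tok_emb_w = np.zeros((vocab_size, d_model))
pos_emb_w = np.zeros((max_seq_len, d_model))


def lookup(table, indices):
    # Hypothetical helper: row lookup, the NumPy analogue of an embedding layer.
    return table[np.asarray(indices, dtype=int)]


# Failure mode 1: an index that is not a valid row of the table.
try:
    lookup(pos_emb_w, [0, 1, 2, 3, 4])  # 4 is out of range when max_seq_len = 4
    index_error_raised = False
except IndexError:
    index_error_raised = True

# Failure mode 2: token and position blocks of different lengths.
tok = lookup(tok_emb_w, [5, 1, 7])     # shape (3, 8)
pos = lookup(pos_emb_w, [0, 1, 2, 3])  # shape (4, 8)
try:
    tok + pos  # (3, 8) and (4, 8) cannot be broadcast together
    shape_error_raised = False
except ValueError:
    shape_error_raised = True
```

Both flags end up True here. In either case the usual cause is that the position indices do not run over exactly 0 to T - 1 for a sequence of length T.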
Signature: def decoder_forward(token_ids, tok_emb_w, pos_emb_w, W_out, b_out)
- token_ids: list of length T, integer token ids in [0, vocab_size)
- tok_emb_w: token-embedding weight, nested list of shape (vocab_size, d_model)
- pos_emb_w: positional-embedding weight, nested list of shape (max_seq_len, d_model)
- W_out: output projection, nested list of shape (d_model, vocab_size)
- b_out: output bias, list of length vocab_size

The model uses vocab_size = 8, max_seq_len = 4, d_model = 8. The pipeline is:
- nn.Embedding layers for tokens and positions, copy in the provided weights
- nn.LayerNorm across the model dimension
- Return the logits as a nested list of shape (T, vocab_size).
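For orientation, here is one way the data flow could look, written in plain NumPy rather than with nn.Embedding and nn.LayerNorm. It is a sketch under assumptions, not the reference solution: the names decoder_forward_sketch and layer_norm are hypothetical, the layer norm is assumed to use unit scale, zero shift, and eps = 1e-5, and the order (sum the embeddings, normalize, then project) is inferred from the description above rather than confirmed by it.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each row over the last (model) dimension.
    # Assumes scale = 1 and shift = 0, with the biased variance estimate.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def decoder_forward_sketch(token_ids, tok_emb_w, pos_emb_w, W_out, b_out):
    tok_emb_w = np.asarray(tok_emb_w, dtype=float)  # (vocab_size, d_model)
    pos_emb_w = np.asarray(pos_emb_w, dtype=float)  # (max_seq_len, d_model)
    W_out = np.asarray(W_out, dtype=float)          # (d_model, vocab_size)
    b_out = np.asarray(b_out, dtype=float)          # (vocab_size,)
    ids = np.asarray(token_ids, dtype=int)          # (T,)
    T = ids.shape[0]

    # Guard against the two failure modes up front.
    if T > pos_emb_w.shape[0]:
        raise ValueError("sequence is longer than max_seq_len")
    if T and (ids.min() < 0 or ids.max() >= tok_emb_w.shape[0]):
        raise ValueError("token id outside [0, vocab_size)")

    tok = tok_emb_w[ids]            # (T, d_model)
    pos = pos_emb_w[np.arange(T)]   # (T, d_model): positions 0 .. T-1
    hidden = layer_norm(tok + pos)  # (T, d_model)
    logits = hidden @ W_out + b_out # (T, vocab_size)
    return logits.tolist()
```

The position indices are built from T, the length of the input, not from max_seq_len or vocab_size. That keeps the token and position blocks the same shape, so the sum is well defined.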
import numpy as np

def decoder_forward(token_ids, tok_emb_w, pos_emb_w, W_out, b_out):
    # Starter stub: replace with the fixed forward pass.
    pass