
PyTorch: Dataset & DataLoader (Medium)

Implement a custom Dataset and use DataLoader to produce batches. Return the size of each batch to verify the batching behavior.

Signature: def create_batches(data, labels, batch_size, shuffle=False)

  • data: list of feature vectors (list of lists)
  • labels: list of ints
  • batch_size: int
  • shuffle: bool (use False for deterministic tests)
  • Returns: list of ints — the number of samples in each batch

Implement SimpleDataset(Dataset) with (a sketch follows the list):

  • __init__(self, data, labels): store as float32 / long tensors
  • __len__(self): return dataset size
  • __getitem__(self, idx): return (data[idx], labels[idx])
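
A minimal sketch of SimpleDataset under the spec above, assuming the standard torch and torch.utils.data imports:

    import torch
    from torch.utils.data import Dataset

    class SimpleDataset(Dataset):
        def __init__(self, data, labels):
            # Store features as float32 and labels as int64 (long), per the spec.
            self.data = torch.tensor(data, dtype=torch.float32)
            self.labels = torch.tensor(labels, dtype=torch.long)

        def __len__(self):
            # Dataset size = number of samples.
            return len(self.data)

        def __getitem__(self, idx):
            # One (features, label) pair.
            return self.data[idx], self.labels[idx]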

Then build DataLoader(dataset, batch_size=batch_size, shuffle=shuffle) and return [len(bx) for bx, _ in loader].
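
A sketch of create_batches built on that dataset. shuffle=False keeps the batch order deterministic, and DataLoader's default drop_last=False means a final partial batch is kept rather than dropped:

    from torch.utils.data import DataLoader

    def create_batches(data, labels, batch_size, shuffle=False):
        dataset = SimpleDataset(data, labels)
        loader = DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)
        # Each iteration yields (batch_features, batch_labels);
        # len(bx) is the number of samples in that batch.
        return [len(bx) for bx, _ in loader]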

Why? DataLoader handles batching, shuffling, and worker processes — core infrastructure for any training loop.

Test Results

  • 4 samples, batch_size=2 → 2 full batches
  • 5 samples, batch_size=3 → one full + one partial
  • batch_size larger than dataset (Premium)
  • 6 samples, batch_size=2 → 3 batches (Premium)
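
The two unlocked tests can be reproduced with toy inputs (the feature and label values below are hypothetical; only the sample counts matter):

    data = [[float(i), float(i)] for i in range(5)]
    labels = [0, 1, 0, 1, 0]

    print(create_batches(data[:4], labels[:4], batch_size=2))  # [2, 2]
    print(create_batches(data, labels, batch_size=3))          # [3, 2]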