MicroscaleLabs
Lab 05 · 90–120 min · GPU · Mac · Colab · CPU

The $1 Pretraining Run

Act IV · How They Learn
the aha moment

Train a 10M-parameter GPT-2 from scratch on TinyStories in ~20 minutes and watch the loss curve descend from random noise to coherent English. Then train a second copy on corrupted data and see the textbook hypothesis, that data quality drives what a model can learn, as a measured gap between the two loss curves. Compute cost: well under $1.
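The second run's corrupted dataset could be produced in many ways; here is a minimal sketch of one plausible scheme, word-level shuffling, with a hypothetical helper named `corrupt_story` (the lab's actual corruption method may differ):

```python
import random

def corrupt_story(text: str, frac: float = 0.3, seed: int = 0) -> str:
    """Corrupt one training example by shuffling a fraction of its words.

    Hypothetical corruption scheme for illustration only; the lab may
    instead use character noise, sentence reordering, etc.
    """
    rng = random.Random(seed)
    words = text.split()
    n = max(1, int(len(words) * frac))      # how many words to disturb
    idx = rng.sample(range(len(words)), n)  # positions to shuffle
    picked = [words[i] for i in idx]
    rng.shuffle(picked)
    for i, w in zip(idx, picked):           # write shuffled words back
        words[i] = w
    return " ".join(words)

clean = "Once upon a time there was a little dog who loved to run"
print(corrupt_story(clean))
```

Because the words are shuffled rather than replaced, the corrupted corpus keeps the same unigram statistics as the clean one, so any loss gap reflects lost structure, not lost vocabulary.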

Open in Colab · View on GitHub
the facts
Time
90–120 min
Hardware
GPU · Mac · Colab · CPU
Act
IV · How They Learn
Status
Live
Artifact
Two trained 10M-param models + loss curves + side-by-side generation samples.
run it locally

Clone the labs repo and run this lab as a script or open it as a notebook:

git clone https://github.com/iqbal-sk/Microscale-labs.git
cd Microscale-labs
just setup-auto      # auto-detects CPU / CUDA / Mac
just run 05
# or:  jupyter lab labs/05-dollar-pretraining/lab.py

Full install options (uv, pip, or the platform-specific CUDA paths) are in the labs README.
