MicroscaleLabs
Lab 09 · 90 min · CPU · Colab

Quantize It Yourself

Act VII · Packing for Travel
the aha moment

Take a single 3072×3072 weight tensor from Qwen3-0.6B and implement three quantisation schemes from scratch: naive 4-bit uniform, NF4 quantile-binned, and K-quant Q4_K_M with sub-block scales. Measure the L2 reconstruction error of each. Watch naive lose 3× to NF4, and NF4 lose 2× to Q4_K_M. The hierarchy of quantisation tricks is now a chart you built.

Open in Colab · View on GitHub
the facts
Time: 90 min
Hardware: CPU · Colab
Act: VII · Packing for Travel
Status: Live
Artifact: Three quantised tensor files + an error-comparison chart.
run it locally

Clone the labs repo and run this lab as a script or open it as a notebook:

git clone https://github.com/iqbal-sk/Microscale-labs.git
cd Microscale-labs
just setup-auto      # auto-detects CPU / CUDA / Mac
just run 09
# or:  jupyter lab labs/09-quantize-it-yourself/lab.py

Full install options (uv, pip, or the platform-specific CUDA paths) are in the labs README.
