Microscale
Act IX · Region 09

Ship It

Ollama, MLX, vLLM — from notebook to production

Practical deployment. Emulated terminals. Setup simulators. By the end you'll be able to answer 'which engine and what config?' for any scenario on the map.

  1. Ollama in 60 seconds
    Install Ollama, pull a model, build a Modelfile with system prompts and parameters — the fastest path from zero to local LLM inference
  2. MLX-LM on Mac
    Apple's MLX framework for local LLM inference on M-series chips — unified memory, Metal acceleration, and the mlx-lm CLI
  3. vLLM in production
    Tensor parallelism, continuous batching, and PagedAttention in one config — deploy a production LLM endpoint with vLLM