Overview
In this guide, we’ll use Ollama, an open-source tool that makes it easy to download and run AI models locally.

1. Install Ollama
The easiest way is to install the desktop app. You can also install it via Homebrew:
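A sketch of the Homebrew route, assuming the `ollama` formula (the desktop app is distributed separately as a cask):

```shell
# Install the Ollama command-line tool via Homebrew
brew install ollama
```

You can confirm the install afterwards with `ollama --version`.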
2. Start Ollama
Once installed, start the Ollama service.

3. Download Granite Models
Ollama supports a range of IBM Granite models. Larger models give better results but require more resources. To download Granite 4, pull it by name.

4. Run Granite
To start chatting with Granite, run the model you downloaded (e.g. granite4).
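Putting steps 2–4 together, a typical session might look like the following sketch (assuming a default install and `granite4` as the model name):

```shell
# Start the Ollama service in the background
# (the desktop app starts it automatically)
ollama serve &

# Download the Granite 4 model
ollama pull granite4

# Open an interactive chat session with the model
ollama run granite4
```

Inside the interactive session, type your prompt and press Enter; `/bye` exits the session.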
5. Notes on Context Length
By default, Ollama runs models with a short context length to save memory. For longer conversations, you can adjust it by raising the context length. The largest supported context for Granite 3.1 models is 131072 (128k).
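One way to raise the limit, as a sketch assuming the `num_ctx` parameter and the `granite4` model name (the derived model name `granite4-long` is hypothetical):

```shell
# Option 1: inside an interactive `ollama run granite4` session,
# raise the context window for the current session only:
#   /set parameter num_ctx 131072

# Option 2: bake the larger context into a derived model via a Modelfile
cat > Modelfile <<'EOF'
FROM granite4
PARAMETER num_ctx 131072
EOF
ollama create granite4-long -f Modelfile
```

Option 2 persists the setting, so every later `ollama run granite4-long` uses the larger context.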