Overview
In this guide, we’ll use Ollama, an open-source tool that makes it easy to download and run AI models locally.

1. Install Ollama
Install Ollama for Linux with the official install script. The installer configures a systemd service named ollama.service to run the server in the background.
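On Linux, the standard Ollama install command (from ollama.com) is:

```shell
# Download and run the official Ollama install script (requires curl).
curl -fsSL https://ollama.com/install.sh | sh
```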
The automatic installation script requires root access. For manual
installation (without root), see the manual installation
instructions.
2. Download Granite Models
Ollama supports a range of IBM Granite models. Larger models provide better results but require more resources. To download Granite 4, pull the granite4 model.
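The pull command for the granite4 tag looks like:

```shell
# Download the Granite 4 model weights from the Ollama library.
ollama pull granite4
```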
3. Run Granite
To start chatting with Granite, run the granite4 model.
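The run command opens an interactive chat session with the model:

```shell
# Start an interactive chat with Granite 4 (pulls the model first if needed).
ollama run granite4
```

Type /bye to exit the session.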
4. Notes on Context Length
By default, Ollama runs models with a short context length to save memory.
For longer conversations, you can increase the context length when starting the Ollama server.
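A minimal sketch, assuming a recent Ollama release that supports the OLLAMA_CONTEXT_LENGTH environment variable:

```shell
# Serve models with an 8K-token context window instead of the default.
OLLAMA_CONTEXT_LENGTH=8192 ollama serve
```

Within an interactive ollama run session, you can also set the context length for the current model with /set parameter num_ctx 8192.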