Overview

In this guide, we’ll use Ollama, an open-source tool that makes it easy to download and run AI models locally.

1. Download Ollama

You can download the installer directly from the Ollama website.
Alternatively, use PowerShell to download and run the installer:
# Download the Windows installer and launch it
$url = 'https://ollama.com/download/OllamaSetup.exe'
# Save to the current user's Downloads folder (C:\Downloads may not exist)
$outputPath = "$env:USERPROFILE\Downloads\OllamaSetup.exe"
Invoke-WebRequest -Uri $url -OutFile $outputPath
& $outputPath

2. Install Ollama

After downloading, double-click OllamaSetup.exe and follow the on-screen instructions to complete the installation.

3. Download Granite Models

Ollama supports a range of IBM Granite models. Larger models generally produce better results but require more memory and compute. From a terminal, download Granite 4 with:
ollama pull granite4

4. Run Granite

To start chatting with Granite:
ollama run granite4
To use a different variant, replace granite4 with the tag of the model you pulled.

5. Notes on Context Length

By default, Ollama runs models with a short context length to save memory.
For longer conversations, you can increase it from within an interactive ollama run session:
/set parameter num_ctx <desired_context_length>
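The context length can also be set per request through Ollama's native REST API (POST to http://localhost:11434/api/chat) by passing num_ctx in the options object. A sketch of such a request body; the prompt text and the value 8192 are illustrative:

```json
{
  "model": "granite4",
  "messages": [{"role": "user", "content": "Summarize our conversation so far."}],
  "options": { "num_ctx": 8192 }
}
```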

6. Using the API

You can also interact with Granite programmatically using Ollama’s OpenAI-compatible API:
curl.exe -X POST http://localhost:11434/v1/chat/completions `
  -H "Content-Type: application/json" `
  -d '{
        "model": "granite4",
        "messages": [{"role": "user", "content": "How are you today?"}]
      }'
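The endpoint returns a standard OpenAI-style chat-completions response. A minimal sketch of pulling the reply text out of such a response with Python's standard-library JSON parser; the sample payload below is abridged and illustrative, not actual server output:

```shell
# Abridged example of the response shape (a single choice with a message)
response='{"choices":[{"message":{"role":"assistant","content":"Hello!"}}]}'

# Extract the assistant's reply text
echo "$response" | python3 -c 'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
```

In a real script you would pipe the output of the curl call above into the same parser instead of the hard-coded sample.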