IBM watsonx
IBM watsonx.ai is an enterprise-grade AI studio that gives AI builders a complete developer toolkit of APIs, tools, models, and runtimes to support the rapid adoption of AI use cases, from data through deployment, using pro-code and low-code tools. On watsonx.ai you can experience the new Granite dense models, as well as the Granite Guardian models, in a secure, reliable environment for your use cases. Try watsonx.ai for free today.
Supported models can be found here.
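As a brief sketch with the ibm-watsonx-ai Python SDK, assuming an API key and project ID are in hand (the model ID below is illustrative; check the watsonx.ai catalog for the exact identifier):

from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Assumptions: a valid watsonx.ai API key and project ID, and a model
# ID taken from the watsonx.ai catalog (the one below is illustrative).
model = ModelInference(
    model_id="ibm/granite-3-8b-instruct",
    credentials=Credentials(
        url="https://us-south.ml.cloud.ibm.com",
        api_key="YOUR_API_KEY",
    ),
    project_id="YOUR_PROJECT_ID",
)

print(model.generate_text(prompt="What makes Granite models enterprise-ready?"))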
Azure AI Foundry
Azure AI Foundry is a flexible, secure, enterprise-grade AI platform that empowers enterprises, start-ups, and software development companies to bring AI apps and agents to production fast. Access IBM’s Granite models today with the developer tools you already know, like GitHub, Visual Studio, and Copilot Studio, to build powerful solutions that can optimize call center performance, analyze data, improve product discovery, validate code, automate workflows, and more.
Supported models can be found here.
Ollama
Ollama lets users run large language models
directly on their devices, with flexible deployment options across cloud
providers. IBM and Ollama are partnering to bring IBM’s Granite models to users
and organizations. Whether on x86 CPUs, GPUs from AMD, Intel, and NVIDIA, ARM
devices from Qualcomm, Apple M-series MacBooks, IBM Cloud, other major cloud
platforms, or even a Raspberry Pi, Ollama makes Granite models accessible to the
devices that users and organizations depend on.
Supported models can be found here.
By default, Ollama runs models with a short context length to avoid preallocating excessive RAM. This can cause long requests to be truncated. To override this in the API, add the following to the request body: "options": {"num_ctx": <desired_context_length>}. The largest <desired_context_length> supported by Granite 3.1 models is 131072 (128k).
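For example, a minimal sketch of such a request against a local Ollama server (the model tag is an assumption; check the Ollama library for the exact name):

import requests

# Assumptions: Ollama is running locally on its default port (11434)
# and a Granite model has been pulled; the tag below is illustrative.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "granite3.1-dense:8b",  # hypothetical tag
        "prompt": "Summarize the Granite model family in two sentences.",
        "stream": False,
        # Raise the context window so long requests are not truncated;
        # 131072 (128k) is the maximum for Granite 3.1 models.
        "options": {"num_ctx": 131072},
    },
)
print(resp.json()["response"])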
Replicate
Replicate makes machine learning easy. With one line of code, you can run AI models for tasks like image generation, text processing, and audio creation. We host thousands of models and handle the infrastructure, so you can focus on building cool stuff. You pay only for the compute time you use. Our goal is simple: make AI accessible to everyone.
Supported models can be found here.
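As a sketch with Replicate's Python client, assuming a REPLICATE_API_TOKEN is set in the environment and that the model slug below matches Replicate's listing (verify the exact identifier):

import replicate

# Assumption: the model slug below exists on Replicate; check the
# ibm-granite page on Replicate for the exact identifier.
output = replicate.run(
    "ibm-granite/granite-3.1-8b-instruct",  # hypothetical slug
    input={"prompt": "Write a haiku about container ships."},
)
# Granite text models on Replicate yield output as chunks of text.
print("".join(output))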
LM Studio
LM Studio is a beginner-friendly desktop application for
running Large Language Models (LLMs) and Embedding models locally on Mac,
Windows, and Linux. It provides both a familiar chat interface and a built-in
local server for using models programmatically in your code. The app features
easy model browsing and downloading from Hugging Face, plus the ability to chat
with PDF and Word documents.
Supported models can be found here.
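As an illustrative sketch, the built-in local server speaks an OpenAI-compatible protocol (on port 1234 by default), so a downloaded Granite model can be called from code with any OpenAI client; the model identifier below is an assumption:

from openai import OpenAI

# Assumptions: LM Studio's local server is running (default port 1234)
# and a Granite model has been downloaded in the app.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="ibm-granite/granite-3.1-8b-instruct-GGUF",  # hypothetical identifier
    messages=[{"role": "user", "content": "Explain embeddings in one sentence."}],
)
print(response.choices[0].message.content)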
Docker Hub
Docker Hub is the world’s largest container registry for storing, managing, and sharing images. You can pull IBM’s Granite models directly from Docker Hub and run them locally with Docker Model Runner, with the same simplicity and portability you expect from running Docker containers.
Supported models can be found here.
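A rough sketch, assuming Docker Model Runner's OpenAI-compatible endpoint is enabled with TCP host access on its default port and that a Granite model has been pulled; the port, path, and model name below are assumptions to verify against the Docker documentation:

from openai import OpenAI

# Assumptions: Docker Model Runner is enabled with TCP host access on
# its default port (12434), and a Granite model has been pulled with
# `docker model pull`. Port, path, and model name are assumptions.
client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="docker")

response = client.chat.completions.create(
    model="ai/granite-3.1-8b-instruct",  # hypothetical model name
    messages=[{"role": "user", "content": "What is a container image?"}],
)
print(response.choices[0].message.content)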
Qualcomm
Qualcomm AI Hub is a developer-centric platform designed to streamline on-device AI model development and deployment, empowering developers to test and validate model performance. It automates model compilation and inference testing on real devices. The platform also provides a library of over 100 pre-optimized models for Qualcomm and Snapdragon platforms, ensuring superior performance and reduced memory usage.
Supported models can be found here.
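As a hedged sketch of that compile-and-profile workflow with the qai-hub Python client (the device name is illustrative, and a tiny stand-in network replaces a real model):

import torch
import qai_hub as hub

# A tiny stand-in model; in practice this would be your own network.
model = torch.jit.trace(torch.nn.Linear(16, 4), torch.randn(1, 16))

# Assumptions: a configured AI Hub API token and a valid device name;
# see the AI Hub docs for the supported device list.
compile_job = hub.submit_compile_job(
    model,
    device=hub.Device("Samsung Galaxy S24"),  # hypothetical device name
    input_specs={"x": (1, 16)},
)

# Profile the compiled model on a real hosted device.
profile_job = hub.submit_profile_job(
    compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S24"),
)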
Dell
Dell Pro AI Studio (DPAIS) is a complimentary toolkit available with the purchase of a Dell PC. Users can also experience the power of IBM’s Granite 4 models, now available on the Dell Enterprise Hub on Hugging Face. Seamlessly deploy cutting-edge AI with Dell’s validated infrastructure, unlocking long-context efficiency, memory optimization, and enterprise-grade innovation, all tailored to your business needs. Start building AI applications with IBM Granite models faster on Dell Enterprise Hub, and enable your IT administrators to manage your AI more easily with DPAIS.
Supported models can be found here.
AWS
AWS customers can discover and subscribe to IBM Granite models through several deployment options, depending on their preference. Access IBM Granite models from the Amazon Bedrock Marketplace and deploy them on managed endpoints through the Amazon Bedrock console. Through AWS Marketplace, organizations can subscribe to and deploy IBM Granite models on Amazon SageMaker AI, maintaining control over their data and compute resources. Data scientists can also access IBM Granite models via Amazon SageMaker JumpStart for easy deployment within SageMaker Studio.
Supported models can be found here.
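For instance, a minimal JumpStart deployment sketch with the SageMaker Python SDK; the model ID is hypothetical, so browse SageMaker JumpStart for the exact identifier:

from sagemaker.jumpstart.model import JumpStartModel

# Assumptions: an AWS account with SageMaker permissions and a valid
# JumpStart model ID for Granite; the ID below is hypothetical.
model = JumpStartModel(model_id="huggingface-llm-granite-3-1-8b-instruct")
predictor = model.deploy()  # provisions a managed SageMaker endpoint

# Payload format varies by model; see the model's JumpStart card.
response = predictor.predict({"inputs": "What is retrieval-augmented generation?"})
print(response)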
NVIDIA NIM
IBM has partnered with NVIDIA to offer the Granite models on NVIDIA NIM, a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI model inferencing across clouds, data centers, and workstations. You can experience these models as NVIDIA-hosted APIs using free NVIDIA cloud credits from ai.nvidia.com.
Supported models can be found here.
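As a sketch, the hosted endpoints are OpenAI-compatible, so a Granite NIM can be called as follows (the model ID is an assumption; check the NVIDIA API catalog for the exact identifier):

from openai import OpenAI

# Assumption: an API key generated from the NVIDIA API catalog.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="YOUR_NVIDIA_API_KEY",
)

response = client.chat.completions.create(
    model="ibm/granite-3.1-8b-instruct",  # hypothetical model ID
    messages=[{"role": "user", "content": "Give one use case for NIM microservices."}],
)
print(response.choices[0].message.content)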