IBM watsonx
IBM watsonx.ai is an enterprise-grade AI studio that gives AI builders a complete developer toolkit of APIs, tools, models, and runtimes to support the rapid adoption of AI use cases, from data through deployment, using pro-code and low-code tools. On watsonx.ai you can experience the new Granite dense models, as well as the Granite Guardian models, in a secure, reliable environment for your use cases. Try watsonx.ai for free today.
Supported models can be found here.
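As a brief sketch with the ibm-watsonx-ai Python SDK, assuming an API key and project ID are in hand (the model ID below is illustrative; check the watsonx.ai catalog for the exact identifier):

from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Assumptions: a valid watsonx.ai API key and project ID, and a model
# ID taken from the watsonx.ai catalog (the one below is illustrative).
model = ModelInference(
    model_id="ibm/granite-3-8b-instruct",
    credentials=Credentials(
        url="https://us-south.ml.cloud.ibm.com",
        api_key="YOUR_API_KEY",
    ),
    project_id="YOUR_PROJECT_ID",
)

print(model.generate_text(prompt="What makes Granite models enterprise-ready?"))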
Azure AI Foundry
Azure AI Foundry is a flexible, secure, enterprise-grade AI platform that empowers enterprises, start-ups, and software development companies to bring AI apps and agents to production fast. Access IBM’s Granite models today with the developer tools you already know, like GitHub, Visual Studio, and Copilot Studio, to build powerful solutions that can optimize call center performance, analyze data, improve product discovery, validate code, automate workflows, and more.
Supported models can be found here.
Ollama
Ollama lets users run large language models
directly on their devices, with flexible deployment options across cloud
providers. IBM and Ollama are partnering to bring IBM’s Granite models to users
and organizations. Whether on x86 CPUs, GPUs from AMD, Intel, and NVIDIA, ARM
devices from Qualcomm, Apple M-series MacBooks, IBM Cloud, other major cloud
platforms, or even a Raspberry Pi, Ollama makes Granite models accessible to the
devices that users and organizations depend on.
Supported models can be found here.
By default, Ollama runs models with a short context length to avoid preallocating excessive RAM. This can cause long requests to be truncated. To override this in the API, add the following to the request body: "options": {"num_ctx": <desired_context_length>}. The largest <desired_context_length> supported by Granite 3.1 models is 131072 (128k).
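For example, a minimal sketch of such a request against a local Ollama server (the model tag is an assumption; check the Ollama library for the exact name):

import requests

# Assumptions: Ollama is running locally on its default port (11434)
# and a Granite model has been pulled; the tag below is illustrative.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "granite3.1-dense:8b",  # hypothetical tag
        "prompt": "Summarize the Granite model family in two sentences.",
        "stream": False,
        # Raise the context window so long requests are not truncated;
        # 131072 (128k) is the maximum for Granite 3.1 models.
        "options": {"num_ctx": 131072},
    },
)
print(resp.json()["response"])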
Replicate
Replicate makes machine learning easy. With one line of code, you can run AI models for tasks like image generation, text processing, and audio creation. We host thousands of models and handle the infrastructure, so you can focus on building cool stuff. You pay only for the compute time you use. Our goal is simple: make AI accessible to everyone.
Supported models can be found here.
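As a sketch with Replicate's Python client, assuming a REPLICATE_API_TOKEN is set in the environment and that the model slug below matches Replicate's listing (verify the exact identifier):

import replicate

# Assumption: the model slug below exists on Replicate; check the
# ibm-granite page on Replicate for the exact identifier.
output = replicate.run(
    "ibm-granite/granite-3.1-8b-instruct",  # hypothetical slug
    input={"prompt": "Write a haiku about container ships."},
)
# Granite text models on Replicate yield output as chunks of text.
print("".join(output))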
LM Studio
LM Studio is a beginner-friendly desktop application for
running Large Language Models (LLMs) and Embedding models locally on Mac,
Windows, and Linux. It provides both a familiar chat interface and a built-in
local server for using models programmatically in your code. The app features
easy model browsing and downloading from Hugging Face, plus the ability to chat
with PDF and Word documents.
Supported models can be found here.
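As an illustrative sketch, the built-in local server speaks an OpenAI-compatible protocol (on port 1234 by default), so a downloaded Granite model can be called from code with any OpenAI client; the model identifier below is an assumption:

from openai import OpenAI

# Assumptions: LM Studio's local server is running (default port 1234)
# and a Granite model has been downloaded in the app.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="ibm-granite/granite-3.1-8b-instruct-GGUF",  # hypothetical identifier
    messages=[{"role": "user", "content": "Explain embeddings in one sentence."}],
)
print(response.choices[0].message.content)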
Docker Hub
Docker Hub is the world’s largest container registry for storing, managing, and sharing images. You can pull IBM’s Granite models directly from Docker Hub and run them locally with Docker Model Runner, with the same simplicity and portability you expect from running Docker containers.
Supported models can be found here.
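A rough sketch, assuming Docker Model Runner's OpenAI-compatible endpoint is enabled with TCP host access on its default port and that a Granite model has been pulled; the port, path, and model name below are assumptions to verify against the Docker documentation:

from openai import OpenAI

# Assumptions: Docker Model Runner is enabled with TCP host access on
# its default port (12434), and a Granite model has been pulled with
# `docker model pull`. Port, path, and model name are assumptions.
client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="docker")

response = client.chat.completions.create(
    model="ai/granite-3.1-8b-instruct",  # hypothetical model name
    messages=[{"role": "user", "content": "What is a container image?"}],
)
print(response.choices[0].message.content)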
Qualcomm
Qualcomm AI Hub is a developer-centric platform designed to streamline on-device AI model development and deployment, empowering developers to test and validate model performance. It automates model compilation and inference testing on real devices. The platform also provides a library of over 100 pre-optimized models for Qualcomm and Snapdragon platforms, ensuring superior performance and reduced memory usage.
Supported models can be found here.
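As a hedged sketch of that compile-and-profile workflow with the qai-hub Python client (the device name is illustrative, and a tiny stand-in network replaces a real model):

import torch
import qai_hub as hub

# A tiny stand-in model; in practice this would be your own network.
model = torch.jit.trace(torch.nn.Linear(16, 4), torch.randn(1, 16))

# Assumptions: a configured AI Hub API token and a valid device name;
# see the AI Hub docs for the supported device list.
compile_job = hub.submit_compile_job(
    model,
    device=hub.Device("Samsung Galaxy S24"),  # hypothetical device name
    input_specs={"x": (1, 16)},
)

# Profile the compiled model on a real hosted device.
profile_job = hub.submit_profile_job(
    compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S24"),
)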
Dell
Dell Pro AI Studio (DPAIS) is a complimentary toolkit available with the purchase of a Dell PC. Users can also experience the power of IBM’s Granite 4 models, now available on the Dell Enterprise Hub on Hugging Face. Seamlessly deploy cutting-edge AI with Dell’s validated infrastructure, unlocking long-context efficiency, memory optimization, and enterprise-grade innovation, all tailored to your business needs. Start building AI applications with IBM Granite models faster on Dell Enterprise Hub, and enable your IT administrators to manage your AI more easily with DPAIS.
Supported models can be found here.
AWS
AWS customers can discover and subscribe to IBM Granite models through several deployment options, depending on their preference. Access IBM Granite models from the Amazon Bedrock Marketplace and deploy them on managed endpoints through the Amazon Bedrock console. Through AWS Marketplace, organizations can subscribe to and deploy IBM Granite models on Amazon SageMaker AI, maintaining control over their data and compute resources. Data scientists can also access IBM Granite models via Amazon SageMaker JumpStart for easy deployment within SageMaker Studio.
Supported models can be found here.
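For instance, a minimal JumpStart deployment sketch with the SageMaker Python SDK; the model ID is hypothetical, so browse SageMaker JumpStart for the exact identifier:

from sagemaker.jumpstart.model import JumpStartModel

# Assumptions: an AWS account with SageMaker permissions and a valid
# JumpStart model ID for Granite; the ID below is hypothetical.
model = JumpStartModel(model_id="huggingface-llm-granite-3-1-8b-instruct")
predictor = model.deploy()  # provisions a managed SageMaker endpoint

# Payload format varies by model; see the model's JumpStart card.
response = predictor.predict({"inputs": "What is retrieval-augmented generation?"})
print(response)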
NVIDIA NIM
IBM has partnered with NVIDIA to offer the Granite models on NVIDIA NIM, a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI model inferencing across clouds, data centers, and workstations. You can experience these models as NVIDIA-hosted APIs using free NVIDIA cloud credits from ai.nvidia.com.
Supported models can be found here.
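As a sketch, the hosted endpoints are OpenAI-compatible, so a Granite NIM can be called as follows (the model ID is an assumption; check the NVIDIA API catalog for the exact identifier):

from openai import OpenAI

# Assumption: an API key generated from the NVIDIA API catalog.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="YOUR_NVIDIA_API_KEY",
)

response = client.chat.completions.create(
    model="ibm/granite-3.1-8b-instruct",  # hypothetical model ID
    messages=[{"role": "user", "content": "Give one use case for NIM microservices."}],
)
print(response.choices[0].message.content)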