This recipe showcases a Granite Guardian model that detects hate, abuse, and profanity (HAP) in either a prompt or an LLM's output. This is an example of a "guard rail" used for safety in generative AI applications. The model used in this recipe has been fine-tuned on several English HAP benchmarks and is built on the slate.38m.english.distilled base model. You will need a Hugging Face token to run this recipe in Colab. Instructions for obtaining this credential can be found here.
Note: HAP detection examples may contain profanities.
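The guard-rail pattern described above can be sketched as a wrapper that screens both the prompt and the model's response before anything is returned. This is a minimal illustration only: `hap_score` is a stand-in for a call to the real Granite Guardian HAP classifier (for example via a Hugging Face text-classification pipeline), and all function names and the 0.5 threshold are assumptions for demonstration, not part of this recipe.

```python
def guarded_generate(prompt, generate, hap_score, threshold=0.5):
    """Run generate(prompt) only when both the prompt and the output
    score below the HAP threshold; otherwise return a refusal message.

    hap_score: callable mapping text -> HAP probability in [0, 1]
               (a stand-in for the Granite Guardian model call).
    generate:  callable mapping prompt -> LLM output text.
    """
    if hap_score(prompt) >= threshold:
        return "[blocked: prompt flagged as HAP]"
    output = generate(prompt)
    if hap_score(output) >= threshold:
        return "[blocked: response flagged as HAP]"
    return output


# Toy stand-ins so the sketch runs without any model download.
def toy_hap_score(text):
    # Pretend "badword" is the only HAP content.
    return 0.9 if "badword" in text else 0.1


def toy_generate(prompt):
    return f"echo: {prompt}"


print(guarded_generate("hello", toy_generate, toy_hap_score))
print(guarded_generate("badword", toy_generate, toy_hap_score))
```

In the actual recipe the toy scorer would be replaced by the fine-tuned Granite Guardian HAP model, and the same screening logic applies on both the input and output sides of the LLM call.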

Get started

Explore sample code in a GitHub repo

Try it out

Execute sample code in Colab