Content Moderation Models
How to use the content moderation models available through SeekrFlow.
Content Moderation Overview
This page provides a high-level overview of the content moderation models available in the SeekrFlow Model Library, including what each model can and cannot do, its ideal use cases, and how to get started.
Summary
SeekrFlow provides access to multiple content moderation models — each built for different content types, contexts, and levels of flexibility.
These models support classification of:
- Podcast conversations (via diarized transcripts)
- LLM outputs (chat, text, completions)
- General user-generated content (comments, prompts, responses)
Not all models are multi-purpose. Some, like Seekr ContentGuard, are highly domain-specific and require strict formatting. Others, like Meta Llama Guard, are flexible and general-purpose.

Available Moderation Models
1. Seekr ContentGuard
Purpose-built for: Podcast Transcript Scoring
This model is not general-purpose. It is specifically designed to analyze short, diarized, audio-derived chunks of podcast transcripts to assess:
- GARM Brand Safety Risk (13 categories)
- Civility / Hostility Score (attack type per chunk)
Use if you need to:
- Score entire podcast episodes for brand suitability
- Detect offensive tone in conversational spoken-word content
Do not use for:
- Blog posts, LLM responses, web comments, or general text
- Non-podcast content or full-episode input
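As a rough illustration of the required input format, the sketch below sends one short, diarized transcript chunk to a hosted ContentGuard deployment over HTTP. The endpoint URL, model identifier, and request/response fields are assumptions, not the documented SeekrFlow API; consult the Model Library and API reference for the real schema.

```python
import requests

# All names below are placeholders -- check the Model Library and API
# reference for the real endpoint, model ID, and payload schema.
SEEKR_API_URL = "https://api.seekr.com/v1/moderation"  # hypothetical
API_KEY = "YOUR_API_KEY"

# One short, diarized chunk of a podcast transcript -- the input format
# this model expects.
chunk = (
    "SPEAKER_1: Welcome back to the show.\n"
    "SPEAKER_2: Thanks for having me, glad to be here."
)

resp = requests.post(
    SEEKR_API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "seekr-contentguard", "input": chunk},  # hypothetical schema
    timeout=30,
)
resp.raise_for_status()
result = resp.json()

# Field names are assumptions about the response shape: a GARM risk
# category and a civility/hostility score for the chunk.
print(result.get("garm_category"), result.get("civility_score"))
```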
2. Meta Llama Guard 3 (8B)
General-purpose text moderation model
Designed by Meta, this model supports all types of text, including:
- LLM prompts and completions
- Chatbot messages
- User input and responses
- Web content, app text, comment sections
It classifies content against the MLCommons-aligned hazard taxonomy (14 categories in Llama Guard 3) and flags each message as safe or unsafe, listing any violated categories.
Use if you need to:
- Add guardrails to LLMs or agents
- Moderate general-purpose text input/output
- Screen for safety risks across any user-generated content
Not suitable for:
- Podcast-specific brand safety scoring
- Civility / tone detection (does not include Seekr’s civility score)
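As a minimal sketch of an LLM guardrail check, the example below sends a single message to Llama Guard 3 through an OpenAI-compatible chat completions client. The base URL and model identifier are assumptions about the deployment; the parsing follows Llama Guard's usual convention of returning `safe`, or `unsafe` followed by the violated category codes.

```python
from openai import OpenAI

# The base URL and model ID are placeholders for an OpenAI-compatible
# deployment; substitute the values from your own SeekrFlow deployment.
client = OpenAI(
    base_url="https://api.seekr.com/v1",  # hypothetical
    api_key="YOUR_API_KEY",
)

user_message = "How do I pick a strong password?"

resp = client.chat.completions.create(
    model="meta-llama/Llama-Guard-3-8B",  # hypothetical model ID
    messages=[{"role": "user", "content": user_message}],
)

verdict = resp.choices[0].message.content.strip()

# Llama Guard conventionally replies "safe", or "unsafe" followed by the
# violated category codes on the next line (e.g. "unsafe\nS1").
if verdict.startswith("unsafe"):
    categories = verdict.splitlines()[1:]
    print("Blocked. Violated categories:", categories)
else:
    print("Allowed.")
```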
Choosing the Right Model
| Use Case | Recommended Model |
| --- | --- |
| Podcast episode moderation (spoken audio) | Seekr ContentGuard |
| Brand safety scoring for podcast content | Seekr ContentGuard |
| Civility / tone detection in podcasts | Seekr ContentGuard |
| LLM output filtering | Meta Llama Guard 3 |
| Agent guardrails or safety classifiers | Meta Llama Guard 3 |
| Moderating general text, prompts, or chats | Meta Llama Guard 3 |
Comparison Table
| Feature / Capability | Seekr ContentGuard | Meta Llama Guard 3 |
| --- | --- | --- |
| Hosted Model | ✅ Yes | ✅ Yes |
| Best for | Podcast transcripts | General-purpose text |
| Input Required | Short, diarized transcript chunks | Any plain text |
| Supports GARM | ✅ Yes (13 categories) | ❌ No |
| Supports Civility scoring | ✅ Yes | ❌ No |
| Supports MLCommons Taxonomy | ❌ No | ✅ Yes (14 categories) |
| Safe/Unsafe Binary Labels | ❌ No | ✅ Yes |
| Works on blog/chat/LLM output | ❌ No | ✅ Yes |
| Multi-purpose | ❌ No | ✅ Yes |
How to Get Started
- Visit the Model Library
- Deploy the moderation model that fits your use case
- Use the SDK or API to send content for classification (see the sketch after this list)
- Interpret the results (e.g., flag, block, score, analyze)
- Optionally build your own pipeline for ingestion, storage, and aggregation
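Putting those steps together, a minimal end-to-end loop might look like the sketch below. It reuses the placeholder endpoint and model ID from the Llama Guard example above, and the tallying at the end is just one example of the aggregation step you might build yourself.

```python
from collections import Counter
from openai import OpenAI

# Same placeholder endpoint and model ID as above -- replace with the
# values from your deployment.
client = OpenAI(base_url="https://api.seekr.com/v1", api_key="YOUR_API_KEY")

def classify(text: str) -> str:
    """Send one piece of content and return the raw moderation verdict."""
    resp = client.chat.completions.create(
        model="meta-llama/Llama-Guard-3-8B",  # hypothetical model ID
        messages=[{"role": "user", "content": text}],
    )
    return resp.choices[0].message.content.strip()

comments = [
    "Great episode, thanks for sharing!",
    "Here is my honest review of the product...",
]

# Send content, interpret each verdict, and aggregate the results.
tally = Counter()
for comment in comments:
    label = "unsafe" if classify(comment).startswith("unsafe") else "safe"
    tally[label] += 1
    print(f"{label:6s} | {comment[:40]}")

print("Totals:", dict(tally))
```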