Content Moderation Models

This page explains how to use SeekrFlow's content moderation models.

Content Moderation Overview

This page provides a high-level overview of all content moderation models available in the SeekrFlow Model Library, including what each model can and cannot do, their ideal use cases, and how to get started.


Summary

SeekrFlow provides access to multiple content moderation models — each built for different content types, contexts, and levels of flexibility.

These models support classification of:

  • Podcast conversations (via diarized transcripts)
  • LLM outputs (chat, text, completions)
  • General user-generated content (comments, prompts, responses)

Not all models are multi-purpose. Some, like Seekr ContentGuard, are highly domain-specific and require strictly formatted input; others, like Meta Llama Guard, are flexible and general-purpose.


Available Moderation Models

1. Seekr ContentGuard

Purpose-built for: Podcast Transcript Scoring
This model is not general-purpose. It is specifically designed to analyze short, diarized, audio-derived chunks of podcast transcripts to assess:

  • GARM Brand Safety Risk (13 categories)
  • Civility / Hostility Score (attack type per chunk)

Use if you need to:

  • Score entire podcast episodes for brand suitability
  • Detect offensive tone in conversational spoken-word content

Do not use for:

  • Blog posts, LLM responses, web comments, or general text
  • Non-podcast content or full-episode input
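
For orientation, here is a minimal sketch of sending one diarized transcript chunk to a deployed ContentGuard model. It assumes the deployment exposes an OpenAI-compatible chat completions endpoint; the base URL, model ID, and response fields shown are illustrative assumptions, and the full guide documents the actual request and response format.

```python
# Minimal sketch: score one short, diarized transcript chunk with a deployed
# Seekr ContentGuard model. The base URL, model ID, and response fields are
# illustrative assumptions -- see the full ContentGuard guide for the
# authoritative request and response format.
import os

import requests

API_BASE = "https://flow.seekr.com/v1"        # assumed OpenAI-compatible endpoint
API_KEY = os.environ["SEEKRFLOW_API_KEY"]     # assumed environment variable

# A short, diarized chunk of a podcast transcript (speaker-labeled turns).
chunk = (
    "SPEAKER_00: Welcome back to the show. Today we're talking about online scams.\n"
    "SPEAKER_01: Right, and some of the tactics these groups use are genuinely hostile."
)

response = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "seekr-contentguard",        # hypothetical deployed model ID
        "messages": [{"role": "user", "content": chunk}],
    },
    timeout=30,
)
response.raise_for_status()

# Assumed shape: the completion text carries the GARM risk category and
# civility assessment for this chunk.
print(response.json()["choices"][0]["message"]["content"])
```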

Read Full Guide →


2. Meta Llama Guard 3 (8B)

General-purpose text moderation model
Designed by Meta, this model supports all types of text, including:

  • LLM prompts and completions
  • Chatbot messages
  • User input and responses
  • Web content, app text, comment sections

It classifies content against the MLCommons 22-category taxonomy, labeling each message as safe or unsafe and listing any violated categories.
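
As a rough illustration, the sketch below sends a short prompt/response exchange to a deployed Llama Guard 3 model and reads back the verdict. It assumes the deployment exposes an OpenAI-compatible chat completions endpoint and that the model replies in the usual Llama Guard style ("safe", or "unsafe" followed by category codes); the base URL and model ID are placeholders, and the full guide documents the actual interface.

```python
# Minimal sketch: moderate an LLM exchange with a deployed Meta Llama Guard 3
# model via an assumed OpenAI-compatible endpoint. The base URL and model ID
# are placeholders; the verdict parsing assumes the usual Llama Guard output
# ("safe", or "unsafe" followed by a line of category codes).
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://flow.seekr.com/v1",       # assumed endpoint
    api_key=os.environ["SEEKRFLOW_API_KEY"],    # assumed environment variable
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-Guard-3-8B",        # hypothetical deployed model ID
    messages=[
        {"role": "user", "content": "How do I pick a strong password?"},
        {"role": "assistant", "content": "Use a long passphrase and a password manager."},
    ],
)

verdict_lines = completion.choices[0].message.content.strip().splitlines()
is_safe = verdict_lines[0].lower() == "safe"
violated = verdict_lines[1].split(",") if not is_safe and len(verdict_lines) > 1 else []
print(is_safe, violated)
```

In an agent or chatbot pipeline, a check like this would typically run on the user prompt before generation and again on the model response before it is returned.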

Use if you need to:

  • Add guardrails to LLMs or agents
  • Moderate general-purpose text input/output
  • Screen for safety risks across any user-generated content

Not suitable for:

  • Podcast-specific brand safety scoring
  • Civility / tone detection (does not include Seekr’s civility score)

Read Full Guide →


Choosing the Right Model

Use Case                                    | Recommended Model
Podcast episode moderation (spoken audio)   | Seekr ContentGuard
Brand safety scoring for podcast content    | Seekr ContentGuard
Civility / tone detection in podcasts       | Seekr ContentGuard
LLM output filtering                        | Meta Llama Guard 3
Agent guardrails or safety classifiers      | Meta Llama Guard 3
Moderating general text, prompts, or chats  | Meta Llama Guard 3

Comparison Table

Feature / Capability            | Seekr ContentGuard                | Meta Llama Guard 3
Hosted Model                    | ✅ Yes                            | ✅ Yes
Best for                        | Podcast transcripts               | General-purpose text
Input Required                  | Short, diarized transcript chunks | Any plain text
Supports GARM                   | ✅ Yes (13 categories)            | ❌ No
Supports Civility scoring       | ✅ Yes                            | ❌ No
Supports MLCommons Taxonomy     | ❌ No                             | ✅ Yes (22 categories)
Safe/Unsafe Binary Labels       | ❌ No                             | ✅ Yes
Works on blog/chat/LLM output   | ❌ No                             | ✅ Yes
Multi-purpose                   | ❌ No                             | ✅ Yes

How to Get Started

  1. Visit the Model Library
  2. Deploy the moderation model that fits your use case
  3. Use the SDK or API to send content for classification
  4. Interpret the results (e.g., flag, block, score, analyze)
  5. Optionally build your own pipeline for ingestion, storage, and aggregation
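
As an illustration of step 4, the sketch below maps a Llama Guard-style verdict onto a simple allow / flag / block decision. The category codes chosen for hard blocking are assumptions you would tune for your own pipeline, not SeekrFlow defaults.

```python
# Illustrative sketch of step 4: turn a Llama Guard-style verdict into an
# allow / flag / block decision. The category codes chosen for hard blocking
# are assumptions for your own pipeline, not SeekrFlow defaults.

BLOCK_CATEGORIES = {"S1", "S3", "S4"}  # example: categories your policy hard-blocks


def decide(verdict: str) -> str:
    """Map a raw moderation reply to an action: allow, flag, or block."""
    lines = verdict.strip().splitlines()
    if not lines or lines[0].lower() == "safe":
        return "allow"
    categories = set(lines[1].replace(" ", "").split(",")) if len(lines) > 1 else set()
    return "block" if categories & BLOCK_CATEGORIES else "flag"


print(decide("safe"))        # allow
print(decide("unsafe\nS1"))  # block
print(decide("unsafe\nS9"))  # flag
```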

Learn More