# AI-ready data

> Generate structured training datasets from raw content for model fine-tuning.

AI-ready data generates and transforms datasets, converting raw content into training-ready formats. These jobs produce structured outputs optimized for model fine-tuning and alignment.

## How it works

The AI-ready data pipeline transforms uploaded files into structured training datasets:

1. **File selection** – Choose source files from storage to use as training material
2. **Dataset type selection** – Specify the type of dataset to generate based on the fine-tuning method
3. **Generation** – The system processes files and creates structured question-and-answer pairs or training examples
4. **Output** – Generated datasets are saved in formats compatible with fine-tuning workflows

## Integration with fine-tuning

AI-ready datasets feed directly into fine-tuning workflows:

* **Instruction fine-tuning** – Standard instruction datasets train models on task-specific examples
* **Context-grounded fine-tuning** – Context-grounded datasets train models to use retrieval effectively
* **Dataset quality** – Higher-quality source content and generation produce better fine-tuned models

The data engine automates the transition from raw files to training-ready datasets, reducing manual dataset preparation effort.

## Datasets for fine-tuning

AI-ready data supports multiple dataset formats aligned to fine-tuning methods:

### Instruction fine-tuning

Standard instruction datasets consist of traditional question-and-answer pairs aligned to task-specific instructions. Each example demonstrates how the model should respond to particular queries or prompts.

**Structure:**

* **Input** – The question, prompt, or instruction
* **Output** – The expected response or completion

**Use cases:**

* Teaching domain-specific knowledge
* Customizing response style and tone
* Training task-specific behaviors

These datasets are used with instruction fine-tuning to embed knowledge directly into model parameters.

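
As a rough illustration of the Input/Output structure described for instruction datasets, the sketch below builds two pairs, drops the incomplete one, and serializes the rest as JSON Lines. The field names `input` and `output` simply mirror the structure labels above, and the JSON Lines layout, the `validate` helper, and the sample text are all assumptions made for illustration, not the format the platform is documented to emit.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InstructionExample:
    """One training pair; field names mirror the Input/Output labels (assumed schema)."""
    input: str   # the question, prompt, or instruction
    output: str  # the expected response or completion


def validate(example):
    """Hypothetical minimal check: reject pairs with an empty input or output."""
    return bool(example.input.strip()) and bool(example.output.strip())


def to_jsonl(examples):
    """Serialize examples as one JSON object per line (assumed export format)."""
    return "\n".join(json.dumps(asdict(ex), ensure_ascii=False) for ex in examples)


# Made-up sample pairs; the second one is incomplete on purpose.
examples = [
    InstructionExample(
        input="What does the warranty cover?",
        output="The warranty covers manufacturing defects for 12 months.",
    ),
    InstructionExample(input="Summarize the return policy.", output=""),
]

valid = [ex for ex in examples if validate(ex)]
jsonl = to_jsonl(valid)
print(jsonl)
```

Keeping each pair as a flat record makes it easy to review generated examples individually before they are used for training.
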
### Context-grounded fine-tuning

Context-grounded datasets consist of question-and-answer pairs that reference source documents. Each example includes the query, relevant context from source files, and the correct response grounded in that context.

**Structure:**

* **Query** – The question or prompt
* **Context** – Relevant excerpts from source documents
* **Response** – Answer derived from the provided context

**Use cases:**

* Training models to use external knowledge bases effectively
* Teaching retrieval-aware response generation
* Building models that cite sources and stay grounded in provided information

These datasets are used with context-grounded fine-tuning to train models for retrieval-augmented generation workflows.

## Generation parameters

Dataset generation can be configured with parameters that control output characteristics:

* **Number of examples** – How many training pairs to generate from source content
* **Diversity settings** – Controls for question variety and coverage across source material
* **Quality filters** – Criteria for ensuring generated examples meet minimum standards

## Dataset quality

Generated datasets are optimized for training effectiveness:

* **Relevance** – Questions and answers are derived from actual source content
* **Consistency** – Output format matches fine-tuning requirements
* **Coverage** – Examples span the breadth of source material
* **Validation** – Generated datasets can be reviewed before use in training

## Dataset management

AI-ready datasets are managed alongside other data engine outputs:

* **Status tracking** – Monitor generation job progress
* **Review** – Inspect generated examples before fine-tuning
* **Versioning** – Maintain multiple dataset versions from the same sources
* **Export** – Download datasets in standard formats
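
To make the Query/Context/Response structure and the generation parameters described earlier more concrete, here is a minimal local sketch of how a review step might screen context-grounded examples. Everything named here is hypothetical: `GenerationConfig`, `num_examples`, `min_grounding`, and the word-overlap score are illustrative stand-ins, not the platform's actual parameters or quality filters. Duplicate-query removal is used as a crude proxy for a diversity setting.

```python
import re
from dataclasses import dataclass


@dataclass
class GroundedExample:
    query: str     # the question or prompt
    context: str   # relevant excerpt from a source document
    response: str  # answer derived from the provided context


@dataclass
class GenerationConfig:
    # Hypothetical parameter names; the product's real settings may differ.
    num_examples: int = 2       # cap on the number of training pairs to keep
    min_grounding: float = 0.6  # quality filter threshold (0.0 to 1.0)


def _words(text):
    """Lowercased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounding_score(ex):
    """Fraction of distinct response words that also occur in the context."""
    response_words = _words(ex.response)
    if not response_words:
        return 0.0
    return len(response_words & _words(ex.context)) / len(response_words)


def select_examples(candidates, cfg):
    """Drop duplicate queries and weakly grounded answers, then cap the count."""
    seen, kept = set(), []
    for ex in candidates:
        if len(kept) >= cfg.num_examples:
            break
        key = ex.query.strip().lower()
        if key in seen or grounding_score(ex) < cfg.min_grounding:
            continue
        seen.add(key)
        kept.append(ex)
    return kept


# Made-up source excerpt and candidate examples.
source = "The warranty lasts 12 months from the purchase date."
candidates = [
    GroundedExample("How long is the warranty?", source, "The warranty lasts 12 months."),
    GroundedExample("how long is the warranty? ", source, "It lasts 12 months."),
    GroundedExample("Who founded the company?", source, "It was founded in 1999 by two engineers."),
    GroundedExample("When does coverage start?", source, "Coverage starts from the purchase date."),
]

kept = select_examples(candidates, GenerationConfig())
for ex in kept:
    print(ex.query, round(grounding_score(ex), 2))
```

Word overlap is only a rough signal of grounding; a production filter would likely rely on stronger checks, and generated datasets should still be reviewed before training.
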