# Data engine

> SDK workflows for uploading files, running ingestion, and creating data jobs for fine-tuning and semantic search.

The data engine transforms raw content into structured, AI-ready data. It manages the complete data lifecycle from file ingestion through preparation for training and retrieval workflows.

Fine-tuning data workflows are built around **data jobs**: managed operations that bundle file ingestion, Markdown review, prompt configuration, and alignment generation into a single tracked unit.
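
As a mental model only, the sketch below treats a data job as one tracked unit that moves through the four stages named above. It is plain Python and does not call the SDK. The `Stage` and `DataJobSketch` names, the strict stage ordering, and the `job_type` field are assumptions made for illustration; see the workflow pages below for the actual client calls.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    # Stage names mirror the prose above; the ordering is an assumption.
    FILE_INGESTION = "file_ingestion"
    MARKDOWN_REVIEW = "markdown_review"
    PROMPT_CONFIGURATION = "prompt_configuration"
    ALIGNMENT_GENERATION = "alignment_generation"


@dataclass
class DataJobSketch:
    """Hypothetical stand-in for a data job; not an SDK class."""

    name: str
    job_type: str  # e.g. "principle_files" (illustrative value)
    completed: List[Stage] = field(default_factory=list)

    def next_stage(self) -> Optional[Stage]:
        # Return the first stage that has not been completed yet.
        for stage in Stage:
            if stage not in self.completed:
                return stage
        return None

    def complete(self, stage: Stage) -> None:
        # Enforce that stages are completed in declaration order.
        expected = self.next_stage()
        if stage is not expected:
            raise ValueError(f"expected {expected}, got {stage}")
        self.completed.append(stage)


job = DataJobSketch(name="demo", job_type="principle_files")
job.complete(Stage.FILE_INGESTION)
job.complete(Stage.MARKDOWN_REVIEW)
```

Keeping every stage on one object is what makes the job a single tracked unit: progress can be read from one place instead of being reassembled from separate operations.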

For a conceptual overview of the data engine and its capabilities, see [Data engine](/flow/components/data-engine).

## Data engine workflows

<CardGroup>
  <Card title="Prepare and ingest files" icon="upload" href="/flow/sdk/data-engine/file-ingestion">
    Upload source documents (PDF, DOCX, PPT, Markdown) and convert them to Markdown via the ingestion API.
  </Card>

  <Card title="Monitor ingestion" icon="chart-bar" href="/flow/sdk/data-engine/monitor-ingestion">
    Track ingestion progress through data job status, per-file records, and timeline events.
  </Card>

  <Card title="Create and populate a vector database" icon="database" href="/flow/sdk/data-engine/create-and-populate-a-vector-database">
    Set up a vector database and run document ingestion to generate embeddings for semantic search and retrieval.
  </Card>

  <Card title="Create instruction fine-tuning data" icon="sparkles" href="/flow/sdk/data-engine/standard-instruction-finetuning">
    Use a `principle_files` data job to generate a QA pair dataset for instruction fine-tuning.
  </Card>

  <Card title="Create context-grounded fine-tuning data" icon="sparkles" href="/flow/sdk/data-engine/context-grounded-fine-tuning-data">
    Use `context_grounded_files` or `context_grounded_vector_db` data jobs to generate training data grounded in an existing knowledge source.
  </Card>

  <Card title="Manage data jobs" icon="gear" href="/flow/sdk/data-engine/manage-data-jobs">
    List and filter data jobs, update their metadata, and cancel them.
  </Card>
</CardGroup>
