# Training data attribution

> Identify which training examples influenced a fine-tuned model's response.

export const SupportedOn = ({ui = false, api = true, sdk = true}) => <div className="inline-flex flex-wrap items-center gap-x-5 gap-y-2 px-4 py-2.5 rounded-lg border border-[#00dad3] bg-[#00dad3]/10 text-sm not-prose">
    <span className="font-bold text-black dark:text-white whitespace-nowrap">
      Supported on
    </span>
    <div className="flex items-center gap-5">
      <span className="inline-flex items-center gap-1.5 font-semibold text-black dark:text-white">
        <Icon icon={ui ? "circle-check" : "circle-xmark"} color={ui ? "#00dad3" : "#9ca3af"} size={16} />
        UI
      </span>
      <span className="inline-flex items-center gap-1.5 font-semibold text-black dark:text-white">
        <Icon icon={api ? "circle-check" : "circle-xmark"} color={api ? "#00dad3" : "#9ca3af"} size={16} />
        API
      </span>
      <span className="inline-flex items-center gap-1.5 font-semibold text-black dark:text-white">
        <Icon icon={sdk ? "circle-check" : "circle-xmark"} color={sdk ? "#00dad3" : "#9ca3af"} size={16} />
        SDK
      </span>
    </div>
  </div>;

<SupportedOn ui={false} api={true} sdk={true} />

Training data attribution surfaces the training data that influenced fine-tuned model outputs. By tracing model responses back to specific question-answer pairs from the training dataset, it helps debug model behavior and audit responses.

<CardGroup>
  <Card title="Training data attribution SDK" icon="database" href="/flow/sdk/explainability/training-data-attribution">
    Retrieve influential fine-tuning examples programmatically using the Python SDK or REST API.
  </Card>
</CardGroup>

## How training data attribution works

When a fine-tuned model generates a response, training data attribution identifies the most influential training examples that shaped that output. Each influential example receives an influence level (high, medium, or low) indicating its contribution to the model's response.

## Requirements

Training data attribution is available for models that meet the following requirements:

* Fine-tuned through SeekrFlow
* Trained after September 22, 2025
* Deployed with an active endpoint
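
The requirements can be expressed as a simple pre-flight check before requesting attribution. The sketch below assumes that all three conditions must hold together and that "after September 22, 2025" excludes the day itself. `ModelInfo` and its fields are hypothetical names chosen for illustration, not part of the SeekrFlow SDK.

```python
from dataclasses import dataclass
from datetime import date

# Cutoff taken from the requirements list; "after" is read as strictly later.
ATTRIBUTION_CUTOFF = date(2025, 9, 22)


@dataclass
class ModelInfo:
    """Hypothetical summary of a model; not an SDK type."""
    fine_tuned_in_seekrflow: bool
    trained_on: date
    endpoint_active: bool


def is_eligible(model: ModelInfo) -> bool:
    """Return True when the model meets all three listed requirements."""
    return (
        model.fine_tuned_in_seekrflow
        and model.trained_on > ATTRIBUTION_CUTOFF
        and model.endpoint_active
    )


if __name__ == "__main__":
    candidate = ModelInfo(True, date(2025, 10, 1), True)
    print(is_eligible(candidate))
```

In practice, the values for such a check would come from wherever model metadata is already tracked; the function only encodes the three conditions listed above.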

## Influence levels

Training examples are ranked by their influence on model outputs:

| Level  | Description                                                  |
| ------ | ------------------------------------------------------------ |
| High   | Training example strongly shaped the model's response        |
| Medium | Training example had a moderate impact on the model's response |
| Low    | Training example contributed minimally to the model's response |

Irrelevant training examples are filtered out and not returned.
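
As a rough illustration of working with these levels once results have been retrieved, the sketch below orders attributed examples from high to low influence and counts how many fall into each level. The record shape (`influence_level`, `question`, `answer`) is an assumption made for this example and is not the documented response schema.

```python
# Hypothetical ordering of the three influence levels described above.
LEVEL_ORDER = {"high": 0, "medium": 1, "low": 2}


def sort_by_influence(examples):
    """Order examples high -> medium -> low.

    sorted() is stable, so examples keep their original order within a level.
    """
    return sorted(examples, key=lambda ex: LEVEL_ORDER[ex["influence_level"].lower()])


def count_by_level(examples):
    """Count how many examples fall into each influence level."""
    counts = {level: 0 for level in LEVEL_ORDER}
    for ex in examples:
        counts[ex["influence_level"].lower()] += 1
    return counts


if __name__ == "__main__":
    # Illustrative records only; field names are assumptions, not the SDK schema.
    examples = [
        {"influence_level": "low", "question": "Q1", "answer": "A1"},
        {"influence_level": "high", "question": "Q2", "answer": "A2"},
        {"influence_level": "medium", "question": "Q3", "answer": "A3"},
        {"influence_level": "high", "question": "Q4", "answer": "A4"},
    ]
    for ex in sort_by_influence(examples):
        print(ex["influence_level"], ex["question"])
    print(count_by_level(examples))
```

Reviewing high-influence examples first is usually the quickest route when debugging an unexpected response, since those are the examples reported as having shaped it most strongly.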

## When to use training data attribution

**Debugging model behavior** – Identify which training examples drive unexpected or incorrect responses

**Auditing outputs** – Trace model decisions back to source training data for compliance and verification

**Dataset refinement** – Discover patterns in influential training examples to improve fine-tuning datasets

## Traceability

Training data attribution responses include file identifiers linking back to source documents. This connects model outputs to original training materials, supporting debugging and dataset updates.
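
One way to use these file identifiers is to group influential examples by their source file, which shows which training files to revisit first. This is a minimal sketch: the `file_id` and `influence_level` field names are assumptions for illustration, and the actual response fields may differ.

```python
from collections import defaultdict


def group_by_source_file(examples):
    """Map each file identifier to the influential examples drawn from it."""
    grouped = defaultdict(list)
    for ex in examples:
        grouped[ex["file_id"]].append(ex)
    return dict(grouped)


def files_with_high_influence(examples):
    """Return the sorted, de-duplicated file identifiers behind high-influence examples."""
    return sorted(
        {ex["file_id"] for ex in examples if ex["influence_level"] == "high"}
    )


if __name__ == "__main__":
    # Illustrative records only; identifiers and field names are hypothetical.
    examples = [
        {"file_id": "file-b", "influence_level": "high", "question": "Q1"},
        {"file_id": "file-a", "influence_level": "low", "question": "Q2"},
        {"file_id": "file-b", "influence_level": "medium", "question": "Q3"},
    ]
    for file_id, items in group_by_source_file(examples).items():
        print(file_id, len(items))
    print(files_with_high_influence(examples))
```

A file that repeatedly appears behind high-influence examples for incorrect responses is a reasonable first candidate for correction in the next fine-tuning dataset.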
