# Deployments

> Launch, monitor, and manage model endpoints for real-time inference.

export const SupportedOn = ({ui = false, api = true, sdk = true}) => <div className="inline-flex flex-wrap items-center gap-x-5 gap-y-2 px-4 py-2.5 rounded-lg border border-[#00dad3] bg-[#00dad3]/10 text-sm not-prose">
    <span className="font-bold text-black dark:text-white whitespace-nowrap">
      Supported on
    </span>
    <div className="flex items-center gap-5">
      <span className="inline-flex items-center gap-1.5 font-semibold text-black dark:text-white">
        <Icon icon={ui ? "circle-check" : "circle-xmark"} color={ui ? "#00dad3" : "#9ca3af"} size={16} />
        UI
      </span>
      <span className="inline-flex items-center gap-1.5 font-semibold text-black dark:text-white">
        <Icon icon={api ? "circle-check" : "circle-xmark"} color={api ? "#00dad3" : "#9ca3af"} size={16} />
        API
      </span>
      <span className="inline-flex items-center gap-1.5 font-semibold text-black dark:text-white">
        <Icon icon={sdk ? "circle-check" : "circle-xmark"} color={sdk ? "#00dad3" : "#9ca3af"} size={16} />
        SDK
      </span>
    </div>
  </div>;

<SupportedOn ui={true} api={true} sdk={true} />

A deployment is a model endpoint for real-time inference. It hosts a base or fine-tuned model on dedicated compute infrastructure and makes that model available to API requests, agents, and application integrations.

<CardGroup>
  <Card title="Deployments UI guide" icon="grid" href="/flow/app/deployments">
    Create and manage deployments through the SeekrFlow web interface.
  </Card>

  <Card title="Deployments SDK guide" icon="code" href="/flow/sdk/deployments">
    Create and manage deployments programmatically with the Python SDK.
  </Card>
</CardGroup>

## How deployments work

Creating a deployment provisions a model endpoint with the compute resources you specify. Once the deployment is active, the model serves inference requests through the SeekrFlow API. You can pause a deployment to stop serving traffic, or delete it to free its resources.

## Deployment configuration

When creating a deployment, you configure:

* **Model selection** – Choose a base model or fine-tuned model checkpoint
* **Compute resources** – Specify instance count and hardware allocation
* **Endpoint details** – Define deployment name and description for identification
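
As a rough illustration, the settings above can be grouped into a single validated object before a deployment is requested. This is a minimal sketch, not the SeekrFlow SDK: the `DeploymentConfig` class, its field names, and the example values are all hypothetical. See the SDK guide for the actual parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentConfig:
    """Hypothetical container for deployment settings (not an SDK class)."""

    model_id: str          # base model name or fine-tuned checkpoint ID (assumed field)
    instance_count: int    # number of instances to provision (assumed field)
    hardware: str          # hardware allocation label; valid values are platform-specific
    name: str              # deployment name used for identification
    description: str = ""  # optional free-text description

    def __post_init__(self) -> None:
        # Basic sanity checks, run before any request would be sent.
        if not self.model_id.strip():
            raise ValueError("model_id must not be empty")
        if self.instance_count < 1:
            raise ValueError("instance_count must be at least 1")
        if not self.name.strip():
            raise ValueError("name must not be empty")


# Placeholder values for illustration only.
config = DeploymentConfig(
    model_id="my-fine-tuned-model",
    instance_count=1,
    hardware="example-accelerator",
    name="support-bot-endpoint",
    description="Endpoint for the support assistant",
)
```

Validating locally like this catches obvious mistakes, such as a zero instance count, before any infrastructure is requested.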

## Deployment status

Each deployment has a status indicating its current state:

| Status   | Description                                       |
| -------- | ------------------------------------------------- |
| Pending  | Deployment requested, provisioning infrastructure |
| Active   | Serving inference traffic                         |
| Inactive | Paused, not serving requests                      |
| Failed   | Error during startup or runtime                   |
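
Because a new deployment starts in `Pending`, client code typically waits for provisioning to finish before sending traffic. The sketch below shows one way to do that, under stated assumptions: the `DeploymentStatus` enum mirrors the table above, and `get_status` is a caller-supplied function (hypothetical, not an SDK call) that returns the current status however your integration fetches it.

```python
import time
from enum import Enum
from typing import Callable


class DeploymentStatus(str, Enum):
    """Statuses from the table above; the enum itself is illustrative."""

    PENDING = "Pending"
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    FAILED = "Failed"


def wait_while_pending(
    get_status: Callable[[], DeploymentStatus],
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> DeploymentStatus:
    """Poll until the deployment leaves Pending; raise TimeoutError otherwise.

    `get_status` is an assumed hook supplied by the caller. The returned
    status may be Active, Inactive, or Failed, so callers should check it.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status is not DeploymentStatus.PENDING:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"deployment still pending after {timeout_s} s")
        sleep(poll_interval_s)
        waited += poll_interval_s


# Demo with a fake status source and a no-op sleep, so it runs instantly.
fake_statuses = iter(
    [DeploymentStatus.PENDING, DeploymentStatus.PENDING, DeploymentStatus.ACTIVE]
)
final_status = wait_while_pending(lambda: next(fake_statuses), sleep=lambda _: None)
```

The function returns the first non-pending status rather than assuming success, which lets the caller handle a `Failed` deployment explicitly.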

## Using deployed models

You can use an active deployment in two ways:

* **Agent integration** – Reference deployed models in agent configurations to give agents access to specific model capabilities
* **API endpoints** – Make direct inference calls via deployment endpoints to integrate models into custom applications or workflows

## Deployment management

Deployments support lifecycle operations:

* **Pause** – Stop serving traffic while preserving the endpoint
* **Resume** – Reactivate an inactive deployment
* **Delete** – Remove the deployment and free allocated resources
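
The lifecycle operations above can be read as a small state machine. The following sketch encodes that reading, with two assumptions that the text does not confirm: resuming moves a deployment straight to `Active` (a real platform may pass through `Pending` first), and deleting is allowed from any status. The function and table names are hypothetical.

```python
from typing import Optional

# (operation, current status) -> resulting status.
# Assumption: resume goes directly to "Active"; provisioning may add a
# "Pending" step in practice.
TRANSITIONS = {
    ("pause", "Active"): "Inactive",
    ("resume", "Inactive"): "Active",
}


def apply_operation(status: str, operation: str) -> Optional[str]:
    """Return the status after an operation, or None once deleted.

    Raises ValueError for combinations the table does not allow,
    for example pausing a deployment that is still Pending.
    """
    if operation == "delete":
        # Assumption: deletion is permitted from any status.
        return None
    try:
        return TRANSITIONS[(operation, status)]
    except KeyError:
        raise ValueError(
            f"cannot {operation} a deployment in status {status!r}"
        ) from None


paused = apply_operation("Active", "pause")
resumed = apply_operation(paused, "resume")
```

Keeping the allowed transitions in a lookup table makes invalid requests fail fast on the client side instead of surfacing as a server error.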

## Monitoring

Track deployment performance through:

* Token usage metrics (input and output)
* Active hours and uptime
* Event timeline showing deployment lifecycle events
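
To show how these metrics relate, the sketch below totals input and output tokens and derives active hours from a list of lifecycle events. The record shapes are assumptions made for illustration: `UsageRecord` and the `(timestamp, status)` event tuples are hypothetical and do not reflect the format SeekrFlow returns.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


@dataclass
class UsageRecord:
    """Hypothetical per-request usage record."""

    input_tokens: int
    output_tokens: int


def total_tokens(records: List[UsageRecord]) -> Dict[str, int]:
    """Sum input and output tokens across all records."""
    total_in = sum(r.input_tokens for r in records)
    total_out = sum(r.output_tokens for r in records)
    return {"input": total_in, "output": total_out, "total": total_in + total_out}


def active_hours(events: List[Tuple[datetime, str]], now: datetime) -> float:
    """Hours spent in "Active", given time-sorted (timestamp, status) events."""
    total = timedelta()
    active_since = None
    for timestamp, status in events:
        if status == "Active" and active_since is None:
            active_since = timestamp          # an active period begins
        elif status != "Active" and active_since is not None:
            total += timestamp - active_since  # an active period ends
            active_since = None
    if active_since is not None:
        total += now - active_since            # still active at `now`
    return total.total_seconds() / 3600


usage = total_tokens([UsageRecord(100, 20), UsageRecord(50, 5)])
day = datetime(2024, 1, 1)
timeline = [
    (day, "Pending"),
    (day + timedelta(hours=1), "Active"),
    (day + timedelta(hours=3), "Inactive"),
    (day + timedelta(hours=4), "Active"),
]
hours = active_hours(timeline, now=day + timedelta(hours=4, minutes=30))
```

In this example the deployment is active from hour 1 to hour 3 and again from hour 4 to hour 4.5, so `hours` comes to 2.5.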
