Deployments
Deploy and manage model endpoints for real-time inference from the SeekrFlow Deployment Dashboard.
Deployment Dashboard
The Deployment Dashboard provides a real-time view of all your deployments and their performance. At the top of the page:
- Status — A donut chart showing total deployment count broken down by status: Active, Pending, Failed, and Inactive.
- Active Deployments — Count of currently active deployments and total active hours to date.
- Inference Tokens — Total input and output tokens consumed across all deployments to date.
The deployments table lists all deployments with the following columns: Deployment Name, Status, Model Type, Total Input Tokens, Total Output Tokens, and Last Modified.
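As a rough illustration of how the summary widgets relate to the table columns, the sketch below recomputes the status breakdown and token totals from a list of rows. It assumes you already have the rows in hand, for example copied from the table. The field names and sample values are hypothetical and do not come from a SeekrFlow API.

```python
from collections import Counter

# The four statuses shown in the dashboard's donut chart.
STATUSES = ("Active", "Pending", "Failed", "Inactive")

# Made-up sample rows; the keys are hypothetical stand-ins for the table columns.
SAMPLE_ROWS = [
    {"name": "chat-a", "status": "Active", "input_tokens": 1200, "output_tokens": 300},
    {"name": "chat-b", "status": "Active", "input_tokens": 800, "output_tokens": 200},
    {"name": "batch-c", "status": "Failed", "input_tokens": 0, "output_tokens": 0},
]


def summarize(rows):
    """Aggregate per-deployment rows into dashboard-style totals."""
    counts = Counter(row["status"] for row in rows)
    return {
        # Report every status, including those with no deployments.
        "status_counts": {status: counts.get(status, 0) for status in STATUSES},
        "total_input_tokens": sum(row["input_tokens"] for row in rows),
        "total_output_tokens": sum(row["output_tokens"] for row in rows),
    }


summary = summarize(SAMPLE_ROWS)
```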
Hover over a row to reveal inline actions. Available actions depend on status:
| Status | Meaning | Available actions |
|---|---|---|
| Pending | Deployment requested, provisioning underway. | Pause · Delete |
| Active | Serving traffic. | Pause · Delete |
| Inactive | Paused; no inference traffic. | Resume · Delete |
| Failed | Error during start-up or runtime. | Delete |
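The table above can be read as a simple lookup from status to permitted actions. The following is a minimal sketch of that lookup. The names `AVAILABLE_ACTIONS` and `can_perform` are hypothetical and are not SeekrFlow SDK identifiers.

```python
# Status-to-action lookup mirroring the table; illustrative names only.
AVAILABLE_ACTIONS = {
    "Pending": ("Pause", "Delete"),
    "Active": ("Pause", "Delete"),
    "Inactive": ("Resume", "Delete"),
    "Failed": ("Delete",),
}


def can_perform(status: str, action: str) -> bool:
    """Return True if the dashboard offers `action` for a deployment in `status`."""
    # Unknown statuses offer no actions.
    return action in AVAILABLE_ACTIONS.get(status, ())
```

For example, `can_perform("Inactive", "Resume")` is true, while `can_perform("Failed", "Pause")` is false because Delete is the only action offered for a failed deployment.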
Create a deployment
Click + Create Deployment to open a four-step wizard.
- Deployment Details — Enter a name (up to 100 characters) and a description (up to 1,000 characters) for the deployment.
- Select Model — Choose a model to make available for inference. Use the Fine-Tuned Models tab to select from your completed fine-tuning jobs, or the Base Models tab for open-source and partner models. Each card shows the provider, model name, and license tags (Open Source, Warning).
- Select Hardware Configuration — Set the number of AMD Instinct MI300X instances for your deployment. You can choose up to 50 instances.
- Confirm and Start Deployment — Review your deployment details and model and hardware configuration. Accept the selected model's license terms, then click Start Deployment.
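The limits stated in the wizard steps can be checked before submission. The sketch below validates a draft against them. `DeploymentDraft` and `validate_draft` are hypothetical names for illustration, and the minimum of one instance is an assumption, since only the upper limit of 50 is stated above.

```python
from dataclasses import dataclass

MAX_NAME_CHARS = 100          # from the Deployment Details step
MAX_DESCRIPTION_CHARS = 1000  # from the Deployment Details step
MAX_INSTANCES = 50            # from the Select Hardware Configuration step


@dataclass
class DeploymentDraft:
    """Hypothetical container for the values entered in the wizard."""
    name: str
    description: str
    model: str
    instances: int
    license_accepted: bool = False


def validate_draft(draft: DeploymentDraft) -> list[str]:
    """Return a list of problems; an empty list means the draft passes these checks."""
    errors = []
    if len(draft.name) > MAX_NAME_CHARS:
        errors.append(f"Name exceeds {MAX_NAME_CHARS} characters.")
    if len(draft.description) > MAX_DESCRIPTION_CHARS:
        errors.append(f"Description exceeds {MAX_DESCRIPTION_CHARS} characters.")
    # Assumption: at least one instance is required; the upper bound is documented.
    if not 1 <= draft.instances <= MAX_INSTANCES:
        errors.append(f"Instance count must be between 1 and {MAX_INSTANCES}.")
    if not draft.license_accepted:
        errors.append("The selected model's license terms must be accepted.")
    return errors
```

A draft that returns an empty list satisfies only the limits described here; the wizard itself may apply further checks.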
After submitting, a success screen confirms "Your model is deploying." The process may take a few minutes; track status in the Deployment Summary. Click Go to Deployment to navigate there directly, or navigate away — the deployment continues in the background.
Deployment Summary
Click any deployment name to open its summary page. The summary includes:
- Deployment Details — Name (editable), ID, Model Type, Status, Date Created, Date Deployed, and description (editable).
- Integrating Deployment for Inference — Quickstart Guide with instructions for setting up the SDK, sending API requests, and integrating your deployment.
- Data Sources & Training Prompt — Training file and prompt from the originating fine-tuning job.
- Model, Hardware, and Hyperparameters — The exact configuration running for this deployment.
- Fine-Tuning Job Details — Link back to the source fine-tuning job.
- Agent Details — Agents linked to this deployment and their statuses, if applicable.
- Event Timeline — Right-hand feed of deployment lifecycle events (e.g., Deployment Pending, Deployment Complete).
Pause and resume a deployment
To stop traffic without deleting the endpoint, click Pause on the summary page or in the deployment's inline row actions on the dashboard. When you are ready to serve traffic again, click Resume in the deployments table.
Delete a deployment
To delete a deployment, hover over its row in the deployments table and click Delete in the inline row actions.
