Deployments (UI)
The Deployments module lets you launch, monitor, and manage model endpoints for real‑time inference, all from a single dashboard.
Deployment Dashboard overview

Region | Purpose |
---|---|
Summary Cards | High‑level metrics: • Status ring with counts (Active, Pending, Failed, Inactive) • Active Deployments & Active Hours • Inference Tokens (Input/Output, to‑date) |
Toolbar | Learn More link and + Create Deployment button. The button launches the deployment wizard (see “Creating a deployment (wizard)”). |
Deployments Table | Columns: • Deployment Name (click for summary) • Status badge • Model Type (Fine‑Tuned / Base Model) • Total Input Tokens / Output Tokens • Last Modified. Hover over a row to reveal pause ▮▮, resume ▶︎, or delete 🗑 icons (availability depends on status). |
Status badges & available actions
Status | Meaning | Allowed Actions |
---|---|---|
Pending (yellow) | Deployment requested, provisioning underway. | ▮▮ Pause · 🗑 Delete |
Active (green) | Serving traffic. | ▮▮ Pause · 🗑 Delete |
Inactive (grey) | Paused; no inference traffic. | ▶︎ Start · 🗑 Delete |
Failed (red) | Error during start‑up or runtime. | 🗑 Delete |
Hovering an action icon shows a tooltip (e.g., “Cannot Delete When Active”).
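The status table above can be read as a lookup from status to permitted actions. The following is a minimal sketch of that lookup, transcribed from the table only; the names `Status`, `Action`, `ALLOWED_ACTIONS`, and `is_allowed` are hypothetical and do not correspond to any documented SDK or API.

```python
from enum import Enum


class Status(Enum):
    PENDING = "Pending"
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    FAILED = "Failed"


class Action(Enum):
    PAUSE = "Pause"
    START = "Start"
    DELETE = "Delete"


# Transcribed from the "Allowed Actions" column of the status table.
# This is an illustration of the UI rules, not an official API.
ALLOWED_ACTIONS = {
    Status.PENDING: {Action.PAUSE, Action.DELETE},
    Status.ACTIVE: {Action.PAUSE, Action.DELETE},
    Status.INACTIVE: {Action.START, Action.DELETE},
    Status.FAILED: {Action.DELETE},
}


def is_allowed(status: Status, action: Action) -> bool:
    """Return True if the table lists `action` for a deployment in `status`."""
    return action in ALLOWED_ACTIONS[status]
```

For example, `is_allowed(Status.INACTIVE, Action.START)` is true, while `is_allowed(Status.FAILED, Action.START)` is false because a failed deployment can only be deleted.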
Creating a deployment (wizard)
Click + Create Deployment to open a four‑step wizard.
Step | Task | Notes |
---|---|---|
Deployment Details | Name and describe the endpoint. | Text fields (name ≤100 characters; description ≤1,000 characters). |
Select Model | Choose a checkpoint: • Fine‑Tuned Models tab lists your jobs. • Base Models tab surfaces curated open‑source and partner models. | Card shows model name, size, timestamp, warnings (e.g., license). |
Hardware | Set the Number of Instances (1–50) on the compute hardware. | |
Review and Start | Confirm details, then click Start Deployment. | |




A success screen (“Your model is deploying 🚀”) closes the wizard; the new row appears with Pending status.
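If you prepare deployment details outside the UI (for example, in a script that generates names and descriptions), it can help to check them against the wizard's limits first. The sketch below assumes the "≤100 / 1,000 chars" note means 100 characters for the name and 1,000 for the description, and uses the 1–50 instance range from the Hardware step. `DeploymentDraft` and `validate_draft` are hypothetical helpers, not part of any documented SDK.

```python
from dataclasses import dataclass

# Limits as read from the wizard table; the name/description split is an assumption.
MAX_NAME_CHARS = 100
MAX_DESCRIPTION_CHARS = 1000
MIN_INSTANCES = 1
MAX_INSTANCES = 50


@dataclass
class DeploymentDraft:
    """Hypothetical container for the values entered in the wizard."""
    name: str
    description: str
    instances: int


def validate_draft(draft: DeploymentDraft) -> list[str]:
    """Return a list of problems; an empty list means the draft fits the limits."""
    problems = []
    if len(draft.name) > MAX_NAME_CHARS:
        problems.append(f"name exceeds {MAX_NAME_CHARS} characters")
    if len(draft.description) > MAX_DESCRIPTION_CHARS:
        problems.append(f"description exceeds {MAX_DESCRIPTION_CHARS} characters")
    if not MIN_INSTANCES <= draft.instances <= MAX_INSTANCES:
        problems.append(
            f"instances must be between {MIN_INSTANCES} and {MAX_INSTANCES}"
        )
    return problems
```

Calling `validate_draft(DeploymentDraft("support-bot", "Handles billing questions", 2))` returns an empty list, whereas a draft with 51 instances returns a single message about the instance range. The wizard itself may enforce additional rules (such as required fields) that this sketch does not cover.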
Deployment Summary page

Click any Deployment Name to open its summary.
- Deployment Details – Name ✏️, ID, Model Type, Status, timestamps, editable description.
- Integrating Deployment for Inference – Quickstart Guide link to SDK examples.
- Data Sources & Training Prompt – Pulled from originating fine‑tuning job (traceability).
- Model, Hardware, Hyperparameters – Exactly what’s running.
- Fine‑Tuning Job Details – Click to cross‑navigate back.
- Agent Details (if linked) – Shows agents referencing this deployment with their own statuses.
- Event Timeline – Right‑hand feed of deployment lifecycle events.
- Pause button (top right) – Instantly stops traffic without deleting the endpoint.