Deployments (UI)

The Deployments module lets you launch, monitor, and manage model endpoints for real‑time inference—all from a single dashboard.

Deployment Dashboard overview


| Region | Purpose |
| --- | --- |
| Summary Cards | High‑level metrics: • Status ring with counts (Active, Pending, Failed, Inactive) • Active Deployments & Active Hours • Inference Tokens (Input/Output, to‑date) |
| Toolbar | Learn More link and + Create Deployment button. The button launches the deployment wizard (see "Creating a deployment (wizard)"). |
| Deployments Table | Columns: • Deployment Name (click for summary) • Status badge • Model Type (Fine‑Tuned / Base Model) • Total Input Tokens / Output Tokens • Last Modified |

Hover over a row to reveal the pause ▮▮, resume ▶︎, or delete 🗑 icons (availability depends on status).
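As a rough illustration of how the summary-card figures relate to the table rows, the sketch below aggregates a list of rows into status counts and token totals. It assumes the cards are simple roll-ups of the rows; the `DeploymentRow` record, its field names, and the `summarize` function are hypothetical and not part of the product. Active Hours is left out because the source does not say how it is computed.

```python
from collections import Counter
from dataclasses import dataclass

# Status values as listed on the status ring.
STATUSES = ("Active", "Pending", "Failed", "Inactive")


@dataclass
class DeploymentRow:
    # Hypothetical record mirroring the Deployments Table columns.
    name: str
    status: str        # one of STATUSES
    model_type: str    # "Fine-Tuned" or "Base Model"
    input_tokens: int
    output_tokens: int


def summarize(rows):
    """Roll table rows up into summary-card style figures (assumed logic)."""
    counts = Counter(row.status for row in rows)
    return {
        "status_counts": {status: counts.get(status, 0) for status in STATUSES},
        "active_deployments": counts.get("Active", 0),
        "input_tokens": sum(row.input_tokens for row in rows),
        "output_tokens": sum(row.output_tokens for row in rows),
    }


# Made-up sample rows, for illustration only.
rows = [
    DeploymentRow("support-bot", "Active", "Fine-Tuned", 1200, 800),
    DeploymentRow("summarizer", "Inactive", "Base Model", 300, 150),
    DeploymentRow("draft-test", "Failed", "Fine-Tuned", 0, 0),
]
summary = summarize(rows)
print(summary)
```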

Status badges & available actions

| Status | Meaning | Allowed Actions |
| --- | --- | --- |
| Pending (yellow) | Deployment requested, provisioning underway. | ▮▮ Pause · 🗑 Delete |
| Active (green) | Serving traffic. | ▮▮ Pause · 🗑 Delete |
| Inactive (grey) | Paused; no inference traffic. | ▶︎ Start · 🗑 Delete |
| Failed (red) | Error during start‑up or runtime. | 🗑 Delete |

Hovering an action icon shows a tooltip (e.g., “Cannot Delete When Active”).
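The status table above can be read as a lookup from status to permitted actions. The sketch below encodes it that way; the action identifiers (`"pause"`, `"start"`, `"delete"`) and the function names are illustrative assumptions, not product API names.

```python
# Allowed actions per status, transcribed from the status table.
ALLOWED_ACTIONS = {
    "Pending": {"pause", "delete"},
    "Active": {"pause", "delete"},
    "Inactive": {"start", "delete"},
    "Failed": {"delete"},
}


def available_actions(status):
    """Return the actions offered for a status, sorted for stable display."""
    return sorted(ALLOWED_ACTIONS.get(status, set()))


def can_perform(status, action):
    """True if the table lists the action for the given status."""
    return action in ALLOWED_ACTIONS.get(status, set())


for status in ALLOWED_ACTIONS:
    print(status, "->", ", ".join(available_actions(status)))
```

Unknown statuses yield an empty action list rather than an error, which mirrors a UI that simply hides icons it cannot offer.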

Creating a deployment (wizard)

Click + Create Deployment to open a four‑step wizard.

| Step | Task | Notes |
| --- | --- | --- |
| Deployment Details | Name and describe the endpoint. | Text fields (≤100 / 1,000 chars). |
| Select Model | Choose a checkpoint: • Fine‑Tuned Models tab lists your jobs. • Base Models tab surfaces curated open source and partner models. | Card shows model name, size, timestamp, warnings (e.g., license). |
| Hardware | Set Number of Instances (1–50) on compute hardware | |
| Review and Start | Confirm details, then Start Deployment. | |

A success screen (“Your model is deploying 🚀”) closes the wizard; the new row appears with Pending status.
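The limits noted in the wizard steps can be sketched as a pre-submission check. This is a minimal illustration, assuming the 100-character limit applies to the name and the 1,000-character limit to the description; the `DeploymentRequest` record, its field names, and `validate` are hypothetical and do not describe the wizard's actual implementation.

```python
from dataclasses import dataclass

MAX_NAME_CHARS = 100           # assumed to apply to the name field
MAX_DESCRIPTION_CHARS = 1000   # assumed to apply to the description field
MIN_INSTANCES, MAX_INSTANCES = 1, 50
MODEL_TABS = ("Fine-Tuned Models", "Base Models")


@dataclass
class DeploymentRequest:
    # Hypothetical container for the values entered across the four steps.
    name: str
    description: str
    model_tab: str
    model_name: str
    instances: int


def validate(request):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if len(request.name) > MAX_NAME_CHARS:
        errors.append(f"Name exceeds {MAX_NAME_CHARS} characters.")
    if len(request.description) > MAX_DESCRIPTION_CHARS:
        errors.append(f"Description exceeds {MAX_DESCRIPTION_CHARS} characters.")
    if request.model_tab not in MODEL_TABS:
        errors.append(f"Model tab must be one of {MODEL_TABS}.")
    if not MIN_INSTANCES <= request.instances <= MAX_INSTANCES:
        errors.append(
            f"Number of Instances must be between {MIN_INSTANCES} and {MAX_INSTANCES}."
        )
    return errors


# Made-up example values.
good = DeploymentRequest("support-bot", "Answers support tickets.",
                         "Fine-Tuned Models", "support-bot-v2", 2)
bad = DeploymentRequest("x" * 101, "", "Base Models", "some-base-model", 51)
print(validate(good))
print(validate(bad))
```

Collecting all problems in one pass, rather than stopping at the first, matches a review step that shows every issue before Start Deployment.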

Deployment Summary page

Click any Deployment Name to open its summary.

  • Deployment Details – Name ✏️, ID, Model Type, Status, timestamps, editable description.
  • Integrating Deployment for Inference – Quickstart Guide link to SDK examples.
  • Data Sources & Training Prompt – Pulled from originating fine‑tuning job (traceability).
  • Model, Hardware, Hyperparameters – Exactly what’s running.
  • Fine‑Tuning Job Details – Click to cross‑navigate back.
  • Agent Details (if linked) – Shows agents referencing this deployment with their own statuses.
  • Event Timeline – Right‑hand feed of deployment lifecycle events.
  • Pause button (top right) – Instantly stops traffic without deleting the endpoint.