Deployment Dashboard
The Deployment Dashboard provides a real-time view of all your deployments and their performance. At the top of the page:
- Status — A donut chart showing total deployment count broken down by status: Active, Pending, Failed, and Inactive.
- Active Deployments — Count of currently active deployments and total active hours to date.
- Inference Tokens — Total input and output tokens consumed across all deployments to date.
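
As a rough illustration of how the three summary cards relate to per-deployment data, the sketch below aggregates a list of deployment records. The `Deployment` record, its field names, and the `summarize` helper are hypothetical and are not the platform's actual schema or API. It also assumes that active hours are summed over every deployment, including ones that are no longer active.

```python
from collections import Counter
from dataclasses import dataclass

# The four statuses shown in the donut chart.
STATUSES = ("Active", "Pending", "Failed", "Inactive")


@dataclass
class Deployment:
    # Hypothetical record; field names are assumptions for illustration only.
    name: str
    status: str
    active_hours: float = 0.0
    input_tokens: int = 0
    output_tokens: int = 0


def summarize(deployments):
    """Compute the values displayed in the three dashboard cards."""
    counts = Counter(d.status for d in deployments)
    return {
        # Status card: total count broken down by status.
        "total_deployments": len(deployments),
        "status_breakdown": {s: counts.get(s, 0) for s in STATUSES},
        # Active Deployments card.
        "active_deployments": counts.get("Active", 0),
        "total_active_hours": sum(d.active_hours for d in deployments),
        # Inference Tokens card.
        "input_tokens": sum(d.input_tokens for d in deployments),
        "output_tokens": sum(d.output_tokens for d in deployments),
    }
```

Calling `summarize` on a list of such records returns a dictionary with one entry per value shown in the cards.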
Each deployment has one of four statuses, which determines the actions available for it:

| Status | Meaning | Available actions |
|---|---|---|
| Pending | Deployment requested, provisioning underway. | Pause · Delete |
| Active | Serving traffic. | Pause · Delete |
| Inactive | Paused; no inference traffic. | Resume · Delete |
| Failed | Error during start-up or runtime. | Delete |
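
The status table can be modeled as a simple lookup, which may be useful when scripting around deployments or reasoning about which buttons should be enabled. This is a minimal sketch: the `Status` and `Action` enums and the `is_allowed` helper are hypothetical names, not part of any documented SDK. Only the status-to-action mapping itself comes from the table.

```python
from enum import Enum


class Status(Enum):
    PENDING = "Pending"
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    FAILED = "Failed"


class Action(Enum):
    PAUSE = "Pause"
    RESUME = "Resume"
    DELETE = "Delete"


# Mirrors the "Available actions" column of the status table.
AVAILABLE_ACTIONS = {
    Status.PENDING: (Action.PAUSE, Action.DELETE),
    Status.ACTIVE: (Action.PAUSE, Action.DELETE),
    Status.INACTIVE: (Action.RESUME, Action.DELETE),
    Status.FAILED: (Action.DELETE,),
}


def is_allowed(status, action):
    """Return True if the action is listed for the given status."""
    return action in AVAILABLE_ACTIONS[status]
```

For example, `is_allowed(Status.INACTIVE, Action.RESUME)` is true, while `is_allowed(Status.FAILED, Action.PAUSE)` is false because a failed deployment can only be deleted.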
Create a deployment
Click + Create Deployment to open a four-step wizard. The steps include:
1. Deployment Details — Enter a name (up to 100 characters) and a description (up to 1,000 characters) for the deployment.
2. Select Model — Choose a model to make available for inference. Use the Fine-Tuned Models tab to select from your completed fine-tuning jobs, or the Base Models tab for open-source and partner models. Each card shows the provider, model name, and license tags (Open Source, Warning).
3. Select Hardware Configuration — Set the number of AMD Instinct MI300X instances for your deployment, up to a maximum of 50.
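
The limits enforced by the wizard can be checked ahead of time, for example when preparing deployment settings in a script. The sketch below is illustrative only: `validate_deployment_request` is a hypothetical helper, not a platform API. The requirements that the name be non-empty and that at least one instance be selected are assumptions. The 100-character, 1,000-character, and 50-instance limits come from the wizard description.

```python
MAX_NAME_LENGTH = 100
MAX_DESCRIPTION_LENGTH = 1000
MAX_INSTANCES = 50


def validate_deployment_request(name, description, instances):
    """Return a list of problems; an empty list means the inputs fit the limits."""
    errors = []
    # Assumption: a blank name is rejected.
    if not name.strip():
        errors.append("Name is required.")
    elif len(name) > MAX_NAME_LENGTH:
        errors.append(f"Name exceeds {MAX_NAME_LENGTH} characters.")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        errors.append(f"Description exceeds {MAX_DESCRIPTION_LENGTH} characters.")
    # Assumption: at least one instance must be selected.
    if not 1 <= instances <= MAX_INSTANCES:
        errors.append(f"Instance count must be between 1 and {MAX_INSTANCES}.")
    return errors
```

A request with a short name, a short description, and 4 instances yields an empty list, while a 101-character name combined with 51 instances yields two error messages.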
Deployment Summary
Click any deployment name to open its summary page. The summary includes:
- Deployment Details — Name (editable), ID, Model Type, Status, Date Created, Date Deployed, and description (editable).
- Integrating Deployment for Inference — Quickstart Guide with instructions for setting up the SDK, sending API requests, and integrating your deployment.
- Data Sources & Training Prompt — Training file and prompt from the originating fine-tuning job.
- Model, Hardware, and Hyperparameters — The exact configuration running for this deployment.
- Fine-Tuning Job Details — Link back to the source fine-tuning job.
- Agent Details — Agents linked to this deployment and their statuses, if applicable.
- Event Timeline — A feed on the right side of the page listing deployment lifecycle events (e.g., Deployment Pending, Deployment Complete).