Deployments
Deployments host a model on dedicated compute infrastructure and make it available for inference requests and agent usage. For conceptual background, see Deployments.
Create a deployment
Base model
from seekrai import SeekrFlow
from seekrai.types.deployments import DeploymentType
client = SeekrFlow()
deployment = client.deployments.create(
    name="my-base-model-deployment",
    description="Base model deployment for inference.",
    model_type=DeploymentType.BASE_MODEL,
    model_id="meta-llama/Llama-3.3-70B-Instruct",
    n_instances=1
)
print(f"Deployment ID: {deployment.id}")
print(f"Status: {deployment.status}")
Fine-tuned model
deployment = client.deployments.create(
    name="my-fine-tuned-deployment",
    description="Fine-tuned model deployment for inference.",
    model_type=DeploymentType.FINE_TUNED_RUN,
    model_id="ft-1234567890",
    n_instances=1
)
print(f"Deployment ID: {deployment.id}")
print(f"Status: {deployment.status}")
Parameters
| Parameter | Required | Description |
|---|---|---|
| name | Yes | A name for the deployment. Must be 5–100 characters. |
| description | Yes | A description of the deployment. Must be 5–1000 characters. |
| model_type | Yes | DeploymentType.BASE_MODEL for a base model or DeploymentType.FINE_TUNED_RUN for a fine-tuned model. |
| model_id | Yes | The model ID (base model name or fine-tuning job ID) to deploy. |
| n_instances | Yes | Number of dedicated instances to provision. Must be between 1 and 50. |
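The limits in the table can be checked locally before a request is sent. The following is a minimal sketch, not part of the SDK: `validate_deployment_params` is a hypothetical helper, and it assumes the service counts characters the same way Python's `len` does.

```python
from typing import List


def validate_deployment_params(name: str, description: str, n_instances: int) -> List[str]:
    """Return a list of problems; an empty list means all documented limits are met.

    Hypothetical client-side check based only on the limits in the parameter table.
    """
    problems = []
    # name: 5-100 characters
    if not 5 <= len(name) <= 100:
        problems.append(f"name must be 5-100 characters, got {len(name)}")
    # description: 5-1000 characters
    if not 5 <= len(description) <= 1000:
        problems.append(f"description must be 5-1000 characters, got {len(description)}")
    # n_instances: between 1 and 50
    if not 1 <= n_instances <= 50:
        problems.append(f"n_instances must be between 1 and 50, got {n_instances}")
    return problems
```

Calling the helper before `client.deployments.create(...)` and stopping when the returned list is non-empty surfaces all violations at once instead of one server error at a time.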
Deployment status
| Status | Description |
|---|---|
| Pending | Deployment requested, infrastructure provisioning in progress. |
| Active | Serving inference traffic. |
| Inactive | Paused, not serving requests. |
| Failed | Error during startup or runtime. |
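Because a deployment moves through these statuses over time, a script often needs to wait for a particular one. The sketch below polls until a target status is reached. It is an illustration under assumptions: `wait_for_status` and `status_name` are hypothetical helpers, the status is assumed to be either a plain string or an enum whose `value` matches the names in the table, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable


def status_name(status) -> str:
    """Normalize a status (string or enum with .value) to lowercase text. Assumption."""
    return str(getattr(status, "value", status)).lower()


def wait_for_status(
    fetch_status: Callable[[], object],
    target: str = "active",
    failure: str = "failed",
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns target; raise on failure or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        current = status_name(fetch_status())
        if current == target:
            return current
        if current == failure:
            raise RuntimeError("deployment reported a failed status")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status was still {current!r} after {timeout_s} seconds")
        sleep(interval_s)
```

With the client from the examples above, the status could be supplied as `lambda: client.deployments.retrieve(deployment.id).status`. Passing the fetch function in, rather than the client, keeps the helper independent of the SDK and easy to exercise with a stub.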
Promote a deployment
Promote a deployment to make it active and ready to serve inference requests.
deployment = client.deployments.promote(deployment.id)
print(f"Status: {deployment.status}")
Demote a deployment
Demote a deployment to pause it without deleting the endpoint.
deployment = client.deployments.demote(deployment.id)
print(f"Status: {deployment.status}")
List deployments
deployments = client.deployments.list()
for d in deployments.data:
    print(f"{d.name} ({d.status}): {d.id}")
Retrieve a deployment
deployment = client.deployments.retrieve("<deployment-id>")
print(f"{deployment.name}: {deployment.status}")
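The list call can be combined with the status values to find, for example, every deployment that is currently serving traffic. The following is a rough sketch rather than an SDK feature: `deployments_with_status` and `status_name` are hypothetical helpers, and the comparison assumes the status is a string or an enum whose `value` matches the names in the status table.

```python
def status_name(status) -> str:
    """Normalize a status (string or enum with .value) to lowercase text. Assumption."""
    return str(getattr(status, "value", status)).lower()


def deployments_with_status(client, wanted: str) -> list:
    """Return the deployments from client.deployments.list() whose status matches wanted."""
    wanted = wanted.lower()
    return [d for d in client.deployments.list().data if status_name(d.status) == wanted]
```

For instance, `deployments_with_status(client, "active")` would return the deployments to consider when deciding what to demote. The filtering happens client-side, so it reads only the `data`, `status`, `name`, and `id` fields already used in the examples above.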