Quickstart

Create your first SeekrFlow agent

First, install the SeekrFlow SDK:

pip install seekrai 

Create your agent

An agent in this context is simply a configuration object that defines the tools, instructions, and model settings for an AI assistant - it's not a running process but rather a blueprint or template. The actual execution of tasks happens separately when you invoke this agent configuration, keeping the definition distinct from its runtime behavior.

from seekrai.types import CreateAgentRequest
from seekrai import SeekrFlow
api_key = "your_api_key"
client = SeekrFlow(api_key=api_key)

agent = client.agents.create(
    CreateAgentRequest(
        name="homework_tutor_bot",
        instructions="You provide help with homework problems. Explain your reasoning at each step and include examples",
        model_id="meta-llama/Llama-3.1-8B-Instruct",
        tools=[]
    )
)
print(f"Agent created with ID: {agent.id}")
print(f"Agent status: {agent.status}")

Upon successful creation, the agent's status will show as 'PENDING'. Activation generally takes a few minutes; once the agent has been successfully promoted, its status will show as 'Active'.

Check the status of your agent

available_agents = client.agents.list_agents()

print("Available agents:")
for agent in available_agents:
    print(f"ID: {agent.id}, Name: {agent.name}, Status: {agent.status}"

Create a thread

Threads are the core unit of execution in our platform. A thread is the vehicle by which agents are executed.

  • Agents are only executed by threads if the agent's status is Active. Attempting to run a thread with an inactive agent will result in an error.

Why threads?
Threads allow for concurrent and independent executions using the same agent configuration. This enables you to run multiple, isolated conversations in parallel, all with their own state and context history.

Messages and Input:
Messages are attached directly to threads. Each thread maintains its own sequence of messages, completely independent of other threads using the same agent. The first message attached to a thread is the agent's initial input (a user message, an API response, etc.).

Creating a Thread:
To create a thread, use the following code:

from seekrai import SeekrFlow
client = SeekrFlow()

thread = client.agents.threads.create()
print("Thread created: ", thread.id)

Create a message

Messages are the basic unit of agent context—they represent both user inputs and agent outputs. Messages are always attached to threads. The initial message serves as the starting input for the agent. As the agent executes, it appends additional messages (such as its own responses or follow-up prompts) to the thread, building up the conversation history over time.

Each message must include a thread ID, role, and content. Optional metadata—such as message ID, agent ID, run ID, and timestamp—can also be included for tracking and auditing purposes.

To create a message and attach it to a thread, use the following code:

message = client.agents.threads.create_message(
    thread_id=thread.id,
    role="user",
    content="Explain concept of derivatives for my calculus class "
)
print(f"Message created! ID: {message.id}, Content: {message.content}")

Execute your agent

Once your agent is Active and you've created a thread and attached a message to it, you are ready to run your agent. There are two ways to execute it:

Synchronous (Non-Streaming)

  • In synchronous mode (sometimes called “non-streaming”), the agent processes the entire request and returns the complete response only when generation is finished. This is the traditional, blocking request-response pattern—you send the input, wait, and then receive the full output at once.
agent_id = "agent_id"
thread_id = "thread_id"
original_message_id = "message_id"

run = client.agents.runs.run(
      agent_id=agent_id,
      thread_id=thread_id,
      stream=False
)
run_id=run.run_id
print(f"✓ Run started     → {run_id}")
    
while True:
	run = client.agents.runs.retrieve(run_id, thread_id=thread.id)
  if run.status in ['completed']:
       break
  time.sleep(1)
    
final_message = client.agents.threads.list_messages(thread.id)[0]
print(final_message.content)
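
The polling loop above can be wrapped into a small reusable helper. This is a sketch, not part of the SDK: the function name, the timeout handling, and the TimeoutError are illustrative additions, and 'completed' is the only terminal status shown in this guide.

import time

def wait_for_run(client, run_id, thread_id, timeout=300, poll_interval=1):
    """Block until the run reports 'completed', or raise after `timeout` seconds."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        run = client.agents.runs.retrieve(run_id, thread_id=thread_id)
        if run.status == 'completed':
            return run
        time.sleep(poll_interval)
    raise TimeoutError(f"Run {run_id} did not complete within {timeout} seconds")

run = wait_for_run(client, run_id, thread_id)
print(client.agents.threads.list_messages(thread_id)[0].content)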

Streaming

  • In streaming mode, the agent’s response is sent back incrementally as it’s generated, token by token. This allows your application to start displaying results in real time, reducing perceived latency and providing a smoother user experience—similar to how responses stream in the OpenAI API or tools like ChatGPT.
agent_id = "agent_id"
thread_id = "thread_id"
original_message_id = "message_id"

run = client.agents.runs.run(
      agent_id=agent_id,
      thread_id=thread_id,
      stream=False
			)
run_id = run.run_id
print(f"✓ Run started     → {run_id}")
print(f"✓ run object    → {run}")

# ─────────────────────────────────────────────────────────────────────────
# STREAM INTERMEDIATE OUTPUT
# Attach the streaming output to the thread 
# -------------------------------------------------------------------------
  
for _ in client.agents.runs.attach(run_id, thread_id):
    pass
    
# FETCH AGENT’S FINAL RESPONSE
# After streaming ends, we list every message in the thread and locate the final message
# To locate the final response, the user must save
#   1 - save the message id
#   2 - iterate through the messages in a paricular thread
#   3 - Find the original message
#   4 - index the list for the previous element
    # -------------------------------------------------------------------------
    messages = client.agents.threads.list_messages(thread_id)
    print("message length"+ str(len(messages)))
    for i, msg in enumerate(messages):

        if msg.id == message_id and i > 0:
            assistant_msg = messages[i - 1]
            print("\n=== Agent Response ==================================================")
            print(assistant_msg.content)
            print("====================================================================")
            break
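
The attach loop above simply drains the stream and discards the intermediate events. If you want to surface output in real time as it is generated, a minimal variation is shown below; the structure of each streamed event is not documented in this quickstart, so the sketch just prints the raw event objects as they arrive.

run = client.agents.runs.run(
    agent_id=agent_id,
    thread_id=thread_id,
    stream=True
)

# Print each streamed event as it arrives (raw objects, since the event
# payload shape is not specified in this guide)
for event in client.agents.runs.attach(run.run_id, thread_id):
    print(event)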