File search

Retrieve relevant data from a vector database using semantic search.

File search, sometimes referred to as agentic RAG, gives your agents access to your business documents within an agentic application. When invoked, the tool retrieves the most relevant document chunks to help your agent complete its task.

When to use file search

Configure a file search tool whenever an agent needs your business documentation to complete a task, such as summarizing documents, answering questions based on company knowledge, or generating reports from internal data.

Prerequisites

File search requires a populated vector database. To create and populate one, see Create and populate a vector database.

Create a file search tool

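from seekrai import SeekrFlow
from seekrai.types import CreateFileSearch, FileSearchConfig

client = SeekrFlow()

# Create a file search tool backed by an existing, populated vector database.
file_search_tool = client.tools.create(
    CreateFileSearch(
        name="doc_search",
        description="Search through company documents, policies, procedures, and knowledge base articles.",
        config=FileSearchConfig(
            # Replace with the ID of your vector database.
            file_search_index="<vector-store-id>",
            # Return at most 5 chunks per search.
            top_k=5,
            # Exclude chunks whose similarity score is below 0.7.
            score_threshold=0.7
        )
    )
)
print(f"Tool created: {file_search_tool.id}")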

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| name | Yes | A unique name for the tool. |
| description | Yes | Description that helps the agent understand when to use this tool. |
| file_search_index | Yes | ID of your vector database containing the documents. |
| top_k | No | Maximum number of chunks to return from the search. |
| score_threshold | No | Minimum similarity score for a chunk to be included in results. |
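
To make the interaction between top_k and score_threshold concrete, here is a minimal, library-free sketch of the selection logic those two parameters describe. It is an illustration only, not SeekrFlow's implementation: the ScoredChunk type, the select_chunks helper, the sample chunks, and the assumption that the threshold is inclusive are all hypothetical.

```python
from typing import NamedTuple


class ScoredChunk(NamedTuple):
    """Hypothetical container for a retrieved chunk and its similarity score."""
    text: str
    score: float  # higher means more similar to the query


def select_chunks(chunks, top_k=5, score_threshold=0.7):
    """Drop chunks scoring below score_threshold, then keep the top_k best.

    Assumption: the threshold is inclusive (score >= score_threshold).
    """
    kept = [c for c in chunks if c.score >= score_threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_k]


# Made-up candidates for illustration.
candidates = [
    ScoredChunk("PTO policy overview", 0.91),
    ScoredChunk("Holiday calendar", 0.74),
    ScoredChunk("Expense reporting guide", 0.62),
    ScoredChunk("Office parking rules", 0.35),
]

results = select_chunks(candidates, top_k=3, score_threshold=0.7)
for chunk in results:
    print(f"{chunk.score:.2f}  {chunk.text}")
```

With these sample values only two chunks clear the threshold, so fewer than top_k results come back. In this sketch top_k is an upper bound on the number of results, not a guarantee.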

Link to an agent

from seekrai.types import CreateAgentRequest

# Reuses `client` and `file_search_tool` from the previous example.
agent = client.agents.create(
    CreateAgentRequest(
        name="DocBot",
        instructions="You are DocBot, an expert assistant that can search through company documents to answer questions. Always cite the specific documents you reference. Respond only with data returned from the file search tool.",
        model_id="meta-llama/Llama-3.3-70B-Instruct",
        # Attach the file search tool so the agent can invoke it.
        tool_ids=[file_search_tool.id]
    )
)
print(f"Agent ID: {agent.id}")
print(f"Agent status: {agent.status}")

Best practices

Tool description

  • Write clear, concise descriptions that specify when the tool should be invoked — for example, "Use this tool to search internal company policies when a user asks about HR procedures."
  • Include example queries or scenarios to help the agent understand the tool's intended use.
  • Clearly define the scope and limitations of the tool — for example, "This tool only searches technical documentation, not customer support tickets."
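
One way to apply the three guidelines above is to assemble the description from separate parts: when to use the tool, example queries, and scope. The sketch below is a hypothetical helper, not part of the seekrai SDK. The function name build_tool_description and the example queries are assumptions made for illustration.

```python
def build_tool_description(when_to_use, example_queries, scope):
    """Combine purpose, example queries, and scope into one description string.

    Hypothetical helper for illustration; not part of any SDK.
    """
    examples = "; ".join(f'"{query}"' for query in example_queries)
    return f"{when_to_use} Example queries: {examples}. {scope}"


description = build_tool_description(
    when_to_use=(
        "Use this tool to search internal company policies "
        "when a user asks about HR procedures."
    ),
    # Made-up example queries.
    example_queries=[
        "How many vacation days do new hires get?",
        "What is the parental leave policy?",
    ],
    scope="This tool only searches HR policy documents, not customer support tickets.",
)
print(description)
```

The resulting string can then be supplied as the description value when you create the tool.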

top_k

  • Higher values increase the likelihood of including relevant results but may introduce more noise. Use higher values for exploratory queries or when the relevant context is broad.
  • Lower values give more focused results but risk missing relevant information. Use lower values for targeted tasks.
  • Common settings range from 3 to 10.

score_threshold

  • Use score_threshold to filter out weak or irrelevant matches, improving overall result quality.
  • Start with a moderate threshold (such as 0.5–0.7) and adjust based on observed retrieval quality.
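
To adjust these settings based on observed retrieval quality, you can compare candidate thresholds against a small set of hand-labeled results. The sketch below computes precision and recall for a single test query. It is library-free and entirely illustrative: the chunk IDs, scores, relevance labels, and the precision_recall helper are made up, and the inclusive threshold is an assumption.

```python
# Hypothetical results for one test query: (chunk_id, similarity score).
scored = [("a", 0.92), ("b", 0.81), ("c", 0.66), ("d", 0.58), ("e", 0.41)]
# Chunk IDs a human reviewer judged relevant to the query (made-up labels).
relevant = {"a", "b", "d"}


def precision_recall(scored, relevant, top_k, score_threshold):
    """Return (precision, recall) for the chunks that pass both settings."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    returned = [cid for cid, score in ranked if score >= score_threshold][:top_k]
    hits = sum(1 for cid in returned if cid in relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


for threshold in (0.5, 0.6, 0.7):
    p, r = precision_recall(scored, relevant, top_k=5, score_threshold=threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

In this made-up data, raising the threshold from 0.5 to 0.7 removes the irrelevant chunk "c" but also drops the relevant chunk "d", which is the precision and recall trade-off to watch for while tuning. The same helper can sweep top_k by holding score_threshold fixed.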