File search
Retrieve relevant data from a vector database using semantic search.
File search, sometimes referred to as agentic RAG, gives your agents access to your business documents within an agentic application. When invoked, the tool retrieves the most relevant document chunks to help your agent complete its task.
When to use file search
Configure a file search tool whenever an agent needs your business documentation to complete a task, such as summarizing documents, answering questions based on company knowledge, or generating reports from internal data.
Prerequisites
File search requires a populated vector database. To create and populate one, see Create and populate a vector database.
Create a file search tool
from seekrai import SeekrFlow
from seekrai.types import CreateFileSearch, FileSearchConfig

# Initialize the SeekrFlow client
client = SeekrFlow()

# Create the tool; replace <vector-store-id> with the ID of your populated vector database
file_search_tool = client.tools.create(
    CreateFileSearch(
        name="doc_search",
        description="Search through company documents, policies, procedures, and knowledge base articles.",
        config=FileSearchConfig(
            file_search_index="<vector-store-id>",
            top_k=5,
            score_threshold=0.7
        )
    )
)

print(f"Tool created: {file_search_tool.id}")

Parameters
| Parameter | Required | Description |
|---|---|---|
| name | Yes | A unique name for the tool. |
| description | Yes | Description that helps the agent understand when to use this tool. |
| file_search_index | Yes | ID of your vector database containing the documents. |
| top_k | No | Maximum number of chunks to return from the search. |
| score_threshold | No | Minimum similarity score for a chunk to be included in results. |
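To make the interaction between top_k and score_threshold concrete, here is a minimal sketch in plain Python. It does not call SeekrFlow: the `select_chunks` helper and the sample scores are hypothetical, and it assumes that higher scores mean closer matches and that a chunk scoring exactly at the threshold is kept. The service's actual ranking logic may differ.

```python
from typing import List, Tuple

# Hypothetical helper for illustration only; not part of the seekrai SDK.
def select_chunks(
    scored_chunks: List[Tuple[str, float]],
    top_k: int,
    score_threshold: float,
) -> List[Tuple[str, float]]:
    """Keep chunks scoring at or above the threshold, then return the best top_k."""
    kept = [c for c in scored_chunks if c[1] >= score_threshold]
    kept.sort(key=lambda c: c[1], reverse=True)
    return kept[:top_k]

# Made-up similarity scores for five chunks.
candidates = [
    ("vacation-policy", 0.91),
    ("expense-policy", 0.74),
    ("onboarding-guide", 0.69),
    ("security-handbook", 0.82),
    ("office-map", 0.31),
]

selected = select_chunks(candidates, top_k=2, score_threshold=0.7)
print(selected)
```

Three chunks clear the 0.7 threshold here, but top_k=2 caps the result at the two highest-scoring ones. With a very high threshold the result can be empty, so the agent may receive fewer than top_k chunks.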
Link to an agent
from seekrai.types import CreateAgentRequest

# Reuses client and file_search_tool from the previous example
agent = client.agents.create(
    CreateAgentRequest(
        name="DocBot",
        instructions="You are DocBot, an expert assistant that can search through company documents to answer questions. Always cite the specific documents you reference. Respond only with data returned from the file search tool.",
        model_id="meta-llama/Llama-3.3-70B-Instruct",
        tool_ids=[file_search_tool.id]
    )
)

print(f"Agent ID: {agent.id}")
print(f"Agent status: {agent.status}")

Best practices
Tool description
- Write clear, concise descriptions that specify when the tool should be invoked — for example, "Use this tool to search internal company policies when a user asks about HR procedures."
- Include example queries or scenarios to help the agent understand the tool's intended use.
- Clearly define the scope and limitations of the tool — for example, "This tool only searches technical documentation, not customer support tickets."
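One way to apply these guidelines consistently is to assemble the description from its parts. The sketch below is plain Python; `build_tool_description` and its arguments are hypothetical names, not part of the seekrai SDK. The resulting string could then be passed as the description when creating the tool.

```python
from typing import Sequence

# Hypothetical helper for illustration only; not part of the seekrai SDK.
def build_tool_description(
    when_to_use: str,
    example_queries: Sequence[str],
    limitations: str,
) -> str:
    """Combine the trigger, example queries, and scope limits into one description."""
    examples = "; ".join(f'"{q}"' for q in example_queries)
    return f"{when_to_use} Example queries: {examples}. {limitations}"

description = build_tool_description(
    when_to_use="Use this tool to search internal company policies when a user asks about HR procedures.",
    example_queries=[
        "How many vacation days do new hires get?",
        "What is the parental leave policy?",
    ],
    limitations="This tool only searches HR policy documents, not customer support tickets.",
)
print(description)
```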
top_k
- Higher values increase the likelihood of including relevant results but may introduce more noise. Use for exploratory queries or when context is broad.
- Lower values give more focused results but risk missing relevant information. Use for targeted tasks.
- Common settings are between 3 and 10.
score_threshold
- Use score_threshold to filter out weak or irrelevant matches, improving overall result quality.
- Start with a moderate threshold (such as 0.5–0.7) and adjust based on observed retrieval quality.
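To adjust the threshold based on observed retrieval quality, one option is to label a small sample of retrieved chunks as relevant or not and compare precision and recall at several candidate thresholds. The sketch below is plain Python with made-up scores and labels; `precision_recall_at` is a hypothetical helper, and it assumes chunks scoring at or above the threshold are kept.

```python
from typing import List, Tuple

# Hypothetical helper for illustration only; not part of the seekrai SDK.
def precision_recall_at(
    labeled: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (precision, recall) if only chunks scoring >= threshold are kept."""
    kept = [relevant for score, relevant in labeled if score >= threshold]
    total_relevant = sum(1 for _, relevant in labeled if relevant)
    true_positives = sum(kept)
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / total_relevant if total_relevant else 0.0
    return precision, recall

# Made-up (similarity score, judged relevant?) pairs from a manual review.
labeled_sample = [
    (0.92, True),
    (0.81, True),
    (0.73, False),
    (0.66, True),
    (0.58, False),
    (0.41, False),
]

for threshold in (0.5, 0.6, 0.7):
    precision, recall = precision_recall_at(labeled_sample, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

In this sample, raising the threshold from 0.5 to 0.6 removes an irrelevant chunk without losing a relevant one, while 0.7 starts dropping relevant content.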
