FileSearch Tool
The File Search tool lets your agent retrieve data from a vector database.
What is File Search?
File Search, sometimes referred to as "Agentic RAG", allows you to embed and leverage your business documents in an agentic application. When appropriately invoked, the tool retrieves the most relevant document chunks to help your agent complete its task.
When should File Search be used?
We recommend configuring a File Search tool whenever a task depends on your business documentation, for example summarizing documents, answering questions based on company knowledge, or generating reports from internal data.
How to create a File Search Tool
To create a File Search tool, you must first create a vector database and then embed your business documents in it. The following steps take you through that process.
Create and Populate a Vector Database
To create and populate a vector database with your business documentation, see the associated vector database documentation.
Create an Agent with File Search
from seekrai import SeekrFlow
from seekrai.types import CreateAgentRequest, FileSearch, FileSearchEnv

client = SeekrFlow()
database_id = "your_database_id"

# Create an agent with FileSearch capability
agent = client.agents.create(
    CreateAgentRequest(
        name="DocBot",
        instructions="You are DocBot, an expert assistant that can search through company documents to answer questions. Always cite the specific documents you reference. Respond only with data returned from the file_search tool.",
        model_id="meta-llama/Llama-3.1-8B-Instruct",  # or your choice of deployed model
        tools=[
            FileSearch(
                tool_env=FileSearchEnv(
                    file_search_index=database_id,
                    document_tool_desc="Search through company documents, policies, procedures, and knowledge base articles.",
                    top_k=5,
                    score_threshold=0.7,
                )
            )
        ],
    )
)

# Retrieve the agent to get its details
agent_info = client.agents.retrieve(agent_id=agent.id)

# Print the agent ID and status
print(f"Agent ID: {agent.id}")
print(f"Agent Status: {agent_info.status}")
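Before passing values to FileSearchEnv, it can help to sanity-check them locally. The following is a minimal sketch, not part of the SeekrFlow SDK: `validate_file_search_settings` is a hypothetical helper, and the accepted ranges (a positive integer `top_k`, a `score_threshold` between 0 and 1) are assumptions based on the example values above rather than documented limits.

```python
def validate_file_search_settings(settings):
    """Check a plain dict of FileSearchEnv-style settings.

    Hypothetical helper: the limits below are assumptions, not SDK rules.
    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []

    if not settings.get("file_search_index"):
        problems.append("file_search_index must be a non-empty database ID")

    if not str(settings.get("document_tool_desc") or "").strip():
        problems.append("document_tool_desc must describe when to use the tool")

    top_k = settings.get("top_k")
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        problems.append("top_k must be a positive integer")

    threshold = settings.get("score_threshold")
    if (
        isinstance(threshold, bool)
        or not isinstance(threshold, (int, float))
        or not 0.0 <= threshold <= 1.0
    ):
        problems.append("score_threshold is assumed to lie between 0 and 1")

    return problems


good_settings = {
    "file_search_index": "your_database_id",
    "document_tool_desc": "Search through company documents and policies.",
    "top_k": 5,
    "score_threshold": 0.7,
}
bad_settings = dict(good_settings, top_k=0, score_threshold=1.5)

problems_ok = validate_file_search_settings(good_settings)
problems_bad = validate_file_search_settings(bad_settings)
print(problems_ok)
print(problems_bad)
```

If the returned list is empty, the same dictionary could be unpacked into the constructor, for example `FileSearchEnv(**good_settings)`, assuming the keys match the parameter names used in the example above.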
FileSearchEnv Parameters
The File Search tool class, FileSearchEnv, has four parameters. Each is described below:
file_search_index
ID of your vector database containing the documents
document_tool_desc
Description helping the agent understand when to use this tool
Tool Description best practices:
- **Write clear, concise descriptions** that specify when the tool should be invoked (e.g., "Use this tool to search internal company policies when a user asks about HR procedures.")
- **Include example queries or scenarios** to help the agent understand the tool's intended use
- **Clearly define the scope and limitations** of the tool (e.g., "This tool only searches technical documentation, not customer support tickets.")
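One way to apply these practices consistently is to assemble the description from its three ingredients: purpose, example queries, and limitations. The sketch below is illustrative only; `build_tool_description` is a hypothetical helper (not part of the SeekrFlow SDK), and the sample queries are made up.

```python
def build_tool_description(purpose, example_queries, out_of_scope):
    """Compose a tool description from a purpose, example queries, and limits.

    Hypothetical helper for illustration; it only concatenates strings.
    """
    parts = [purpose.strip()]
    if example_queries:
        quoted = "; ".join(f'"{q}"' for q in example_queries)
        parts.append(f"Example queries: {quoted}.")
    if out_of_scope:
        parts.append(out_of_scope.strip())
    return " ".join(parts)


description = build_tool_description(
    purpose=(
        "Use this tool to search internal company policies "
        "when a user asks about HR procedures."
    ),
    example_queries=[
        "How many vacation days do new hires get?",
        "What is the parental leave policy?",
    ],
    out_of_scope=(
        "This tool only searches HR policy documents, "
        "not customer support tickets."
    ),
)
print(description)
```

The resulting string can then be passed as `document_tool_desc` when constructing FileSearchEnv.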
top_k
Number of results to return from the search
Top K best practices:
- Higher top_k values increase the likelihood of including relevant results but may introduce more noise; use them for exploratory queries or when the context is broad
- Lower top_k values give more focused results but risk missing relevant information; use them for targeted tasks
- Common settings are between 3 and 10
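The trade-off above can be seen on a toy list of scored chunks. This is a minimal sketch with made-up data and a hypothetical `top_k_chunks` helper; it illustrates the general idea of keeping the k best-scoring results and is not a description of how the platform ranks results internally.

```python
import heapq

# Made-up (chunk text, relevance score) pairs for illustration only.
scored_chunks = [
    ("PTO accrues at 1.5 days per month.", 0.91),
    ("Unused PTO rolls over up to 5 days.", 0.84),
    ("The office closes on federal holidays.", 0.62),
    ("Expense reports are due monthly.", 0.35),
    ("The cafeteria opens at 8 am.", 0.12),
]


def top_k_chunks(chunks, k):
    """Return the k highest-scoring (text, score) pairs, best first."""
    return heapq.nlargest(k, chunks, key=lambda pair: pair[1])


focused = top_k_chunks(scored_chunks, k=2)  # only the strongest matches
broad = top_k_chunks(scored_chunks, k=4)    # more coverage, more noise

for text, score in broad:
    print(f"{score:.2f}  {text}")
```

With `k=2` only the two PTO chunks are kept; with `k=4` the weakly related expense-report chunk is also passed to the agent, which is the kind of noise a larger top_k can introduce.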
score_threshold
Minimum hybrid search score a result must reach to be returned
Score threshold best practices:
- Use score_threshold to filter out weak or irrelevant matches, improving the overall quality of search results
- Start with a moderate threshold (e.g., 0.5 to 0.7 for cosine similarity) and adjust based on observed retrieval quality and user satisfaction
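To make the effect of a threshold concrete, the sketch below scores a few toy vectors with plain cosine similarity and drops anything under the cutoff before applying top_k. Everything here is an assumption for illustration: the two-dimensional vectors are made up, `filter_by_threshold` is a hypothetical helper, and the platform's actual hybrid scoring may differ from pure cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_threshold(query_vec, chunk_vecs, score_threshold, top_k):
    """Score every chunk, drop those below the threshold, return the best top_k."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in chunk_vecs.items()
    ]
    kept = [(name, score) for name, score in scored if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy 2-D "embeddings"; real embeddings have far more dimensions.
query = [1.0, 0.0]
chunks = {
    "pto_policy": [0.9, 0.1],
    "holiday_schedule": [0.6, 0.8],
    "cafeteria_menu": [0.0, 1.0],
}

strict = filter_by_threshold(query, chunks, score_threshold=0.7, top_k=5)
relaxed = filter_by_threshold(query, chunks, score_threshold=0.5, top_k=5)
print([name for name, _ in strict])
print([name for name, _ in relaxed])
```

At 0.7 only the closely matching chunk survives; lowering the threshold to 0.5 also admits the partially related one. Inspecting which chunks appear or disappear as the threshold moves is a practical way to pick a starting value.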