Step 1: Set up a vector database
Seekr’s Vector Database SDK transforms text into vector embeddings, enabling semantic search that matches on meaning and context rather than exact keywords. This approach offers a smarter, more intuitive way to retrieve documents than traditional keyword-based methods. Start by creating a new vector database, specifying the embedding model.

Supported embedding models

| Model Name | Dimensions | Best Used For |
|---|---|---|
| intfloat/e5-mistral-7b-instruct | 4096 | This model has some multilingual capability. However, since it was mainly trained on English data, we recommend using this model for English only. |
| bedrock:amazon.titan-embed-text-v2:0 | 256, 512, or 1024 | Recommended Bedrock model. Best for RAG, document search, and reranking. Supports 100+ languages. Self-hosted AWS/EKS only; see Use AWS Bedrock for ingestion and inference. |
| bedrock:amazon.titan-embed-text-v1 | 1536 | Legacy Bedrock model. Text retrieval and semantic similarity. Supports 25+ languages. Self-hosted AWS/EKS only; see Use AWS Bedrock for ingestion and inference. |
| bedrock:amazon.titan-embed-g1-text-02 | 1536 | Legacy Bedrock G1 model. Text retrieval and semantic similarity. Supports 25+ languages. Self-hosted AWS/EKS only; see Use AWS Bedrock for ingestion and inference. |
| bedrock:amazon.titan-embed-image-v1 | 256, 384, or 1024 | Multimodal (text + image) embeddings. Self-hosted AWS/EKS only; see Use AWS Bedrock for ingestion and inference. |

Create an empty vector database
Start by creating a new vector database with the specified embedding model.

Step 2: Upload files
Supported file types
- PDF (`.pdf`)
- Word documents (`.docx`)
- Markdown (`.md`)
  - All files must have correctly ordered headers (`#` followed by `##`, and so on) with titles and meaningful content.
  - Avoid using headers with more than 6 hashtags (e.g., `####### Pointlessly small md header`)
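
The header rules above can be checked locally before uploading. The following is a minimal sketch, not part of the Seekr SDK: `check_headers` is a hypothetical helper that only looks at `#`-style headers, skips fenced code blocks, and treats a first header deeper than `#` as a skipped level.

```python
import re

# Hypothetical pre-upload check; it is not part of any Seekr SDK.
# Matches lines such as "## Title" or a bare "##" (a header with no title).
HEADER_RE = re.compile(r"^(#+)(?:\s+(.*\S))?\s*$")


def check_headers(markdown_text: str) -> list[str]:
    """Return human-readable problems found in the Markdown headers."""
    problems = []
    previous_level = 0
    in_fence = False
    for line_no, line in enumerate(markdown_text.splitlines(), start=1):
        # Ignore anything inside fenced code blocks (for example shell comments).
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADER_RE.match(line)
        if not match:
            continue
        level = len(match.group(1))
        title = match.group(2)
        if level > 6:
            problems.append(f"line {line_no}: {level} '#' characters (maximum is 6)")
            continue
        if not title:
            problems.append(f"line {line_no}: header has no title")
        if level > previous_level + 1:
            problems.append(
                f"line {line_no}: header jumps from level {previous_level} to level {level}"
            )
        previous_level = level
    return problems


if __name__ == "__main__":
    sample = "# Guide\n### Skipped a level\n####### Pointlessly small md header\n"
    for problem in check_headers(sample):
        print(problem)
```

An empty result means the file passed these checks only; the ingestion pipeline may apply additional validation.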
Upload a file for ingestion
Next, upload your files for processing. Each upload returns a `file_id` to use for ingestion.
Upload a batch of files for ingestion (optional)
The endpoint accepts an array of `file_ids` as input.
Step 3: Start a vector database ingestion job
Next, create a job to ingest documents into your vector database. This step converts the files and creates embeddings from them. The `token_count` parameter specifies the target size of each chunk, so that each chunk is neither too large (risking truncation by model limits) nor too small (losing semantic coherence).
Best practices:
- Common ranges: For embedding and retrieval, 200–500 tokens per chunk is a widely used range, balancing context and efficiency. The example here uses a token count of 512.
- Adjust for document type: If your documents are dense or have complex structure (e.g., legal, technical), consider slightly larger chunks; for conversational or highly variable content, smaller chunks may work better.
The `overlap_tokens` parameter creates overlapping regions between adjacent chunks, reducing the risk of missing relevant information that spans two chunks.
Adjust chunking parameters based on document characteristics:
| Document Type | Recommended token_count | Recommended overlap_tokens |
|---|---|---|
| Technical documentation | 384-512 | 50-75 |
| Legal documents | 512-768 | 75-100 |
| Conversational content | 256-384 | 25-50 |
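
To make the interaction between `token_count` and `overlap_tokens` concrete, here is a minimal sketch of a sliding window over a token list. It is an illustration only, not SeekrFlow's chunker: `chunk_tokens` is a hypothetical helper, and whitespace-separated words stand in for model tokens.

```python
def chunk_tokens(tokens: list, token_count: int, overlap_tokens: int) -> list:
    """Split tokens into windows of at most token_count items.

    Each window repeats the last overlap_tokens items of the previous one,
    so content that straddles a boundary appears in both chunks.
    """
    if token_count <= 0:
        raise ValueError("token_count must be positive")
    if not 0 <= overlap_tokens < token_count:
        raise ValueError("overlap_tokens must be >= 0 and smaller than token_count")

    step = token_count - overlap_tokens  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + token_count])
        if start + token_count >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


if __name__ == "__main__":
    # Assumption: words approximate tokens well enough for a demonstration.
    words = "the quick brown fox jumps over the lazy dog again".split()
    for chunk in chunk_tokens(words, token_count=4, overlap_tokens=1):
        print(" ".join(chunk))
```

A larger overlap lowers the chance of splitting a relevant passage, at the cost of storing and embedding more duplicated text.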
Attach metadata at ingestion
To attach user-defined metadata to the chunks created by an ingestion job, include an optional `metadata` object in the request. The metadata is job-level: it is copied onto every chunk produced from every file in the job. You can later filter or edit it with the chunk metadata methods (see Manage chunk metadata).
Ingestion mode and the UI
When ingesting files through the SeekrFlow UI, speed-optimized mode is always used. The SDK lets you choose between speed-optimized and accuracy-optimized.
Choose an ingestion method
Accuracy-optimized (default)
When you use `method="accuracy-optimized"` or omit the `method` parameter, the system prioritizes accuracy. Depending on what data is available in your PDF document (bookmarks, tables, text layers), the system combines multiple extraction techniques for best results.
Key features:
- Uses both OCR and direct text extraction, then blends them together
- Employs LLM agents to correct and enhance document hierarchy
- Applies advanced table detection algorithms for accurate table formatting
Documents over 100 pages can take up to 30 minutes to process.
Speed-optimized
When you use `method="speed-optimized"`, the system balances quality with processing speed. It automatically selects faster methods based on document size while maintaining reasonable accuracy for smaller documents.
Key features:
- Small documents still use high-accuracy methods
- Larger documents use speed-optimized algorithms to meet time constraints
Optimized to complete in approximately 3 minutes regardless of document size.
Markdown chunking (default)
- Respects document structure and header hierarchies
- Keeps related content together (headers with their content)
- Preserves tables with their headers
- Groups small sections to optimize chunk sizes
Semantic chunking
Set `chunking_method="semantic"` to enable meaning-aware segmentation, which relies on LLM similarity scoring to decide where chunks begin and end:
- Searches for topic shifts instead of raw heading boundaries to keep tightly related sentences together
- Automatically merges short paragraphs or bullets when they express the same idea
- Honors document structure when it provides strong signals but can span across headings if the semantics match
- Applies Chunk Size and Chunk Overlap as safety caps, splitting only when the semantic chunk would exceed those limits
Semantic chunking is best suited for:
- Long narrative content (wikis, blogs, requirements) where sections do not follow strict Markdown hierarchy
- Mixed-format documents where context spans multiple small headers or callouts
Manual window chunking
- Insert `---DOCUMENT_BREAK---` markers to define exactly where chunks should be separated
- Chunk Size (token count): maximum tokens per chunk (used as a fallback when content exceeds the limit)
- Chunk Overlap (token count): overlap between chunks when the sliding window is applied

Use `---DOCUMENT_BREAK---` to specify where chunks should be separated. Document break markers only work when using manual window chunking. When using Markdown chunking (the default), these break markers are ignored because the system uses document structure for segmentation instead.
To use custom segmentation, place `---DOCUMENT_BREAK---` markers in your documents at each point where a chunk should end. This gives you precise control over how your data is broken down before embedding.
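
As a rough illustration of what the markers express, the sketch below splits a document on `---DOCUMENT_BREAK---` locally so you can preview the resulting sections before uploading. `split_on_breaks` is a hypothetical helper, and dropping empty sections is an assumption; the actual pipeline may also apply the Chunk Size fallback, which is not modeled here.

```python
BREAK_MARKER = "---DOCUMENT_BREAK---"


def split_on_breaks(text: str) -> list[str]:
    """Preview the sections that the break markers delimit.

    Assumption: surrounding whitespace is trimmed and empty sections
    (for example two markers in a row) are dropped.
    """
    sections = [part.strip() for part in text.split(BREAK_MARKER)]
    return [section for section in sections if section]


if __name__ == "__main__":
    document = (
        "Intro paragraph.\n"
        "---DOCUMENT_BREAK---\n"
        "Pricing details.\n"
        "---DOCUMENT_BREAK---\n"
        "Cancellation policy.\n"
    )
    for index, section in enumerate(split_on_breaks(document), start=1):
        print(index, section)
```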
Add per-chunk metadata
With manual window chunking, you can attach different metadata to different chunks by embedding a metadata block inside a section. Place the block between `---CHUNK_META_START---` and `---CHUNK_META_END---`, with a single line of JSON in between. The block is scoped to the section it appears in and is removed from the text before indexing, so it never becomes part of the chunk content.
- It is supported only with manual window chunking. Including these markers with any other chunking method is an error.
- Only one metadata block is allowed per section. A second block in the same section is an error.
- A section’s block replaces the job-level metadata for that chunk. The two are not merged.
- A section with no block inherits the job-level metadata from the ingestion request, or no metadata if the request supplied none.
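
The rules above can be sketched as a small local function. This is an illustration of the documented behavior, not Seekr's implementation: `resolve_section` is a hypothetical helper, and requiring the JSON to be an object is an assumption.

```python
import json
import re
from typing import Optional

# Matches one metadata block, including its start and end markers.
META_BLOCK = re.compile(r"---CHUNK_META_START---(.*?)---CHUNK_META_END---", re.DOTALL)


def resolve_section(
    section: str, job_metadata: Optional[dict] = None
) -> tuple[str, Optional[dict]]:
    """Return (chunk_text, metadata) for one section.

    A block in the section replaces the job-level metadata (no merging);
    a section without a block inherits the job-level metadata, or None.
    """
    blocks = META_BLOCK.findall(section)
    if len(blocks) > 1:
        raise ValueError("only one metadata block is allowed per section")
    if not blocks:
        return section.strip(), job_metadata

    payload = blocks[0].strip()
    if "\n" in payload:
        raise ValueError("metadata must be a single line of JSON")
    metadata = json.loads(payload)
    if not isinstance(metadata, dict):  # assumption: metadata is a JSON object
        raise ValueError("metadata must be a JSON object")

    # The block is stripped so it never becomes part of the chunk content.
    text = META_BLOCK.sub("", section).strip()
    return text, metadata


if __name__ == "__main__":
    section = (
        "Refunds are issued within 14 days.\n"
        "---CHUNK_META_START---\n"
        '{"topic": "refunds"}\n'
        "---CHUNK_META_END---\n"
    )
    print(resolve_section(section, job_metadata={"source": "handbook"}))
```

In the usage example the chunk ends up with only `{"topic": "refunds"}`; the job-level `source` key is not carried over, which mirrors the replace-not-merge rule.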
Embedding model configuration
Supported embedding models
The system uses the following embedding models for vector generation:

| Model | Max Token Length | Estimated Max Words |
|---|---|---|
| E5-Mistral-7B-Instruct | 4096 | ~3,040 |
| Titan Text Embeddings V2 (Bedrock) | 8192 | ~6,080 |
| Titan Text Embeddings V1 (Bedrock) | 8192 | ~6,080 |
| Titan Text Embeddings G1 (Bedrock) | 8192 | ~6,080 |
| Titan Multimodal Embeddings (Bedrock) | 128 | ~96 |
Bedrock embedding models are available for self-hosted AWS/EKS deployments only. See Use AWS Bedrock for ingestion and inference for setup instructions.
- Each model has a maximum sequence length as shown in the table above
- Using inputs longer than the model’s max token length is not recommended
- Check model specifications for language support and optimal use cases
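
One way to apply these limits is to compare the chosen `token_count` against the model's maximum before starting a job. The sketch below copies the limits from the table above; pairing them with the model identifiers from Step 1 is an assumption, and `fits_model` is a hypothetical helper rather than an SDK function.

```python
# Limits copied from the table above. Mapping them to the Step 1 model
# identifiers is an assumption made for this sketch.
MAX_TOKEN_LENGTH = {
    "intfloat/e5-mistral-7b-instruct": 4096,
    "bedrock:amazon.titan-embed-text-v2:0": 8192,
    "bedrock:amazon.titan-embed-text-v1": 8192,
    "bedrock:amazon.titan-embed-g1-text-02": 8192,
    "bedrock:amazon.titan-embed-image-v1": 128,
}


def fits_model(model: str, token_count: int) -> bool:
    """Return True if a chunk of token_count tokens fits the model's limit."""
    if model not in MAX_TOKEN_LENGTH:
        raise KeyError(f"unknown embedding model: {model}")
    return token_count <= MAX_TOKEN_LENGTH[model]


if __name__ == "__main__":
    for name in MAX_TOKEN_LENGTH:
        print(name, fits_model(name, token_count=512))
```

With a `token_count` of 512, every model in the table fits except the multimodal one, whose text limit is 128 tokens.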
Step 4: Monitor ingestion status (optional)
After starting an ingestion job, you can track job progress, view per-file statuses, and diagnose any failures. See Monitor ingestion for details on checking job states, interpreting `file_records`, and resolving errors.
Once the status shows `completed`, your vector database is ready to query.
Source tracing
When a file is ingested into a vector database, the pipeline captures provenance metadata for every chunk, with no additional configuration required. Each chunk stores:
- `line_number_start` / `line_number_end`: line range within the ingested Markdown
- `char_start` / `char_end`: character offsets within the section
- `hierarchy`: heading path from the document root to the chunk (e.g. `["Chapter 3", "3.2 Pricing", "Cancellation"]`)
- `page_number`: source document page number, 1-indexed (`null` for native Markdown or JSON uploads)
Line numbers refer to lines in the ingested Markdown, not the original file. Use page_number to navigate back to a source PDF.
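
As an example of how these fields might be used, the sketch below turns a chunk's provenance into a short citation string. It assumes the chunk is available as a flat dictionary keyed by the field names above; the actual response shape may differ, and `format_citation` and the sample values are hypothetical.

```python
def format_citation(chunk: dict) -> str:
    """Build a human-readable source reference from provenance fields."""
    path = " > ".join(chunk["hierarchy"])
    page = chunk.get("page_number")
    if page is not None:
        # PDFs and other paginated sources: point back to the original page.
        return f"{path} (page {page})"
    # Native Markdown or JSON uploads have no page, so fall back to line numbers.
    start = chunk["line_number_start"]
    end = chunk["line_number_end"]
    return f"{path} (lines {start}-{end} of the ingested Markdown)"


if __name__ == "__main__":
    # Sample values are made up for illustration.
    pdf_chunk = {
        "hierarchy": ["Chapter 3", "3.2 Pricing", "Cancellation"],
        "page_number": 12,
        "line_number_start": 240,
        "line_number_end": 252,
    }
    print(format_citation(pdf_chunk))
```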
Complete example
This example demonstrates the entire workflow for creating a vector database, adding files, and kicking off an ingestion job.

Manage your vector databases
List all vector databases
Get a specific vector database
Delete a vector database
Delete a file from a vector database
Troubleshoot common issues
For ingestion-specific error codes with plain-language messages and suggested fixes, see Monitor ingestion.

Document processing issues
| Issue | Possible cause | Solution |
|---|---|---|
| Files fail to upload | File exceeds size limit | Split large files or compress them |
| Files fail to upload | Invalid file format | Ensure the file extension matches the actual format |
| Files fail to upload | Network timeout | Implement retry logic with exponential backoff |
| Markdown parsing errors | Improper header hierarchy | Fix header structure (ensure proper nesting) |
| Markdown parsing errors | Unsupported Markdown syntax | Use standard Markdown formatting |
| PDF extraction issues | Protected PDF | Remove password protection before uploading |
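
For the network timeout case, retry logic with exponential backoff can be sketched as follows. This is a generic pattern, not an SDK feature: `retry_with_backoff` is a hypothetical helper, `operation` stands for whatever upload call you use, and the exception types your client raises on timeouts may differ from the built-in ones caught here.

```python
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait doubles after each failure: base_delay, 2x, 4x, and so on.
    `sleep` is injectable so the function can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_upload():
        # Stand-in for an upload call that times out twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TimeoutError("simulated network timeout")
        return "uploaded"

    print(retry_with_backoff(flaky_upload, base_delay=0.1))
```

Capping `max_attempts` keeps a persistent outage from blocking the caller indefinitely; adding random jitter to the delay is a common refinement when many clients retry at once.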
File ingestion issues
| Issue | Possible cause | Solution |
|---|---|---|
| Slow ingestion | Complex document structure | Adjust chunking parameters |
| Slow ingestion | Resource constraints | Monitor system resources during ingestion |
| Slow ingestion | Large batch size | Break into smaller batches |
| Failed ingestion job | Malformed content | Check files for compatibility issues |
| Failed ingestion job | Service timeout | Increase timeout settings |