User-defined metadata lets you attach structured business fields (such as year, doc_type, department, or is_confidential) to the chunks in a vector database, then retrieve and update those fields after ingestion. You can attach metadata when you ingest files (see Create and populate a vector database), then use the methods here to browse chunks, filter them by metadata, and edit metadata in place.
The same rules apply wherever metadata is accepted (ingestion request, manual chunking blocks, and the edit method below):
- Metadata is a flat JSON object. Nested objects and arrays of objects are not allowed.
- Each value is a string, number, boolean, datetime, or null. A null value is treated as missing and dropped.
- A metadata object has 20 or fewer keys.
- Keys are non-empty strings, recommended to be 255 characters or fewer, and cannot start with _ (reserved for system fields).
A request that breaks these rules returns a 422 validation error describing the problem.
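To catch problems before sending a request, the rules above can be approximated with a local pre-check. The following is a minimal sketch, not part of the SDK: `validate_metadata` is a hypothetical helper, it assumes datetimes are passed as Python `datetime` objects, it counts keys before null values are dropped, and it rejects every list, which may be stricter than the service. The 422 response from the service remains the authority.

```python
from datetime import datetime

MAX_KEYS = 20
RECOMMENDED_KEY_LENGTH = 255
# Assumption: these Python types correspond to string, number, boolean, datetime.
ALLOWED_TYPES = (str, int, float, bool, datetime)


def validate_metadata(metadata):
    """Return (cleaned, problems) for a candidate metadata object.

    Hypothetical local pre-check; it only mirrors the documented rules.
    """
    if not isinstance(metadata, dict):
        return {}, ["metadata must be a flat JSON object"]

    problems = []
    cleaned = {}
    for key, value in metadata.items():
        if not isinstance(key, str) or not key:
            problems.append(f"{key!r}: keys must be non-empty strings")
            continue
        if key.startswith("_"):
            problems.append(f"{key!r}: keys starting with '_' are reserved")
            continue
        if len(key) > RECOMMENDED_KEY_LENGTH:
            problems.append(f"{key[:20]!r}...: key exceeds the recommended 255 characters")
        if value is None:
            continue  # a null value is treated as missing and dropped
        if not isinstance(value, ALLOWED_TYPES):
            problems.append(f"{key!r}: unsupported value type {type(value).__name__}")
            continue
        cleaned[key] = value

    # Assumption: the limit is checked against the keys as submitted.
    if len(metadata) > MAX_KEYS:
        problems.append(f"{len(metadata)} keys given, at most {MAX_KEYS} allowed")
    return cleaned, problems
```

Calling `validate_metadata({"year": 2024, "note": None})` returns the object without the null field and an empty problem list; any non-empty list means the request would likely be rejected.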
List and filter chunks
Use list_chunks to return a paginated list of chunks from a vector database. Filter by file, by specific chunk IDs, or by metadata fields. When you provide more than one filter, results must match all of them.
from seekrai import SeekrFlow

client = SeekrFlow()

# Chunks from 2023 onward whose doc_type is "meeting_notes"
result = client.vector_database.list_chunks(
    database_id="<database-id>",
    metadata={"year": {"$gte": 2023}, "doc_type": "meeting_notes"},
    limit=20,
    offset=0,
)

print(f"{result.total} matching chunks")
for chunk in result.data:
    print(chunk.chunk_id, chunk.metadata)
Parameters:
| Parameter | Description |
|---|---|
| database_id | ID of the vector database. |
| file_id | Optional. Return only chunks from this file. |
| chunk_ids | Optional. Return only these chunk IDs. |
| metadata | Optional. Filter by metadata key-value pairs. |
| limit | Maximum results to return. Default 20, maximum 100. |
| offset | Pagination offset. Default 0. |
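Because limit is capped at 100, reading every matching chunk means advancing offset page by page. The sketch below shows one way to do that. `iter_all_chunks` is a hypothetical helper, not an SDK method, and it assumes each response exposes `data` and `total` as in the example above.

```python
def iter_all_chunks(fetch_page, page_size=100):
    """Yield every chunk by advancing offset until `total` is reached.

    `fetch_page` is any callable accepting `limit` and `offset` keyword
    arguments and returning an object with `data` and `total` attributes.
    """
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        if not page.data:
            return  # an empty page also ends the loop if `total` is stale
        yield from page.data
        offset += len(page.data)
        if offset >= page.total:
            return


# Possible usage with the client from the example above:
#
#   from functools import partial
#   fetch = partial(
#       client.vector_database.list_chunks,
#       database_id="<database-id>",
#       metadata={"doc_type": "SOP"},
#   )
#   for chunk in iter_all_chunks(fetch):
#       print(chunk.chunk_id)
```

Passing the page fetcher as a callable keeps the loop independent of which filters are used.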
Metadata filter operators:
Match a value exactly by giving it directly ({"doc_type": "SOP"}), or use an operator object for comparisons:
| Operator | Meaning |
|---|---|
| $in | Value is one of a list, for example {"severity": {"$in": ["P0", "P1"]}}. |
| $gt | Greater than. |
| $gte | Greater than or equal to. |
| $lt | Less than. |
| $lte | Less than or equal to. |
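To make the operator semantics concrete, here is a small local evaluator that applies a filter to a single metadata dictionary. It is only an illustration of the table above, not how the service evaluates filters. `matches` is a hypothetical function, and it assumes that a chunk missing the filtered key does not match and that several operators in one object must all hold.

```python
import operator

# Comparison operators from the table, mapped to Python equivalents.
_COMPARATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata, filters):
    """Return True if `metadata` satisfies every condition in `filters`."""
    for key, condition in filters.items():
        if key not in metadata:
            return False  # assumption: a missing field never matches
        value = metadata[key]
        if not isinstance(condition, dict):
            # A bare value means exact match.
            if value != condition:
                return False
            continue
        for op, operand in condition.items():
            if op == "$in":
                if value not in operand:
                    return False
            elif op in _COMPARATORS:
                if not _COMPARATORS[op](value, operand):
                    return False
            else:
                raise ValueError(f"unsupported operator: {op}")
    return True
```

For example, `matches({"year": 2024, "doc_type": "meeting_notes"}, {"year": {"$gte": 2023}, "doc_type": "meeting_notes"})` is true because both conditions hold, mirroring the rule that multiple filters must all match.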
Each chunk in the response includes its chunk_id, file_id, text, metadata, hierarchy, and locations.
Edit metadata
Use update_metadata to overwrite the metadata on chunks in a vector database. Target the chunks with either chunk_ids or file_ids, but not both. Editing is scoped to the vector database, so updating a file’s chunks here does not affect chunks of the same file in another vector database.
# Update specific chunks
client.vector_database.update_metadata(
    database_id="<database-id>",
    chunk_ids=["chunk-1", "chunk-2"],
    metadata={"source": "manual_review", "category": "support_doc"},
)

# Update every chunk belonging to one or more files
client.vector_database.update_metadata(
    database_id="<database-id>",
    file_ids=["file-abc123"],
    metadata={"department": "finance", "year": 2024},
)
This operation is destructive. The metadata object you provide replaces all existing metadata on each targeted chunk, so include any fields you want to keep. To find the chunks and current metadata to edit, list them first with list_chunks.
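Since each update replaces the whole metadata object, changing one field while keeping the rest takes a read-modify-write step. The sketch below shows one possible pattern. `merged_metadata` and `patch_chunk_metadata` are hypothetical helpers rather than SDK methods, and the second assumes no more than 100 chunk IDs per call so that a single list_chunks page covers them.

```python
def merged_metadata(existing, changes):
    """Return `existing` with `changes` applied; a None value removes the key."""
    merged = dict(existing or {})
    for key, value in changes.items():
        if value is None:
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged


def patch_chunk_metadata(client, database_id, chunk_ids, changes):
    """Read current metadata, overlay `changes`, and write the result back.

    Each chunk is updated separately because chunks may hold different
    existing fields.
    """
    result = client.vector_database.list_chunks(
        database_id=database_id,
        chunk_ids=chunk_ids,
        limit=len(chunk_ids),  # assumption: at most 100 IDs
    )
    for chunk in result.data:
        client.vector_database.update_metadata(
            database_id=database_id,
            chunk_ids=[chunk.chunk_id],
            metadata=merged_metadata(chunk.metadata, changes),
        )
```

This issues one update per chunk. When all targeted chunks are meant to share identical metadata, a single update_metadata call with the full object is simpler.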
Metadata is removed automatically when its chunks are deleted. Deleting a file from a vector database removes that file’s chunks and their metadata. See Create and populate a vector database for file deletion.