Manage chunk metadata

User-defined metadata lets you attach structured business fields (such as year, doc_type, department, or is_confidential) to the chunks in a vector database, then retrieve and update those fields after ingestion. You can attach metadata when you ingest files (see Create and populate a vector database), then use the methods here to browse chunks, filter them by metadata, and edit metadata in place.

Metadata rules

The same rules apply wherever metadata is accepted (ingestion request, manual chunking blocks, and the edit method below):

Metadata is a flat JSON object. Nested objects and arrays of objects are not allowed.
Each value is a string, number, boolean, datetime, or null. A null value is treated as missing and dropped.
A metadata object has 20 or fewer keys.
Keys are non-empty strings and cannot start with _ (reserved for system fields). Keep each key to 255 characters or fewer.

A request that breaks these rules returns a 422 validation error describing the problem.
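
To catch rule violations before sending a request, you can check a metadata object locally. The following is a minimal sketch, not part of the SDK: check_metadata is a hypothetical helper. It assumes that datetimes are passed as Python datetime objects, that every list value is invalid (a strict reading of the value rule), and that null values still count toward the 20-key limit. None of these assumptions is confirmed by the rules above.

```python
from datetime import datetime

MAX_KEYS = 20                 # "20 or fewer keys"
RECOMMENDED_KEY_LENGTH = 255  # recommendation, not a hard limit

# Assumption: datetimes are passed as datetime objects; lists are rejected.
ALLOWED_VALUE_TYPES = (str, int, float, bool, datetime)


def check_metadata(metadata):
    """Return a list of problems found in a metadata dict (empty means OK)."""
    if not isinstance(metadata, dict):
        return ["metadata must be a flat JSON object (a dict)"]

    problems = []
    if len(metadata) > MAX_KEYS:
        problems.append(f"too many keys: {len(metadata)} > {MAX_KEYS}")

    for key, value in metadata.items():
        if not isinstance(key, str) or key == "":
            problems.append(f"key {key!r} must be a non-empty string")
            continue
        if key.startswith("_"):
            problems.append(f"key {key!r} starts with '_' (reserved)")
        if len(key) > RECOMMENDED_KEY_LENGTH:
            problems.append(f"key {key[:20]!r}... exceeds the recommended length")
        # None is allowed: the service treats it as missing and drops it.
        if value is not None and not isinstance(value, ALLOWED_VALUE_TYPES):
            problems.append(
                f"value for {key!r} must be a string, number, boolean, "
                "datetime, or null"
            )
    return problems
```

An empty list means the object passed every local check. The service's 422 response remains the authoritative validation.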

List and filter chunks

Use list_chunks to return a paginated list of chunks from a vector database. Filter by file, by specific chunk IDs, or by metadata fields. When you provide more than one filter, a chunk must match all of them to be returned.

from seekrai import SeekrFlow

client = SeekrFlow()

result = client.vector_database.list_chunks(
    database_id="<database-id>",
    metadata={"year": {"$gte": 2023}, "doc_type": "meeting_notes"},
    limit=20,
    offset=0,
)

print(f"{result.total} matching chunks")
for chunk in result.data:
    print(chunk.chunk_id, chunk.metadata)

Parameters:

Parameter	Description
`database_id`	ID of the vector database.
`file_id`	Optional. Return only chunks from this file.
`chunk_ids`	Optional. Return only these chunk IDs.
`metadata`	Optional. Filter by metadata key-value pairs.
`limit`	Maximum results to return. Default 20, maximum 100.
`offset`	Pagination offset. Default 0.
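
Because limit is capped at 100, reading a larger result set means stepping offset forward one page at a time. The following is a minimal sketch: iter_chunks is a hypothetical helper, not an SDK method. It assumes that result.total is the number of matching chunks across all pages, as the example above suggests.

```python
def iter_chunks(client, database_id, page_size=100, **filters):
    """Yield every chunk that matches the filters, fetching one page at a time.

    `filters` is passed through to list_chunks (file_id, chunk_ids, metadata).
    """
    offset = 0
    while True:
        page = client.vector_database.list_chunks(
            database_id=database_id,
            limit=page_size,
            offset=offset,
            **filters,
        )
        yield from page.data
        offset += len(page.data)
        # Stop on an empty page, or once every matching chunk has been seen.
        # Assumption: page.total counts matches across all pages.
        if not page.data or offset >= page.total:
            break
```

With a SeekrFlow client, usage could look like `for chunk in iter_chunks(client, "<database-id>", metadata={"doc_type": "SOP"}): ...`.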

Metadata filter operators: Match a value exactly by giving it directly ({"doc_type": "SOP"}), or use an operator object for comparisons:

Operator	Meaning
`$in`	Value is one of a list, for example `{"severity": {"$in": ["P0", "P1"]}}`.
`$gt`	Greater than.
`$gte`	Greater than or equal to.
`$lt`	Less than.
`$lte`	Less than or equal to.
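
To make the filter semantics concrete, here is a small local sketch of how a filter could be evaluated against one chunk's metadata. It is an illustration only, not the service's implementation: matches is a hypothetical function. It assumes that a chunk missing the filtered key does not match, and that several operators in one operator object are combined with AND.

```python
import operator

# Comparison operators from the table above, mapped to Python functions.
_COMPARISONS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata, filters):
    """Return True if `metadata` satisfies every condition in `filters`."""
    for key, condition in filters.items():
        if key not in metadata:
            return False  # assumption: a missing key never matches
        value = metadata[key]

        if not isinstance(condition, dict):
            # A bare value means exact match.
            if value != condition:
                return False
            continue

        for op, operand in condition.items():
            if op == "$in":
                ok = value in operand
            elif op in _COMPARISONS:
                ok = _COMPARISONS[op](value, operand)
            else:
                raise ValueError(f"unsupported operator: {op}")
            if not ok:
                return False
    return True
```

For example, the filter `{"year": {"$gte": 2023}, "doc_type": "meeting_notes"}` accepts a chunk only when both conditions hold.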

Each chunk in the response includes its chunk_id, file_id, text, metadata, hierarchy, and locations.

Edit chunk metadata

Use update_metadata to overwrite the metadata on chunks in a vector database. Target the chunks with either chunk_ids or file_ids, but not both. Editing is scoped to the vector database, so updating a file’s chunks here does not affect chunks of the same file in another vector database.

from seekrai import SeekrFlow

client = SeekrFlow()

# Update specific chunks
client.vector_database.update_metadata(
    database_id="<database-id>",
    chunk_ids=["chunk-1", "chunk-2"],
    metadata={"source": "manual_review", "category": "support_doc"},
)

# Update every chunk belonging to one or more files
client.vector_database.update_metadata(
    database_id="<database-id>",
    file_ids=["file-abc123"],
    metadata={"department": "finance", "year": 2024},
)

This operation is destructive. The metadata object you provide replaces all existing metadata on each targeted chunk, so include any fields you want to keep. To find the chunks and current metadata to edit, list them first with list_chunks.
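
Because each update replaces the whole metadata object, changing one field safely means reading the current metadata, merging your changes into it, and writing the result back. The following is a minimal sketch: merge_metadata is a hypothetical helper, not an SDK method. It assumes at most 100 chunk IDs per call (the list_chunks limit). It also drops keys that start with _ as a precaution, in case system fields appear in the returned metadata.

```python
def merge_metadata(client, database_id, chunk_ids, changes):
    """Apply `changes` on top of each chunk's existing metadata."""
    current = client.vector_database.list_chunks(
        database_id=database_id,
        chunk_ids=chunk_ids,
        limit=len(chunk_ids),  # assumption: no more than 100 IDs
    )
    for chunk in current.data:
        # Keep existing user fields; skip reserved "_" keys defensively.
        kept = {
            key: value
            for key, value in (chunk.metadata or {}).items()
            if not key.startswith("_")
        }
        merged = {**kept, **changes}
        # One request per chunk, because each chunk may hold different fields.
        client.vector_database.update_metadata(
            database_id=database_id,
            chunk_ids=[chunk.chunk_id],
            metadata=merged,
        )
```

This sends one update per chunk, which is simple but slow for large batches. When all targeted chunks should end up with identical metadata, a single update_metadata call is enough.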

Delete metadata

Metadata is removed automatically when its chunks are deleted. Deleting a file from a vector database removes that file’s chunks and their metadata. See Create and populate a vector database for file deletion.
