What is Principle Alignment?

Introduction

Principle alignment is Seekr's technology for automatically aligning a generalist base foundation model to the principles of a specific domain, with minimal human intervention. The result is a fine-tuned specialist model that adheres to these principles.

Automated alignment

A SeekrFlow user provides a set of high-level principles that they expect the model to adhere to, or specific policy documents or guidelines within which the model must operate. (Here, we refer to all of these as principles.)

Once the principles have been supplied, an alignment workflow begins automatically. The workflow starts by ingesting and structuring the principles (e.g. in a hierarchy, a knowledge graph, or a vector index). It then enters a data creation loop, in which the principles and their boundaries are explored.

This process will create the data to fine-tune a model to adhere to the principles, thus allowing us to obtain a specialist (or domain expert) model with minimal human intervention.

The diagram below depicts this process.

[Figure: Automatic Data Generation with Principle Alignment]

Importantly, this process removes the often laborious and expensive task of manually collecting, curating, and annotating data.

Agentic workflows

A key feature of principle alignment and its ability to automatically create data for a specialist model is the use of agentic tool-based workflows.

These workflows involve one or more agents and a base/generalist LLM that may initially not have knowledge of the principles.

In addition to the structured principles, the agent has access to external tools such as web search APIs, knowledge graphs, calculators, and code interpreters.

The core idea is that, given a task definition, the agent drafts a "plan" of the steps it needs to take to solve the problem and the order in which to take them. The tools allow the agent to iteratively research, generate, critique, and refine its understanding of the principles, which in turn allows it to create high-quality, domain-specific data for fine-tuning the specialist model.
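The plan, generate, critique, and refine cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SeekrFlow's implementation: `run_agent`, `Draft`, the tool names `research` and `generate`, and the stub callables are all hypothetical, and the "plan" here is a fixed ordering rather than one drafted by an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Draft:
    """One candidate training example plus the critiques it has received."""
    principle: str
    text: str
    critiques: List[str] = field(default_factory=list)


def run_agent(
    principle: str,
    tools: Dict[str, Callable[[str], str]],
    critique: Callable[[Draft], str],
    refine: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Draft:
    # Plan: choose which tools to call and in what order.
    # (Hypothetical fixed plan; a real agent would draft this itself.)
    plan = [name for name in ("research", "generate") if name in tools]

    # Execute the plan: each step's output feeds the next step.
    context = principle
    for step in plan:
        context = tools[step](context)
    draft = Draft(principle=principle, text=context)

    # Critique and refine until the critic returns no feedback
    # or the round budget is exhausted.
    for _ in range(max_rounds):
        feedback = critique(draft)
        if not feedback:
            break
        draft.critiques.append(feedback)
        draft.text = refine(draft.text, feedback)
    return draft


# Stub tools standing in for web search, an LLM, and so on (assumptions).
tools = {
    "research": lambda principle: f"notes on: {principle}",
    "generate": lambda notes: f"Q&A drawn from ({notes})",
}


def critic(draft: Draft) -> str:
    # Empty string means "no objection".
    return "" if draft.text.endswith("[revised]") else "needs a citation"


result = run_agent(
    "Always cite the policy section",
    tools,
    critique=critic,
    refine=lambda text, feedback: f"{text} [revised]",
)
print(result.text)
print(result.critiques)
```

Keeping the critic and the refiner as separate callables makes it possible to swap either one for a human reviewer without changing the loop.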

Human-in-the-loop

Another key feature of principle alignment is the ability to intervene in the data creation process by bringing a human into the loop.

For example, subject matter experts can iteratively provide feedback on the model's understanding of the principles, in the following manner:

  1. Review the generated synthetic data
  2. Trace the most influential portions of text, in individual questions or in the document's intermediate graphical representation, that led to specific answers
  3. Intervene and edit question-level or document-level text for clarity and accuracy
  4. Regenerate synthetic data from our agentic system based on their edits

This gives SeekrFlow users the ability to intervene iteratively and refine the synthetic data generation so that the principles are properly represented.
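As a rough illustration of steps 3 and 4, the sketch below applies an expert's edits to source passages and regenerates only the synthetic records traced back to those passages. Everything here is an assumption made for illustration: the record fields (`source_id`, `question`, `answer`), the `regenerate_after_edits` helper, and the `generate` callable are hypothetical and do not reflect SeekrFlow's internal data model.

```python
from typing import Callable, Dict, List


def regenerate_after_edits(
    sources: Dict[str, str],
    records: List[dict],
    edits: Dict[str, str],
    generate: Callable[[str], str],
) -> List[dict]:
    """Apply expert edits and regenerate only the records traced to edited passages."""
    # Edited passages override the originals; untouched passages are kept.
    updated_sources = {**sources, **edits}
    refreshed = []
    for record in records:
        if record["source_id"] in edits:
            passage = updated_sources[record["source_id"]]
            refreshed.append(
                {**record, "answer": generate(passage), "regenerated": True}
            )
        else:
            refreshed.append({**record, "regenerated": False})
    return refreshed


# Hypothetical source passages and the synthetic records derived from them.
sources = {
    "sec-1": "Refunds within 30 days.",
    "sec-2": "Shipping takes 5 days.",
}
records = [
    {"source_id": "sec-1", "question": "What is the refund window?", "answer": "30 days"},
    {"source_id": "sec-2", "question": "How long is shipping?", "answer": "5 days"},
]

# A subject matter expert corrects one passage.
edits = {"sec-1": "Refunds within 14 days of delivery."}

refreshed = regenerate_after_edits(
    sources,
    records,
    edits,
    generate=lambda passage: f"Per policy: {passage}",  # stand-in for the agentic system
)
for row in refreshed:
    print(row)
```

Regenerating only the affected records keeps the review cycle short, since the expert sees just the data their edit changed.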

Structuring with domain principles

This section provides guidance on (1) converting a markdown (.md) document into a tree structure representation (.json) of the document, and (2) submitting a job in which the SeekrFlow Principle Alignment system uses the tree structure file to generate synthetic data. This data is used in a later stage of SeekrFlow for training a model that is aligned with the principles and knowledge contained in the document.

To submit a principle alignment job, the source data must first be converted to a JSON tree structure file. Given a markdown file (example .md file), this script converts the document into a tree structure with the following command:

python convert_md_to_json.py file_in.md -o file_out.json

The output of this script is a .json file; see here for an example.
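To give a feel for what a heading-based tree conversion involves, here is a minimal sketch. It is not the `convert_md_to_json.py` script itself, and the output schema (`title`, `level`, `text`, `children`) is an assumption chosen for illustration; the real script's JSON layout may differ. The sketch also ignores edge cases such as `#` characters inside fenced code blocks.

```python
import json
import re
from typing import Any, Dict, List

# ATX-style markdown headings: one to six '#' characters, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def markdown_to_tree(markdown: str) -> Dict[str, Any]:
    """Nest sections under their parent headings (hypothetical schema)."""
    root: Dict[str, Any] = {"title": "root", "level": 0, "text": "", "children": []}
    stack: List[Dict[str, Any]] = [root]  # path from the root to the current section

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            node = {
                "title": match.group(2).strip(),
                "level": level,
                "text": "",
                "children": [],
            }
            # Pop back to the nearest ancestor with a shallower heading level.
            # The root has level 0, so the stack never empties.
            while stack[-1]["level"] >= level:
                stack.pop()
            stack[-1]["children"].append(node)
            stack.append(node)
        elif line.strip():
            # Body text is attached to the most recent heading.
            current = stack[-1]
            current["text"] = f"{current['text']} {line.strip()}".strip()
    return root


doc = "# Policy\nIntro text.\n## Refunds\nWithin 30 days.\n## Shipping\nFive days.\n"
tree = markdown_to_tree(doc)
print(json.dumps(tree, indent=2))
```

The stack holds the chain of open sections, so a new heading attaches to the closest enclosing section regardless of how many levels were skipped.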

Now that the JSON file is ready, we use the SeekrFlow client to upload it and kick off a principle alignment job. As part of the client.alignment.generate() call, we specify an instruction describing the model we want (e.g. a chatbot) that specializes in the domain knowledge of our source document. This statement of our end goal guides the generation of synthetic data, which is used in the subsequent model training step. An example of this usage of the client:

import os
from seekrai import SeekrFlow

client = SeekrFlow(api_key=os.environ.get("SEEKR_API_KEY"))

# Upload the alignment .json tree generated from the .md file
# (file_out.json is the output of convert_md_to_json.py above)
file = client.files.upload("file_out.json", purpose="alignment")

# Start the alignment job to generate synthetic data using the alignment file
alignment_job = client.alignment.generate(instructions="I want to create a chatbot that specializes in domain X.", files=[file.id])

We can list all previously created alignment jobs, or get information about a particular one. Once a job has completed, we can retrieve the generated alignment file and use it for fine-tuning, or download it, edit it, and re-upload it for fine-tuning:

# list all alignment jobs
client.alignment.list()

# retrieve specific alignment job to observe its status
client.alignment.retrieve(alignment_job.id)

# if alignment job has completed:
# will show ID of the recently generated .parquet file, e.g. file-123
client.files.list()

# we can use this file for fine-tuning, and we can also download it
client.files.retrieve("file-123")

# new local .parquet file; can inspect/edit it locally and re-upload it
new_file = client.files.upload("my_edited_file.parquet")
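Because an alignment job runs asynchronously, it can be convenient to poll its status rather than check by hand. The helper below is a generic sketch, not part of the SeekrFlow SDK: `wait_for_status` is a hypothetical name, and the status strings `"completed"` and `"failed"` are assumptions. In practice, `fetch_status` might wrap `client.alignment.retrieve(alignment_job.id)` and read whichever status field the returned object exposes.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    fetch_status: Callable[[], str],
    terminal: Iterable[str] = ("completed", "failed"),  # assumed status names
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> str:
    """Call fetch_status until it returns a terminal status or the timeout elapses."""
    terminal_set = set(terminal)
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in terminal_set:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        time.sleep(interval_s)


# Stand-in for a real job: yields a fixed sequence of statuses.
fake_statuses = iter(["queued", "running", "completed"])
final = wait_for_status(lambda: next(fake_statuses), interval_s=0.0)
print(final)
```

Passing the status lookup in as a callable keeps the helper independent of any particular client, so it can be tested without network access.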