Standard Instruction Fine-Tuning Data

Get higher-quality results with curated question-answer examples that allow the model to develop more accurate responses within specific domains or tasks.

This guide walks you through how to prepare and upload files to generate a standard instruction fine-tuning dataset that you can use to adapt a generalist model to a task of your choosing.

How it Works

Fine-tuning simply means adapting a pre-trained, generalist model to be more effective at a given task; for example, developing a training resource for CBP officers that has deep knowledge of the operational strategies and organizational structures of international drug cartels. Generalist models are powerful but lack access to highly sensitive information (for good reason). They're also not directly optimized for factual accuracy. However, they can be fine-tuned, or taught to specialize in a certain task or domain, while retaining most of their original capabilities.

Preparing and creating these datasets manually is time-consuming and labor-intensive. Another crucial aspect of building specialized use cases is making sure the model adheres to your organization's specific guidelines, which can include regulatory compliance (e.g., HIPAA, GDPR), company policies, or ethical standards. SeekrFlow automates all of these key steps to create trustworthy, accurate data for instruction fine-tuning, including:

  1. File Upload and Processing
  2. QA Pair Generation
  3. Parquet Dataset Generation for Instruction Fine-Tuning

What Are Instruction Fine-Tuning and QA Pairs?

Instruction fine-tuning trains a generalist model on a dataset composed of high-quality question-and-answer (QA) pairs, often with the goal of creating a helpful assistant. This is the default fine-tuning method used by SeekrFlow.

A Parquet QA Pair File is a structured dataset containing question-and-answer (Q&A) pairs, stored in .parquet format for optimized data storage.

Instruction fine-tuning with QA pairs updates model parameters through a systematic training process by providing structured examples, each consisting of a sample input and a desired output, like this example from a QA pair file generated from American Airlines policy data:

{
  "finetuning_data_element_id": "qa_1234",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful chatbot trained to assist American Airlines customers with questions about airline policies."
    },
    {
      "role": "user",
      "content": "What is your baggage policy?"
    },
    {
      "role": "assistant",
      "content": "For checked bags, American Airlines allows one free bag for most passengers, with size and weight limits, and charges fees for additional bags and overweight/oversized items. For carry-ons, you can bring one carry-on and one personal item, with size restrictions."
    }
  ]
}

Each pair is run through the model and serves as a training signal to adjust model weights. This structured, lightweight approach leads to better model understanding - and better results.
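To see what the generated dataset looks like in practice, you can load the Parquet file and inspect a few records. This is a minimal sketch, assuming the generated file has been downloaded locally (the local filename is hypothetical) and that its columns match the sample above (`finetuning_data_element_id` and `messages`):

import pandas as pd  # requires pandas and pyarrow

# Hypothetical local copy of the generated QA pair file
df = pd.read_parquet("raft-qa-pairs.parquet")

print(df.shape)               # number of QA pairs and columns
print(df.columns.tolist())

# Inspect the first training example: a system prompt, a user question,
# and the desired assistant answer (schema assumed from the sample above)
first = df.iloc[0]
print(first["finetuning_data_element_id"])
for message in first["messages"]:
    print(f'{message["role"]}: {message["content"][:80]}')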

Combined with the system prompt, the fine-tuned model can now respond in detail:

{
  "user": "What is your baggage policy?",
  "response": "Our baggage policy allows for a single bag or personal item to be brought on board for free. You can also purchase additional baggage allowances in advance for $4-$20 per bag, depending on the airport and the time of year. However, it's worth noting that some airports have specific restrictions on the number of bags that can be brought on board, and there may be additional fees for excess bags. It's always a good idea to check the airline's policies and the restrictions at the airport you're flying out of to avoid any issues. Additionally, if you're flying with a partner airline, you should contact them directly for more information on their baggage policies."
}

Compare the output against a sample response from a base model to see the value of this approach:

{
  "user": "What is your baggage policy?",
  "response": "As a responsible and ethical AI language model, I don't have a baggage policy as I am not a physical entity that can carry luggage or cargo. Additionally, I don't have the ability to interact with the physical world or transport objects. My purpose is to assist and provide helpful responses to users' questions and tasks, so please feel free to ask me anything!"
}


When to Use Fine-tuning

Many developers will begin a project by prompting, prompt chaining, and/or function calling first. These can sometimes yield good results on their own, or be used in combination with fine-tuning to improve results later.

While these can be a great starting point, there are many use cases where the generalist model hasn't seen the data - perhaps it's proprietary, or sensitive information (the drug cartel assistant example from the top of this page comes to mind). Maybe the information you need just doesn't exist in the public domain, for whatever reason. Prompting a base model won't get you very far in any of those scenarios. You'll want to gather your own data and use it to fine-tune a model for best results.

Other ways fine-tuning can improve your results include:

  • Improving output reliability
  • Setting the style, tone, or format
  • Performing a new task that’s more easily demonstrated than described
  • Reducing cost and latency over a more expensive model
  • Handling edge cases

And if you do want to experiment before fine-tuning...well, look no further:

  • Prototype quickly with prompts in our Playground
    • (If you haven't signed up yet, see Quickstart for instructions)
  • Build prompt chains and more complex logic on SeekrFlow using our LangChain integration package
  • Create custom function calls to interface with your own code and external sources.

You can save your prompts for reuse as fine-tuning goals, or as fine-tuning data. Good prompts and (corrected) failed prompts can both become useful data for fine-tuning a model to a complex, nuanced task.
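If you already have prompts and good (or corrected) responses saved, you can convert them into the same chat-message structure shown earlier so they can serve as fine-tuning data. This is a minimal sketch, assuming a simple list of prompt/response strings; the output filename and the ID format are hypothetical, and the field names mirror the QA pair sample above. Whether you keep the result as JSON Lines or convert it to Parquet depends on your workflow.

import json

# Hypothetical saved prompts and their corrected/approved responses
saved_examples = [
    ("What is your baggage policy?",
     "One free checked bag for most passengers; one carry-on plus a personal item."),
]

system_prompt = "You are a helpful chatbot trained to assist customers with questions about airline policies."

records = []
for i, (question, answer) in enumerate(saved_examples):
    records.append({
        "finetuning_data_element_id": f"qa_{i:04d}",  # ID format assumed from the sample above
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ],
    })

# Write as JSON Lines; convert to Parquet later if your workflow requires it
with open("prompt_examples.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")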


Fully Automated Data Preparation and Creation with the Data Engine

Traditional fine-tuning dataset creation involves a lot of manual labor: data collection, cleaning, labeling, format conversion, and so forth. This is where SeekrFlow's AI-Ready Data Engine shines. It processes files in a variety of formats (PDF, Markdown, JSON, DOCX) and converts them into a fine-tuning-ready file in a matter of minutes. Read on to find out how to generate your own dataset for standard instruction fine-tuning.

Step 1: Generating Fine-Tuning Data

Endpoint: POST /v1/flow/alignment/generate Generate IFT pairs

Using your properly formatted file, you can start the data generation process to create document chunks.

Note: See the previous page, Data Preparation, for complete instructions on how to process and convert all of your files before starting this step.

Collect Your Markdown File ID
Use the ID of the final Markdown file from either Step 2 or the ingestion output in Step 3.

import os
import requests

url = "https://flow.seekr.com/v1/flow/alignment/generate"

# Authentication header; the exact header format is an assumption -- check the
# SeekrFlow API reference and set SEEKR_API_KEY in your environment first
headers = {"Authorization": f"Bearer {os.environ['SEEKR_API_KEY']}"}

payload = {
    "files": [
        "file-123456789"
    ],
    "instructions": "I want to train a chatbot to answer questions about international drug cartels.",
    "purpose": "alignment"
}

response = requests.post(url, json=payload, headers=headers)

print("Status Code:", response.status_code)
print("Response:", response.json())

Sample Response:

{
    "id": "aj-123456789",
    "created_at": "example_timestamp",
    "status": "queued"
}
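The `id` field in this response is the alignment job ID; you'll need it to check progress in Step 2. A minimal sketch of capturing it from the response above:

# Capture the alignment job ID returned by the generate call above
job = response.json()
job_id = job["id"]        # e.g., "aj-123456789"
print("Alignment job queued:", job_id)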

Step 2: Checking Job Status (Optional)

Endpoint: GET /v1/flow/alignment/{job_id} Fetch Job Details

While the alignment job is running, you can poll its status to see when it is complete.

import os
import requests

job_id = "aj-123456789"
url = f"https://flow.seekr.com/v1/flow/alignment/{job_id}"

# Same authentication header as in Step 1 (format assumed; see the API reference)
headers = {"Authorization": f"Bearer {os.environ['SEEKR_API_KEY']}"}

response = requests.get(url, headers=headers)
print("Status Code:", response.status_code)
print("Response:", response.json())

Sample Response:

{
  "id": "aj-123456789",
  "created_at": "example_timestamp",
  "status": "completed"
}

Once the status shows completed, the Parquet file will appear in your files list (e.g., "filename": "file-d7166390-962f-4d22-93a2-265d93c114e6-raft-qa-pairs.parquet").
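If you'd rather not check the status by hand, a small loop can poll the job until it finishes. This is a minimal sketch that reuses the job-status endpoint above; the polling interval and the failure status values are assumptions, so confirm them against the API reference.

import os
import time
import requests

job_id = "aj-123456789"
url = f"https://flow.seekr.com/v1/flow/alignment/{job_id}"

# Same authentication header as above (format assumed; see the API reference)
headers = {"Authorization": f"Bearer {os.environ['SEEKR_API_KEY']}"}

# Poll every 30 seconds until the job reaches a terminal state
while True:
    job = requests.get(url, headers=headers).json()
    status = job["status"]
    print("Current status:", status)
    if status == "completed":
        break
    if status in ("failed", "cancelled"):  # failure states assumed; check the API reference
        raise RuntimeError(f"Alignment job ended with status: {status}")
    time.sleep(30)

print("QA pair generation finished; check your files list for the .parquet output.")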


The stacked approach: Fine-Tuning + Context-Grounded Fine-Tuning

Currently, SeekrFlow offers Supervised Fine-Tuning (SFT) as the default method. Performing SFT before running a context-grounded fine-tuning job can significantly boost model performance. The first fine-tuning run helps the model learn the correct patterns for your task, setting up the subsequent context-grounded fine-tuning job for success.

Recommended Workflow:

  1. Fine-tune the base model using the QA pair file generated from your source documents. Focus on data quality and task relevance.
  2. With the fine-tuned model as the starting point, generate context-grounded data and run a second job with it to further adjust the model for best results.