Fine-Tuning
An introduction to fine-tuning, its components, and various applications.
Understanding model specialization: A guide to fine-tuning with SeekrFlow
This guide explains how fine-tuning transforms generic AI models into specialized tools with domain expertise, the automated SeekrFlow process for creating accurate training datasets, and when to implement fine-tuning for your specific use cases.
How fine-tuning works
Fine-tuning means adapting a pre-trained, generalist model to perform better at a given task: for example, developing a training resource for CBP officers with deep knowledge of the operational strategies and organizational structures of international drug cartels. Generalist models are powerful, but they lack access to highly sensitive information (for good reason), and they aren't directly optimized for factual accuracy. They can, however, be fine-tuned, or taught to specialize in a certain task or domain, while retaining most of their original capabilities.
Manually preparing the training datasets needed for fine-tuning is time-consuming and labor-intensive. Another crucial aspect of building specialized use cases is making sure the model adheres to your organization's specific guidelines, which can include regulatory compliance (e.g., HIPAA, GDPR), company policies, or ethical standards. SeekrFlow automates all of these key steps to create trustworthy, accurate data for instruction fine-tuning (a code sketch of the pipeline follows this list), including:
- File upload and processing
- Structured Markdown file generation
- Parquet dataset generation for supervised instruction fine-tuning
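As a rough illustration, the sketch below strings those three steps together. The client class and method names (`SeekrFlowClient`, `upload_file`, `generate_markdown`, `generate_parquet`) are hypothetical placeholders, not the actual SeekrFlow SDK; see the API reference for real signatures.

```python
# Hypothetical sketch of the automated dataset pipeline described above.
# Class and method names are illustrative placeholders, not the real SDK.
from seekrflow_sketch import SeekrFlowClient  # hypothetical module

client = SeekrFlowClient(api_key="YOUR_API_KEY")

# 1. File upload and processing
source = client.upload_file("cartel_operations_briefing.pdf")

# 2. Structured Markdown file generation
markdown = client.generate_markdown(file_id=source.id)

# 3. Parquet dataset generation for supervised instruction fine-tuning
dataset = client.generate_parquet(markdown_id=markdown.id)

print(f"Training dataset ready: {dataset.id}")
```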
What is instruction fine-tuning and what are QA pairs?
Instruction fine-tuning trains a generalist model on a dataset composed of high-quality question-and-answer (QA) pairs, often with the goal of creating a helpful assistant. This is the default fine-tuning method used by SeekrFlow.
To fine-tune, you'll use a structured file generated by the Data Engine that contains these QA pairs.
Instruction fine-tuning updates model parameters through a systematic training process by providing structured examples, each consisting of a sample input and a desired output, like this example generated from American Airlines policy data:
```json
{
  "finetuning_data_element_id": "qa_1234",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful chatbot trained to assist American Airlines customers with questions about airline policies."
    },
    {
      "role": "user",
      "content": "What is your baggage policy?"
    },
    {
      "role": "assistant",
      "content": "For checked bags, American Airlines allows one free bag for most passengers, with size and weight limits, and charges fees for additional bags and overweight/oversized items. For carry-ons, you can bring one carry-on and one personal item, with size restrictions."
    }
  ]
}
```
Each pair is run through the model and serves as a training signal to adjust model weights. This structured, lightweight approach leads to better model understanding - and better results.
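To make that training signal concrete, here is a minimal sketch of how a supervised fine-tuning step typically computes loss only on the assistant's tokens. The tokenizer, model, and masking convention are generic PyTorch/Hugging Face-style assumptions, not SeekrFlow's internal training code.

```python
# Illustrative sketch of one supervised fine-tuning step on a QA pair.
# Generic PyTorch; NOT SeekrFlow's internal implementation.
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # conventional value for tokens excluded from the loss

def training_step(model, tokenizer, system, user, assistant):
    # Concatenate the conversation into a single token sequence.
    prompt_ids = tokenizer(f"{system}\n{user}\n", return_tensors="pt").input_ids
    answer_ids = tokenizer(assistant, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, answer_ids], dim=1)

    # Mask the prompt so only the assistant's tokens contribute to the loss.
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = IGNORE_INDEX

    # Standard next-token prediction: shift logits and labels by one position.
    logits = model(input_ids).logits
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.shape[-1]),
        labels[:, 1:].reshape(-1),
        ignore_index=IGNORE_INDEX,
    )
    loss.backward()  # gradients nudge the weights toward the desired output
    return loss
```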
Combined with the system prompt, the model can now respond in detail:
```json
{
  "user": "What is your baggage policy?",
  "response": "Our baggage policy allows for a single bag or personal item to be brought on board for free. You can also purchase additional baggage allowances in advance for $4-$20 per bag, depending on the airport and the time of year. However, it's worth noting that some airports have specific restrictions on the number of bags that can be brought on board, and there may be additional fees for excess bags. It's always a good idea to check the airline's policies and the restrictions at the airport you're flying out of to avoid any issues. Additionally, if you're flying with a partner airline, you should contact them directly for more information on their baggage policies."
}
```
Compare the output against a sample response from a base model to see the value of this approach:
```json
{
  "user": "What is your baggage policy?",
  "response": "As a responsible and ethical AI language model, I don't have a baggage policy as I am not a physical entity that can carry luggage or cargo. Additionally, I don't have the ability to interact with the physical world or transport objects. My purpose is to assist and provide helpful responses to users' questions and tasks, so please feel free to ask me anything!"
}
```
Context-grounded fine-tuning for dynamic information access
While standard fine-tuning embeds knowledge directly into model parameters, our context-grounded fine-tuning solution provides an alternative approach for handling frequently changing information. This method enables models to:
- Access external knowledge bases during inference
- Incorporate the most current information without retraining
- Ground responses in verified data sources
- Maintain accuracy as underlying documents evolve
Context-grounded fine-tuning works alongside traditional fine-tuning techniques, giving you the flexibility to balance embedded expertise with dynamic information retrieval for use cases that require real-time data access.
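To make the pattern concrete, here is a minimal, generic sketch of grounding a response in an external knowledge base at inference time. The retriever and prompt template are illustrative placeholders; SeekrFlow's actual context-grounded API may differ.

```python
# Generic sketch of context-grounded inference: retrieve current documents,
# then ground the model's answer in them. Names are illustrative placeholders.

def retrieve_documents(knowledge_base: list[str], query: str, k: int = 3) -> list[str]:
    """Toy keyword retriever; production systems typically use vector search."""
    scored = sorted(
        knowledge_base,
        key=lambda doc: sum(word in doc.lower() for word in query.lower().split()),
        reverse=True,
    )
    return scored[:k]

def grounded_prompt(query: str, docs: list[str]) -> str:
    """Build a prompt that instructs the model to answer from the context only."""
    context = "\n\n".join(docs)
    return (
        "Answer using ONLY the context below. If the answer is not in the "
        f"context, say so.\n\nContext:\n{context}\n\nQuestion: {query}"
    )

# The knowledge base can be refreshed at any time without retraining the model.
kb = [
    "Baggage policy (updated quarterly): one free checked bag for most passengers ...",
    "Carry-on limits: one carry-on and one personal item, with size restrictions ...",
]
prompt = grounded_prompt("What is your baggage policy?",
                         retrieve_documents(kb, "baggage policy"))
# `prompt` is then sent to the fine-tuned model at inference time.
```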
When to use fine-tuning
Many developers will begin a project by prompting, prompt chaining, and/or function calling first. These can sometimes yield good results on their own, or be used in combination with fine-tuning to improve results later.
While these can be a great starting point, there are many use cases where the generalist model simply hasn't seen the data: perhaps it's proprietary or sensitive (the CBP officer assistant example from the top of this page comes to mind), or it doesn't exist in the public domain at all. Prompting a base model won't get you very far in any of those scenarios. You'll want to gather your own data and use it to fine-tune a model for best results.
Other ways fine-tuning can improve your results include:
- Updating knowledge
- Improving output reliability
- Customizing behavior for specific needs (e.g., brand voice) by setting the style, tone, or format
- Optimizing for tasks that are more easily demonstrated than described (e.g., legal contract analysis, case law research, compliance)
- Reducing cost and latency compared to a larger, more expensive model
- Handling edge cases
And if you do want to experiment before fine-tuning...well, look no further:
- Prototype quickly with prompts in our Playground
- (If you haven't signed up yet, see Quickstart for instructions)
- Build prompt chains and more complex logic on SeekrFlow using our LangChain integration package
- Create custom function calls to interface with your own code and external sources
You can save your prompts for reuse as fine-tuning goals, or as fine-tuning data. Good prompts and (corrected) failed prompts can both become useful data for fine-tuning a model to a complex, nuanced task.
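As a small sketch of that workflow: collect saved prompts and their corrected responses into the QA-pair format shown earlier. The filename and record layout below mirror the example above; confirm the exact schema SeekrFlow's Data Engine expects before uploading.

```python
# Sketch: turn saved prompts and corrected responses into QA-pair records
# matching the message format shown earlier. The JSONL layout here is
# illustrative; confirm the schema SeekrFlow expects before uploading.
import json

SYSTEM = ("You are a helpful chatbot trained to assist American Airlines "
          "customers with questions about airline policies.")

saved_prompts = [
    # (user prompt, corrected/approved response)
    ("What is your baggage policy?",
     "American Airlines allows one free checked bag for most passengers ..."),
]

with open("finetuning_data.jsonl", "w") as f:
    for i, (question, answer) in enumerate(saved_prompts):
        record = {
            "finetuning_data_element_id": f"qa_{i:04d}",
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ],
        }
        f.write(json.dumps(record) + "\n")
```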