Release Notes | August 2025

🟢 New Feature

GRPO Fine-Tuning

We’ve introduced Group Relative Policy Optimization (GRPO) Fine-Tuning in SeekrFlow. GRPO is a powerful reinforcement learning technique that unlocks advanced reasoning capabilities in large language models (LLMs), transforming models from passive information retrievers into active problem-solvers that can handle complex, verifiable tasks with greater precision and reliability.
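
At its core, GRPO replaces a separate learned value (critic) model with a group-relative baseline: for each prompt, the model samples a group of completions, each completion is scored with a verifiable reward, and every reward is normalized against the group’s mean and spread to produce the advantage used in the policy update. The sketch below illustrates that idea only; it is not SeekrFlow’s implementation, and the exact-match reward and helper names are illustrative assumptions.

```python
# Minimal sketch of the group-relative signal at the heart of GRPO.
# Illustrative only -- not SeekrFlow's implementation. The reward function
# and helper names are hypothetical stand-ins.
from statistics import mean, pstdev

def verifiable_reward(completion: str, reference_answer: str) -> float:
    """Reward 1.0 if the completion ends with the reference answer, else 0.0."""
    return 1.0 if completion.strip().endswith(reference_answer) else 0.0

def group_relative_advantages(completions: list[str], reference_answer: str) -> list[float]:
    """Score a group of completions sampled for the same prompt, then normalize
    each reward against the group's mean and standard deviation. The resulting
    advantages weight the policy update: completions that beat the rest of
    their own group are reinforced, weaker ones are discouraged, with no
    separate critic model required."""
    rewards = [verifiable_reward(c, reference_answer) for c in completions]
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + 1e-6) for r in rewards]

# Example: four sampled answers to "What is 12 * 7?", graded against "84".
print(group_relative_advantages(["12 * 7 = 84", "It's 84", "It's 74", "96"], "84"))
```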

This method is effective in domains where answers can be definitively validated, such as mathematics, coding, and other structured problem-solving scenarios. GRPO training follows a process similar to standard fine-tuning, with a few targeted modifications, and is now fully supported in SeekrFlow.
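
To make “a few targeted modifications” concrete, the hypothetical sketch below contrasts a standard fine-tuning configuration with a GRPO one. Every key, value, and file name here is a placeholder chosen for illustration, not SeekrFlow’s actual API; see the documentation linked below for the real interface.

```python
# Hypothetical sketch only: the keys and values below are illustrative
# placeholders, NOT SeekrFlow's actual API. See the linked documentation.
standard_finetune_config = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",   # example base model
    "training_file": "math_problems.jsonl",        # prompts paired with reference answers
    "n_epochs": 3,
    "learning_rate": 1e-5,
}

grpo_config = {
    **standard_finetune_config,
    "method": "grpo",          # switch from supervised fine-tuning to GRPO
    "group_size": 8,           # completions sampled per prompt
    "reward": "exact_match",   # verifiable reward checked against reference answers
    "kl_coefficient": 0.04,    # keeps the updated policy close to the base model
}
```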

Read the blog here
View the documentation here