Release Notes | November 2024

November’s release introduces new features designed to give users greater control and efficiency over their AI workflows—from optimizing model outputs to managing large-scale deployments. For more details, read our full release blog.

🟢

New Features

Sandbox Input Parameters

We have added new input parameters to the Sandbox environment: Temperature, Top P, and Max Tokens. These options give more control over inference outputs, allowing users to tailor responses for specific tasks and use cases.
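
To make the effect of these parameters concrete, here is a minimal sketch of how temperature scaling and top-p (nucleus) filtering are commonly applied to a model's next-token scores. It is an illustration of the general technique, not SeekrFlow's implementation; the function names and the logits are hypothetical.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize the surviving tokens; every other token gets probability 0.
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four candidate tokens
sharp = apply_temperature(logits, 0.5)   # low temperature: more deterministic
flat = apply_temperature(logits, 1.5)    # high temperature: more varied
nucleus = top_p_filter(apply_temperature(logits, 1.0), 0.9)

print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
print([round(p, 3) for p in nucleus])
```

In this sketch the top candidate receives more probability at the lower temperature, and top-p removes the least likely candidate entirely. Max Tokens is simpler: it caps how many tokens a response may contain.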


Enhanced Inference Engine for Faster Inference

The inference engine now integrates vLLM, which improves inference speed and makes AI workflows more efficient. Complex models also return results faster.

We ran performance tests comparing TGI and vLLM on Intel Gaudi2 accelerators, using the following versions:

  • TGI: tgi-gaudi v2.0.4
  • vLLM: vllm-fork v0.5.3.post1-Gaudi-1.17.0

On average, the enhanced inference engine was 32% faster than TGI in the RAG experiments (full interaction traces, 10 concurrent requests) using Meta-Llama-3.1-8B-Instruct.

Load testing with 100 concurrent users also showed significant latency improvements:

  • Meta-Llama-3-8B-Instruct: ~45% faster
  • Meta-Llama-3.1-8B-Instruct: ~39% faster
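
These notes do not state how the "% faster" figures were calculated. The sketch below shows two common readings, a relative reduction in latency and a relative gain in throughput, and how they differ for the same pair of timings. The function names and the timings are hypothetical and chosen only to illustrate the arithmetic; they are not measurements from these tests.

```python
def percent_latency_reduction(baseline_s, candidate_s):
    """Relative reduction in latency, as a percentage of the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline latency must be positive")
    return (baseline_s - candidate_s) / baseline_s * 100.0


def percent_throughput_gain(baseline_s, candidate_s):
    """Relative gain in requests completed per second for the same two latencies."""
    if baseline_s <= 0 or candidate_s <= 0:
        raise ValueError("latencies must be positive")
    return (baseline_s / candidate_s - 1.0) * 100.0


# Hypothetical timings (seconds per request), not benchmark data.
baseline_latency = 10.0
candidate_latency = 6.8

reduction = percent_latency_reduction(baseline_latency, candidate_latency)
gain = percent_throughput_gain(baseline_latency, candidate_latency)

print(f"Latency reduction: {reduction:.1f}%")
print(f"Throughput gain:   {gain:.1f}%")
```

The two readings give noticeably different percentages for identical timings, so it helps to confirm which definition applies before comparing these figures with other benchmarks.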

Enhanced OpenAI Compatibility Features

Our inference engine now supports more of OpenAI’s API, so code written against the OpenAI client can use the following features:

  • Log Probabilities: New support for logprobs and top_logprobs, which return per-token log probabilities, providing insights into model decision-making, aiding debugging, and improving output accuracy.
  • Dynamic Tool Calling: The model can now request calls to custom functions based on context, streamlining business logic integration.
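
As a rough illustration of what log probabilities are useful for, the sketch below converts them to plain probabilities, finds the token the model was least sure about, and computes perplexity for a short sequence. It assumes natural-log values, as in OpenAI-style responses. The helper names and the sample values are hypothetical and are not taken from a real response.

```python
import math


def token_probability(logprob):
    """Convert a natural-log probability back to a probability between 0 and 1."""
    return math.exp(logprob)


def perplexity(token_logprobs):
    """Exponential of the mean negative log probability; lower means more confident."""
    if not token_logprobs:
        raise ValueError("need at least one token log probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical per-token log probabilities for a four-token completion.
sample_logprobs = [-0.05, -0.20, -1.60, -0.01]

# Index of the token the model was least confident about.
least_confident = min(range(len(sample_logprobs)), key=lambda i: sample_logprobs[i])

print(f"Least confident token index: {least_confident}")
print(f"Its probability: {token_probability(sample_logprobs[least_confident]):.3f}")
print(f"Sequence perplexity: {perplexity(sample_logprobs):.3f}")
```

A low-probability token is often a good place to start when debugging an unexpected answer.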

Code Example

import os

import openai

# Point the OpenAI client at the SeekrFlow inference endpoint.
client = openai.OpenAI(
    base_url="https://flow.seekr.com/v1/inference",
    api_key=os.environ.get("SEEKRFLOW_API_KEY"),
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct",
    stream=False,
    messages=[{
        "role": "user",
        "content": "Convert from 5 kilometers to miles",
    }],
    max_tokens=100,
    # Describe the function the model may call; its arguments follow a JSON Schema.
    tools=[{
        "type": "function",
        "function": {
            "name": "convert_units",
            "description": "Convert between different units of measurement",
            "parameters": {
                "type": "object",
                "properties": {
                    "value": {"type": "number"},
                    "from_unit": {"type": "string"},
                    "to_unit": {"type": "string"},
                },
                "required": ["value", "from_unit", "to_unit"],
            },
        },
    }],
)

# The response contains the requested tool call rather than a text answer.
print(f"OpenAI Tool Call Response: {response}")

Output Example

OpenAI Tool Call Response: 
id='chatcmpl-a2473e9a22144b64b8d41b24a16e7f81' 
object=<ObjectType.ChatCompletion: 'chat.completion'> 
created=1733428433 
model='meta-llama/Meta-Llama-3.1-70B-Instruct' 
choices=[ChatCompletionChoicesData(
    index=0, 
    finish_reason=<FinishReason.ToolCalls: 'tool_calls'>, 
    message=ChatCompletionMessage(
        role=<MessageRole.ASSISTANT: 'assistant'>, 
        content=None, 
        tool_calls=[{
            'id': 'chatcmpl-tool-17c78078022c480cb211ae5640fa15af', 
            'type': 'function', 
            'function': {
                'name': 'convert_units', 
                'arguments': '{"value": "5", "from_unit": "kilometers", "to_unit": "miles"}'
            }
        }]
    ), 
    logprobs=None, 
    stop_reason=128008
)] 
prompt=None 
usage=UsageData(
    prompt_tokens=220, 
    completion_tokens=34, 
    total_tokens=254
) 
prompt_logprobs=None
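
The response above only names the function and its arguments; running it is up to the calling application. The sketch below shows one way to dispatch such a tool call to local code. The convert_units implementation, the TOOL_REGISTRY mapping, and run_tool_call are hypothetical helpers written for this illustration, and the tool call is assumed to be available as a plain dictionary shaped like the sample output.

```python
import json

# Length of each supported unit in meters.
METERS_PER_UNIT = {"kilometers": 1000.0, "miles": 1609.344, "meters": 1.0}


def convert_units(value, from_unit, to_unit):
    """Hypothetical local implementation of the convert_units tool."""
    # The sample output passes value as the string "5", so coerce it defensively.
    meters = float(value) * METERS_PER_UNIT[from_unit]
    return meters / METERS_PER_UNIT[to_unit]


# Map tool names the model may request to the local functions that implement them.
TOOL_REGISTRY = {"convert_units": convert_units}


def run_tool_call(tool_call):
    """Look up the requested function and call it with the decoded JSON arguments."""
    function = tool_call["function"]
    handler = TOOL_REGISTRY[function["name"]]
    arguments = json.loads(function["arguments"])
    return handler(**arguments)


# Tool call copied from the sample output.
tool_call = {
    "id": "chatcmpl-tool-17c78078022c480cb211ae5640fa15af",
    "type": "function",
    "function": {
        "name": "convert_units",
        "arguments": '{"value": "5", "from_unit": "kilometers", "to_unit": "miles"}',
    },
}

result = run_tool_call(tool_call)
print(f"5 kilometers is about {result:.2f} miles")
```

Although the schema declares value as a number, the sample arguments contain the string "5", so validating or coercing arguments before use is a sensible precaution.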

Federated Login with Intel

Intel® Tiber™ AI Cloud users can now access SeekrFlow with a new federated login feature.

First-time users: Start by using your Intel® Tiber™ AI Cloud credentials, which will auto-populate the sign-up form for quick and easy access to SeekrFlow.

Returning users: Simply log in with your Intel® Tiber™ AI Cloud credentials for direct access to SeekrFlow.

This integration simplifies user management and access for those connected to Intel® Tiber™ AI Cloud.


🔵

Improvements & Bug Fixes

Streaming in Sandbox

We have enabled streaming for chat responses in the Sandbox. Results now appear incrementally as they are generated, so users can start reading a response without waiting for it to finish.

Clear and Restart Button Fix

We have resolved an issue where the “Clear and Restart” button in Sandbox didn’t function. A dialog box now appears, confirming that chat history will be cleared, allowing you to start fresh.

Dataset Directory Update

We have improved how uploaded files are handled in the Create Run modal:

  • Successfully uploaded files immediately appear in the dataset directory.
  • Switching to the directory view auto-selects the newly uploaded file.
  • Radio buttons now remain active, ensuring smooth file selection.

🟣

UI/UX Enhancements

We have made several updates to improve the user experience and provide clearer guidance across the platform:

  • Sandbox: Updated interface text to explain the new model parameter settings more clearly.
  • Deployment Dashboard: Enhanced explanations of cost transparency features for better understanding of resource usage.
  • Projects: Cancellation dialogs now show detailed cost information related to token usage, offering users more visibility into their resource consumption.

These updates aim to make SeekrFlow’s interface more intuitive and user-friendly, enhancing navigation and overall clarity.