Microsoft Tools Fine-Tune AI Agent Performance Based on User-Defined Criteria

Microsoft has introduced optimization features that evaluate AI agents hosted in its Foundry service to maximize their performance in production applications. The features automate a process that is currently manual.

Agents built on Microsoft Foundry don’t automatically meet the production-ready requirements of the typical IT organization. Common enterprise shortcomings include agents answering warranty questions without checking purchase dates, or customer support agents failing to ask for order numbers before looking up an order’s status.

With Agent Optimizer in Foundry Agent Service, agents are evaluated against defined criteria, better configurations are generated, and the optimizer ranks the configurations so users can deploy the best one.

Here’s how Agent Optimizer works through a closed-loop cycle of evaluation and improvement:

Evaluates baseline. An agent processes a set of tasks against defined pass/fail criteria. The result is a composite score from 0 to 1.
Generates candidates. Guided by what failed, the optimizer creates new configurations specifically designed to overcome failures.
Evaluates candidates. Each candidate runs against the same set of tasks.
Ranks and recommends. Results are sorted by score. Users see per-task breakdowns, as well as token costs, for each candidate before committing to the best option for their requirements.
Deploys winner. One command promotes the winning configuration to the live agent.
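
The cycle above can be sketched in a few lines of Python. This is an illustrative toy, not Microsoft's implementation or API: the names `Candidate`, `Result`, `toy_agent`, `FIXES`, `generate_candidates`, and `optimize` are all hypothetical, the "agent" is a keyword-matching stand-in for a real model call, and the composite score is assumed here to be the fraction of tasks that pass. Ties in score are broken by lower token cost, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Candidate:
    name: str
    instructions: str  # the configuration being tuned (here: a system prompt)

@dataclass
class Result:
    candidate: Candidate
    score: float               # composite score in [0, 1]
    per_task: dict[str, bool]  # pass/fail per task
    tokens: int                # rough token cost

# Each task: (name, user prompt, pass/fail check on the reply).
Task = tuple[str, str, Callable[[str], bool]]

def toy_agent(instructions: str, prompt: str) -> tuple[str, int]:
    """Hypothetical stand-in for a hosted agent; returns (reply, token count)."""
    reply = "Looking that up."
    if "order number" in instructions.lower() and "order" in prompt.lower():
        reply = "Could you share your order number first?"
    if "purchase date" in instructions.lower() and "warranty" in prompt.lower():
        reply = "What was the purchase date?"
    return reply, len(instructions.split()) + len(reply.split())

def evaluate(candidate: Candidate, tasks: list[Task]) -> Result:
    """Run every task and compute the fraction that pass (assumed scoring rule)."""
    per_task, tokens = {}, 0
    for name, prompt, check in tasks:
        reply, used = toy_agent(candidate.instructions, prompt)
        per_task[name] = check(reply)
        tokens += used
    return Result(candidate, sum(per_task.values()) / len(tasks), per_task, tokens)

# Hypothetical fixes keyed by failing task; a real optimizer would generate these.
FIXES = {
    "order_status": "Ask for the order number before looking up an order.",
    "warranty": "Check the purchase date before answering warranty questions.",
}

def generate_candidates(baseline: Result) -> list[Candidate]:
    """Create new configurations guided by which tasks the baseline failed."""
    failed = [t for t, ok in baseline.per_task.items() if not ok]
    base = baseline.candidate.instructions
    out = [Candidate(f"fix-{t}", f"{base} {FIXES[t]}") for t in failed]
    if len(failed) > 1:
        out.append(Candidate("fix-all", f"{base} " + " ".join(FIXES[t] for t in failed)))
    return out

def optimize(start: Candidate, tasks: list[Task]) -> list[Result]:
    """Evaluate baseline, generate and evaluate candidates, then rank them."""
    baseline = evaluate(start, tasks)
    results = [baseline] + [evaluate(c, tasks) for c in generate_candidates(baseline)]
    return sorted(results, key=lambda r: (-r.score, r.tokens))

TASKS: list[Task] = [
    ("order_status", "Where is my order?", lambda r: "order number" in r.lower()),
    ("warranty", "Is my laptop under warranty?", lambda r: "purchase date" in r.lower()),
]

ranked = optimize(Candidate("baseline", "You are a helpful support agent."), TASKS)
for r in ranked:
    print(f"{r.candidate.name}: score={r.score:.2f} tokens={r.tokens} {r.per_task}")
```

The final deployment step is left out of the sketch, since the actual command is not described here. The point of the ranking key is that a reviewer sees both quality and cost for every candidate before choosing one.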

Agent Optimizer has multiple targeting options for generating candidates under the process detailed above:

“Instruction” is the default target. The optimizer analyzes where an agent’s responses fall short, then generates alternative system prompts that address those gaps.

“Skill” generates reusable procedures — escalation processes, troubleshooting sequences, formatting templates — that are appended to the agent’s instructions.

“Model” enables the optimizer to evaluate an agent across multiple models. The optimizer scores each option against established evaluation criteria and shows which one produces optimal responses. Users pick the model based on what performs best against those criteria.

“Tool Descriptions” lets the optimizer improve how an agent understands and uses its function tools. It rewrites tool descriptions and parameter definitions so that the agent reliably picks the best tool.
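
To make the “Tool Descriptions” target concrete, here is a hypothetical before-and-after for a single function tool, written in the JSON Schema style commonly used for tool definitions. The tool name `get_order_status`, its parameter, the wording of both descriptions, and the `changed_fields` helper are all invented for illustration; the sketch also assumes that only descriptive text changes while the tool's name and parameter types stay fixed, which may not hold for the real optimizer.

```python
# Hypothetical tool definition before optimization: terse, ambiguous descriptions.
before = {
    "name": "get_order_status",
    "description": "Gets status.",
    "parameters": {
        "type": "object",
        "properties": {"id": {"type": "string", "description": "ID"}},
        "required": ["id"],
    },
}

# Hypothetical rewrite: same name and schema, clearer guidance on when to call it.
after = {
    "name": "get_order_status",
    "description": (
        "Look up the shipping status of a customer order. "
        "Call only after the customer has given an order number."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "id": {
                "type": "string",
                "description": "Order number from the customer's confirmation email.",
            }
        },
        "required": ["id"],
    },
}

def changed_fields(old: dict, new: dict, path: str = "") -> list[str]:
    """Return dotted paths of every value that differs between two definitions."""
    changes = []
    for key in sorted(set(old) | set(new)):
        here = f"{path}.{key}" if path else key
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            changes.extend(changed_fields(a, b, here))  # descend into nested schema
        elif a != b:
            changes.append(here)
    return changes

print(changed_fields(before, after))
```

A diff like this is one way a reviewer might confirm that a candidate touched only descriptive text before promoting it.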

In an example published online, Agent Optimizer was used to improve the quality and performance of a customer support agent. It used synthetic data or historical traces of the agent’s performance to identify where the agent fell short, then rewrote the functionality to strengthen return policies, escalation procedures, troubleshooting frameworks, and safety boundaries. The resulting changes were scored against the criteria defined by the developer.

The agent optimization process runs in the cloud, typically taking a few minutes. Customers that already have a hosted agent deployed can run Agent Optimizer, which requires no model retraining, no code changes, and no additional infrastructure.

In the earliest days of experimenting with AI agents, customers may have been able to accept trial-and-error-heavy processes as part of the new-technology learning curve. But as agents take on more critical business functions, and financial leaders develop stronger ROI expectations, tech leaders need tools and practices to bring enterprise rigor to their agents and the results they produce. Agent Optimizer is a positive step toward that objective.


Tom Smith
