
Microsoft has introduced optimization features that evaluate AI agents hosted in its Foundry service to maximize their performance in production applications. The features replace today's manual tuning with automated processes.
Agents built on Microsoft Foundry don’t automatically meet the production-ready requirements of the typical IT organization. Common enterprise shortcomings include agents answering warranty questions without checking purchase dates, or customer support agents failing to ask for order numbers before looking up an order’s status.
With Agent Optimizer in Foundry Agent Service, agents are evaluated against defined criteria, better configurations are generated, and the optimizer ranks those configurations so users can deploy the best one.
Here’s how Agent Optimizer works through a closed-loop cycle of evaluation and improvement:
- Evaluates baseline. An agent processes a set of tasks against defined pass/fail criteria. The result is a composite score from 0 to 1.
- Generates candidates. Guided by what failed, the optimizer creates new configurations specifically designed to overcome failures.
- Evaluates candidates. Each candidate runs against the same set of tasks.
- Ranks and recommends. Results are sorted by score. Users and developers see per-task breakdowns, as well as token costs, for each candidate before committing to the option that best fits their requirements.
- Deploys winner. One command promotes the winning configuration to the live agent.
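The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not the Foundry API: every name here (`Task`, `composite_score`, `rank_configurations`) is hypothetical, the composite score is assumed to be the fraction of pass/fail checks passed, and the candidate is hand-written rather than generated from failures as the real optimizer does.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical types for illustration only; this is not the Foundry SDK.
Agent = Callable[[str], str]   # maps a task prompt to a response
Check = Callable[[str], bool]  # one pass/fail criterion applied to a response


@dataclass
class Task:
    prompt: str
    checks: list[Check]


def composite_score(agent: Agent, tasks: list[Task]) -> float:
    """Assumed scoring rule: fraction of checks passed across all tasks (0 to 1)."""
    results = [check(agent(task.prompt)) for task in tasks for check in task.checks]
    return sum(results) / len(results) if results else 0.0


def rank_configurations(
    baseline: Agent, candidates: dict[str, Agent], tasks: list[Task]
) -> list[tuple[str, float]]:
    """Score the baseline and every candidate on the same tasks, best first."""
    scores = {"baseline": composite_score(baseline, tasks)}
    for name, agent in candidates.items():
        scores[name] = composite_score(agent, tasks)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy agents standing in for two configurations of a support agent.
def baseline_agent(prompt: str) -> str:
    return "Your order is on its way."


def candidate_agent(prompt: str) -> str:
    return "Could you share your order number so I can look it up?"


# Criterion: the agent must ask for an order number before checking status.
tasks = [Task("Where is my order?", [lambda r: "order number" in r.lower()])]

ranking = rank_configurations(baseline_agent, {"ask_first": candidate_agent}, tasks)
best_name, best_score = ranking[0]
```

In this toy run the candidate that asks for the order number ranks above the baseline. Promoting it to the live agent would be the final step, which the sketch leaves out.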
Agent Optimizer has multiple targeting options for generating candidates under the process detailed above:
“Instruction” is the default target. The optimizer analyzes where an agent’s responses fall short, then generates alternative system prompts that address those gaps.
“Skill” generates reusable procedures — escalation processes, troubleshooting sequences, formatting templates — that are appended to the agent’s instructions.
“Model” enables the optimizer to evaluate an agent across multiple models. The optimizer scores each option against established evaluation criteria and shows which one produces optimal responses. Users pick the model based on what performs best against those criteria.
“Tool Descriptions” lets the optimizer improve how an agent understands and uses its function tools. It rewrites tool descriptions and parameter definitions so that the agent reliably picks the best tool.
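One way to picture the four targets is as fields of a single agent configuration, with each candidate changing one field of the baseline. The sketch below rests on that assumption. `AgentConfig`, its fields, the model names, and the sample text are all hypothetical and do not reflect how Foundry stores configurations.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical configuration with one field per optimization target."""
    instructions: str
    model: str
    skills: tuple[str, ...] = ()
    tool_descriptions: dict[str, str] = field(default_factory=dict)

    def rendered_instructions(self) -> str:
        # Skills are appended to the agent's instructions, as the article describes.
        return "\n\n".join((self.instructions, *self.skills))


baseline = AgentConfig(
    instructions="You are a customer support agent.",
    model="model-a",  # placeholder model name
    tool_descriptions={"lookup_order": "Looks up an order."},
)

# One candidate per target; each differs from the baseline in a single field.
candidates = {
    "instruction": replace(
        baseline,
        instructions="You are a customer support agent. "
        "Ask for the order number before looking up an order.",
    ),
    "skill": replace(
        baseline,
        skills=baseline.skills + ("Escalation: hand off to a human after two failed attempts.",),
    ),
    "model": replace(baseline, model="model-b"),  # placeholder model name
    "tool_descriptions": replace(
        baseline,
        tool_descriptions={
            **baseline.tool_descriptions,
            "lookup_order": "Looks up order status. Requires the customer's order number.",
        },
    ),
}
```

Keeping the configuration immutable means every candidate is an independent copy, so the baseline stays available for comparison whichever candidate is scored.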
In an example published online, Agent Optimizer was used to improve the quality and performance of a customer support agent. It used synthetic data or historical traces of how the agent had performed to identify where the agent fell short. It then revised the agent to strengthen return policies, escalation procedures, troubleshooting frameworks, and safety boundaries. The resulting changes were scored against the criteria defined by the developer.
The agent optimization process runs in the cloud, typically taking a few minutes. Customers that already have a hosted agent deployed can run Agent Optimizer, which requires no model retraining, no code changes, and no additional infrastructure.
In the earliest days of experimenting with AI agents, customers may have been able to accept trial-and-error-heavy processes as part of the new-technology learning curve. But as agents take on more critical business functions, and financial leaders develop stronger ROI expectations, tech leaders need tools and practices to bring enterprise rigor to their agents and the results they produce. Agent Optimizer is a positive step toward that objective.





