Microsoft is shipping autonomous agents into every Dynamics 365 module. The marketing message is “agents do work for you.” The architectural reality is that agents need orchestration patterns, and the patterns that work in pilots break in production. Here are the four shapes that hold up.
Pattern 1: single-purpose, tool-bounded agent
One agent, one job, a small set of tools. A “lead qualifier” agent in Sales reads a lead, calls a “look up firmographics” tool, calls a “score” tool, writes back a recommendation. The agent does not branch outside its tool set.
Holds up because: the failure surface is tiny, the tools are testable in isolation, and the agent’s reasoning is auditable.
Falls apart when: someone adds a fifth tool and the agent’s prompt no longer fits its decision space.
Pattern 2: orchestrator with worker agents
A top-level orchestrator agent receives the user’s request, routes to specialist agents (sales, service, finance), and returns a consolidated response. Each specialist owns its domain.
Holds up because: domain expertise stays separated, specialists can be updated independently, and the orchestrator’s job is just routing.
Falls apart when: the orchestrator’s routing logic becomes a soft branching engine and you have reinvented Power Automate badly.
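One way to keep the orchestrator honest is to make routing a plain table lookup, so any new condition has to live inside a specialist instead of the router. The sketch below is a minimal illustration, not a Dynamics 365 API: the specialist functions, the SPECIALISTS table, and orchestrate are all hypothetical names, and it assumes the list of domains has already been chosen upstream (for example by the model).

```python
from typing import Callable, Dict, List

# Hypothetical specialists; in practice each would wrap a real agent call.
def sales_agent(request: str) -> str:
    return f"sales: {request}"

def service_agent(request: str) -> str:
    return f"service: {request}"

def finance_agent(request: str) -> str:
    return f"finance: {request}"

# The routing table is the orchestrator's entire decision logic.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "sales": sales_agent,
    "service": service_agent,
    "finance": finance_agent,
}

def orchestrate(domains: List[str], request: str) -> Dict[str, str]:
    """Fan the request out to the named specialists and consolidate replies."""
    unknown = [d for d in domains if d not in SPECIALISTS]
    if unknown:
        # Fail loudly instead of guessing which specialist was meant.
        raise ValueError(f"no specialist registered for: {unknown}")
    return {d: SPECIALISTS[d](request) for d in domains}
```

If orchestrate starts to grow if/else branches on the request content, that is the signal the routing layer is turning into a branching engine.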
Pattern 3: agent-as-flow-replacement
Replace a Power Automate flow with an agent that has the flow’s actions as tools. The agent decides which actions to call based on the trigger payload.
Holds up for: highly conditional flows where the branching matrix is large.
Falls apart for: simple linear flows where the agent is a 200ms tax over a deterministic 50ms execution.
Pattern 4: human-in-the-loop with confidence threshold
The agent does the work, calculates a confidence score, executes if the score is above 0.85, and queues the action for human review otherwise. It logs every decision to an agentdecision table for audit.
Holds up because: the failure mode is “human reviews more,” not “wrong action taken.”
Falls apart when: the queue length exceeds human capacity and someone disables the threshold.
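The gate itself can be very small. The sketch below uses the 0.85 threshold from the text; everything else is an assumption for illustration: the Decision record, the in-memory decision_log list standing in for the agentdecision table, and the choice to send a score of exactly 0.85 to review because the text only says "above."

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_THRESHOLD = 0.85  # value from the text

@dataclass
class Decision:
    action: str
    confidence: float
    outcome: str  # "executed" or "queued_for_review"

# In-memory stand-ins (assumptions) for the agentdecision table and review queue.
decision_log: List[Decision] = []
review_queue: List[str] = []

def gate(action: str, confidence: float) -> Decision:
    """Execute only above the threshold; otherwise queue for a human."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence > CONFIDENCE_THRESHOLD:
        outcome = "executed"
    else:
        outcome = "queued_for_review"
        review_queue.append(action)
    decision = Decision(action, confidence, outcome)
    decision_log.append(decision)  # every decision is logged, not just rejections
    return decision
```

Keeping the threshold as a single named constant also makes it obvious in code review when someone lowers or removes it.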
The shared failure mode: hallucinated tool calls
Agents will sometimes invent tool names that do not exist or pass arguments in the wrong shape. Build a strict validation layer that rejects unknown tools and malformed arguments before execution. Log each rejection so you can tune the prompt:
Tool registry validates: agent.toolName in allowedTools, and arguments match the tool's expected shape
On reject -> log to agentdecision table -> return structured error to agent
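The validation step above might look like the following in Python. This is a sketch under assumptions: the tool names, the ALLOWED_TOOLS registry mapping each tool to its expected argument names, and the in-memory rejections list (standing in for the agentdecision table) are all hypothetical, and a real layer would likely validate argument types as well as names.

```python
from typing import Any, Dict, List, Set

# Hypothetical registry: tool name -> exact set of expected argument names.
ALLOWED_TOOLS: Dict[str, Set[str]] = {
    "lookup_firmographics": {"lead_id"},
    "score": {"lead_id", "firmographics"},
}

# Stand-in (assumption) for rows written to the agentdecision table.
rejections: List[Dict[str, Any]] = []

def validate_tool_call(tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Return a structured result; log and reject anything outside the registry."""
    if tool_name not in ALLOWED_TOOLS:
        error = {
            "ok": False,
            "error": "unknown_tool",
            "tool": tool_name,
            "allowed": sorted(ALLOWED_TOOLS),
        }
    elif set(arguments) != ALLOWED_TOOLS[tool_name]:
        error = {
            "ok": False,
            "error": "bad_arguments",
            "tool": tool_name,
            "expected": sorted(ALLOWED_TOOLS[tool_name]),
            "received": sorted(arguments),
        }
    else:
        return {"ok": True, "tool": tool_name}
    rejections.append(error)  # logged so the prompt can be tuned later
    return error  # handed back to the agent so it can correct itself
```

Returning the allowed tool names and expected arguments in the error gives the agent enough information to retry with a valid call.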
Observability is non-optional
Every agent invocation should emit: input, intermediate reasoning summary, tool calls, output, latency, token cost. Without this, you cannot debug, cannot optimize, cannot defend the agent’s actions in audit. Application Insights is the right destination.
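As a rough illustration, the six fields can be captured in one structured record per invocation. The sketch below uses only the Python standard library; the AgentInvocationRecord class, the field names, and the "agent.telemetry" logger name are assumptions, and forwarding the log to Application Insights is deliberately left out because it depends on which exporter you use.

```python
import json
import logging
from dataclasses import asdict, dataclass
from typing import List

@dataclass
class AgentInvocationRecord:
    # One field per item the text asks every invocation to emit.
    input: str
    reasoning_summary: str
    tool_calls: List[str]
    output: str
    latency_ms: float
    token_cost: int

logger = logging.getLogger("agent.telemetry")  # hypothetical logger name

def emit(record: AgentInvocationRecord) -> str:
    """Serialize the record as one JSON line and hand it to the logger."""
    payload = json.dumps(asdict(record), sort_keys=True)
    logger.info(payload)
    return payload
```

One JSON line per invocation keeps the records queryable by field once they reach whatever telemetry store you use.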
What to do this week
For each agent in production, write down: which pattern it follows, what tools it owns, what its confidence threshold is, where its decisions are logged. Any agent missing one of these is a future incident.
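If the inventory lives in a file or a table, the check can be automated. A minimal sketch, assuming each agent is described by a dictionary with the four hypothetical keys below (the key names and the example agents are made up for illustration):

```python
from typing import Any, Dict, List

# The four things the text says every production agent must have written down.
REQUIRED_FIELDS = ("pattern", "tools", "confidence_threshold", "decision_log")

def missing_fields(agent: Dict[str, Any]) -> List[str]:
    """List the required fields that are absent or empty for one agent."""
    return [f for f in REQUIRED_FIELDS if agent.get(f) in (None, "", [])]

def audit(inventory: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each incomplete agent name to the fields it is missing."""
    report = {name: missing_fields(agent) for name, agent in inventory.items()}
    return {name: gaps for name, gaps in report.items() if gaps}

# Hypothetical inventory for illustration.
inventory = {
    "lead_qualifier": {
        "pattern": "single-purpose",
        "tools": ["lookup_firmographics", "score"],
        "confidence_threshold": 0.85,
        "decision_log": "agentdecision",
    },
    "case_router": {
        "pattern": "orchestrator",
        "tools": [],
        "decision_log": "agentdecision",
    },
}
```

Running audit(inventory) against the example flags case_router, which has an empty tool list and no recorded threshold.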