What is the relationship between intelligent apps and human leadership in the workplace?

Intelligent apps only deliver real value when human leadership changes how decisions are made, not just how fast tasks are completed. Without redesigning decision rights and workflows, AI tools tend to accelerate broken processes rather than fix them.

Why does automating an existing process sometimes make things worse?

Automation increases the speed and volume of outputs, but if the approval or review process stays the same, the bottleneck simply gets worse. Teams end up rubber-stamping decisions to keep pace, which lowers quality while creating the illusion of better throughput.

How do I know if my organisation is ready for an AI automation rollout?

A good starting point is asking whether decision rights have been clearly assigned and whether leaders are willing to redesign how approvals and reviews work. If those structures are unchanged, adding AI tools is likely to surface existing problems faster rather than solve them.

When should I redesign workflows before deploying intelligent apps?

Workflow redesign should happen before deployment, not after. If the human steps in a process have not been updated to match the speed and volume AI generates, the technology will outpace the people it is meant to support.

When should I use a single agent vs a multi-agent system?

Stick with a single agent until you run into one of three problems: too many tools causing the model to pick the wrong one, conflicting instructions that cannot coexist in one system prompt, or parallel workstreams that need to run at the same time. If none of those apply, a single agent is the right choice.

How do I know if my agent has too many tools?

A good rule of thumb is to keep your tool count under eight to ten connectors. Beyond that, tool descriptions start competing with each other and the model's ability to select the right one drops noticeably, to the point where prompt tweaks no longer help.

Why does a single agent struggle with conflicting instructions?

When a system prompt asks the model to take on two opposing behaviours, such as acting as a strict policy checker and a friendly explainer, those personas create tension within a single instruction set. The model tends to compromise in ways that produce unreliable outputs rather than handling each mode correctly.

What are the main reasons to build a multi-agent system?

Multi-agent systems are worth the added complexity when you need genuinely parallel workstreams, have irreconcilable prompt conflicts, or are dealing with tool overload that cannot be resolved through better descriptions. Outside of those scenarios, the extra coordination overhead rarely pays off.

Why does my Power Platform center of excellence setup stop working after a few weeks?

The CoE Starter Kit relies on scheduled sync flows that call admin connectors on a recurring basis. If the service account loses its licence, hits throttling limits, or has a permission issue, those flows fail silently and your dashboards show stale data without any obvious warning.

What licences and permissions does the CoE Starter Kit service account need?

The service account requires either a Power Platform Administrator or Global Administrator role, plus a per-user Power Automate licence that covers premium connectors. Without the premium entitlement, the admin connector calls used by the sync flows will not run.

How do I know if my CoE sync flows have stopped running correctly?

The dashboards will not alert you automatically when sync flows fail, so you need to monitor flow run history directly. Comparing your app and environment counts against known tenant activity over time is a practical way to spot when the inventory has drifted from reality.

Why does the CoE Starter Kit struggle with throttling on large tenants?

The sync flows paginate through every environment and every app in a single run, which generates a high volume of connector calls in a short period. This makes them prone to both platform-level and connector-level throttling, so transient errors need to be handled with retries rather than treated as permanent failures.
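
The retry idea above can be sketched in a few lines. This is a minimal illustration of exponential backoff with jitter, not how the CoE flows are actually built; inside a flow the same behaviour would normally come from the retry settings on the action. `TransientError` and `call_with_retries` are hypothetical names introduced here.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for throttling-style failures (for example HTTP 429)."""


def call_with_retries(operation, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Run operation(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure instead of hiding it
            # Double the wait on each attempt, cap it, then add jitter so
            # parallel callers do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

The final `raise` matters as much as the retries: a call that never recovers should fail loudly, because a silent failure is exactly what leaves the dashboards stale.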

Category: Artificial Intelligence in Business

Microsoft’s Intelligent Apps Post Is About Leadership and I Think They Buried the Lede

I opened Microsoft’s April post on intelligent apps and human leadership expecting another speed pitch. Faster tasks. Faster decisions. Faster output. The usual rhythm. What I actually read was a piece where the words intelligent apps and human leadership are deliberately bolted together, and I think most people skimmed past the second half.

The leadership piece is doing all the work in that post. The intelligent app is the easy half. I keep seeing automation teams treat AI features as a productivity multiplier when the actual constraint is whether anyone in the org is willing to redesign how decisions get made. Skip that, and you ship faster versions of the same broken workflows.

I read the post expecting another speed pitch and got something else

Microsoft could have written a clean speed story. They have the numbers for it. Instead the framing is that intelligent apps need a new shape of work, and that shape is built by humans who lead differently, not by humans who type faster.

That is not a marketing flourish. That is the part of the message that decides whether your AI rollout pays off or quietly burns budget. I have been writing about decision ownership for a while now, and this post is the closest I have seen Microsoft come to saying it out loud.

Speed is a trap when the underlying decision rights have not moved

Here is what I keep running into. A team takes a slow approval process. They drop an agent in the middle of it. The agent now drafts the recommendation in two seconds. Approval still takes four days because three managers still need to sign off, and none of them changed how they review.

You did not speed up the process. You sped up the part nobody was waiting on.

Worse, the agent now produces ten times the volume of recommendations the approval chain was sized for. The queue grows. People rubber-stamp to keep up. The quality of the decision drops while the appearance of throughput goes up. I have written before that automating a bad process just makes it fail faster. Intelligent apps make this failure mode worse, not better, because the speed gap between the AI step and the human step gets wider. This dynamic is one reason RPA vs AI automation comparisons often miss the point: neither technology fixes a process where decision rights have not moved.
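
A toy model makes the queue arithmetic concrete. It is a sketch with made-up numbers: ten recommendations a day before the agent, a hundred after, and a review chain that can still clear ten. The function name and figures are assumptions for illustration.

```python
def simulate_backlog(days, drafts_per_day, reviews_per_day):
    """Return the review backlog at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        backlog += drafts_per_day                 # agent output lands in the queue
        backlog -= min(backlog, reviews_per_day)  # reviewers clear what they can
        history.append(backlog)
    return history


before = simulate_backlog(days=5, drafts_per_day=10, reviews_per_day=10)
after = simulate_backlog(days=5, drafts_per_day=100, reviews_per_day=10)
# before -> [0, 0, 0, 0, 0]
# after  -> [90, 180, 270, 360, 450]
```

Nothing about the review step changed, yet the backlog grows by ninety items a day. That growth is the pressure that turns review into rubber-stamping.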

If decision rights do not move, speed is a trap.

What human leadership actually has to do for an intelligent app to work

The Microsoft post uses the phrase human leadership as if everyone knows what it means. I do not think we do. So here is what I think it has to mean operationally for an intelligent app to actually pay off.

First, someone with authority has to redraw the decision boundary. Which calls does the agent make on its own. Which calls go to a human. Which calls require two humans. That is not a developer task. That is a leadership task, and most orgs avoid it because it is uncomfortable.

Second, the constraints the agent operates under have to be owned by a person, not buried in a system prompt. This is exactly why Microsoft’s business skills in Dataverse matter. They give policy a home with an owner and a version history. Without that, your intelligent app is running on tribal knowledge that nobody can update.

Third, leaders have to stop measuring the team on volume of approvals or tickets closed. If the agent is doing the routine work, the human metric has to shift to quality of exception handling and quality of policy. Otherwise you are paying senior people to do work an agent already did.

None of this is a feature you ship. All of it is org design. The Power Platform tooling will not do it for you.

The teams I see getting this right are doing one specific thing first

The teams I talk to who are actually getting value out of intelligent apps and human leadership do one thing before they build anything. They write down, on one page, who currently owns each decision in the process and who will own it after the agent ships. Same rows, two columns. The delta is the work.
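
That one page is small enough to treat as data. The sketch below assumes a hypothetical refund process with invented decisions and owners, and pulls out the two things the page is meant to surface: decisions whose owner changes, and decisions nobody owns today.

```python
# Hypothetical decisions and owners, for illustration only.
ownership = {
    "approve refund under 100 EUR": {"before": "team manager", "after": "agent"},
    "approve refund over 100 EUR": {"before": "team manager", "after": "team manager"},
    "change refund policy": {"before": None, "after": "finance lead"},
}


def ownership_delta(page):
    """Return the decisions whose owner changes once the agent ships."""
    return {
        decision: (owners["before"], owners["after"])
        for decision, owners in page.items()
        if owners["before"] != owners["after"]
    }


def unowned(page):
    """Return the decisions that have no owner today."""
    return [d for d, owners in page.items() if owners["before"] is None]
```

Each entry returned by `ownership_delta` is a conversation someone with authority has to have before the build starts.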

That page is uncomfortable to produce. It surfaces the fact that some managers are about to lose a piece of their job, that some policies have no clear owner, and that some approval steps exist only because nobody ever questioned them. This is the part most teams skip, because it is political, not technical. It is also why most Power Platform Center of Excellence setups stall in month three: the governance conversation requires the same political work that most teams defer until it is too late.

The teams that skip it ship a working app and wonder six months later why nothing changed. The teams that do it ship a smaller app and quietly reshape how a department works.

So my read on the Microsoft post is this. They did not bury the lede by accident. The lede is human leadership. The intelligent app is the part of the story everyone is comfortable talking about. The other half is the part that decides whether any of this matters.

If you want to see how I think about this kind of org-shaped problem, more of my writing is on LinkedIn. The technology is rarely the bottleneck. The willingness to move decisions is.

Source: Intelligent apps, human leadership, and the new shape of work (Microsoft Power Platform Blog).

May 7, 2026
Microsoft Just Reframed Dataverse as the Agent Data Platform and the Update List Is Worth Reading

Microsoft published a post on May 5 called Dataverse Is Your Agent Data Platform: Here’s What’s New. The framing is the part worth reading. Dataverse is no longer being sold as the database under your model-driven apps. It is being repositioned as an agent data platform, the layer that gives agents real business understanding instead of just rows and columns.

That is a meaningful shift in how Microsoft wants you to think about the stack. I have been waiting for this framing to land properly.

What it actually does

The update bundles several things that were previously scattered across announcements into one coherent story.

Knowledge sources let an agent ground itself in Dataverse tables, SharePoint sites, files, and external systems through a managed reference instead of a hand-rolled retrieval pipeline. The agent sees the data with its semantics: table relationships, choice columns, business rules. Not just a vector dump.

Business skills are now first-class records in Dataverse. I wrote about this in Microsoft Just Shipped Business Skills in Dataverse. Skills move policy and process logic out of system prompts and into a managed, owned, versioned artifact. The May 5 post confirms this is the intended pattern, not an experiment.

Deeper Fabric wiring means agents can reach analytical data through Dataverse without you stitching mirroring and shortcuts manually for every project. The semantic model carries through.

Copilot Studio integration is tighter on the agent side. Connected agents in Copilot Studio can pick up Dataverse knowledge sources and skills as native primitives instead of you wiring them through custom connectors and Power Automate flows.

None of these are individually new ideas. The point of the post is that they now line up as one platform story.

Why it matters

The hardest part of building an internal agent is not the model. It is getting the agent to behave like someone who actually works at your company. Org-specific policies, naming conventions, what counts as an active customer, which approval thresholds apply when. That tribal knowledge is where every agent project I have seen gets stuck.

If that knowledge lives in a 4000-token system prompt, the agent degrades. If it lives in hardcoded Power Automate flows, nobody can find it six months later. If it lives in a Word document on someone’s OneDrive, it might as well not exist.

The Dataverse agent data platform framing says: put it in Dataverse as a typed, owned, versioned record, and let any agent on the stack consume it. That is an architectural decision, not a feature toggle. It changes who owns what. The policy owner updates the skill directly. The data steward owns the knowledge source. The agent builder stops guessing.

The risk I keep flagging is skill sprawl. The moment multiple teams start writing skills with overlapping scopes, the agent starts misrouting silently. Governance has to come before deployment, not as a cleanup project in month six. Microsoft is shipping the platform. The org design is still on you. That same principle applies to Power Platform governance that does not kill adoption: the structural decisions made early determine whether the whole thing scales or stalls.

I also want to see how this plays with multi-agent orchestration. I covered the patterns in multi-agent orchestration in Copilot Studio. If knowledge sources and skills become the shared substrate that connected agents draw from, the per-agent prompt size drops and routing gets more reliable. That is the part I find genuinely exciting.

What I would do with it this week

Three concrete things, in order.

First, pick one existing agent that has a bloated system prompt. Anything past 2000 tokens. Pull out the policy chunks and rewrite them as business skills in Dataverse. Measure the prompt size before and after, and run your behavioral tests on both. You will see the consistency change in the edge cases.

Second, take one knowledge source that today is glued together with a custom connector or a Power Automate flow doing retrieval. Replace it with a Dataverse knowledge source pointing at the same data. Compare the answers on questions that depend on relationships between tables. The native version handles joins the hand-rolled one fakes.

Third, draft a one-page skill ownership matrix for your tenant before anyone writes a third skill. Who owns customer policy, who owns finance approvals, who owns HR routing. Boring document. Saves you from the sprawl that kills these projects.

Read the Power Platform docs alongside the announcement. The framing finally matches what practitioners have been building toward. I am curious where it lands by the end of the quarter.

Source: Dataverse Is Your Agent Data Platform: Here’s What’s New (Microsoft Power Platform Blog).

May 6, 2026
Anthropic Raised Claude Rate Limits for SpaceX and That Tells You Where Enterprise AI Is Heading

Anthropic announced a deal with SpaceX that includes higher Claude rate limits as part of the engagement. The headline most people will read is the SpaceX logo. The actual signal is different. The Anthropic and SpaceX rate limit story tells you that capacity is now a negotiated enterprise lever, not a number on a pricing page.

If you are building agents seriously, this is the part to pay attention to.

What it actually does

The deal gives SpaceX elevated rate limits on Claude, alongside the usual enterprise engagement wrapping. Anthropic frames it as supporting frontier engineering work where teams need sustained throughput for code generation, document analysis, and agent loops at scale.

The published tiers on the Anthropic site still exist. Usage tier 1, tier 2, tier 3, tier 4 with their requests-per-minute and input-tokens-per-minute caps. What this announcement quietly confirms is that above those tiers, the conversation is bespoke. You sign a contract, you get a number that fits your workload.

That has been true behind the scenes for a while. Saying it out loud, with a customer name attached, is the new part.

Why it matters

Most teams I talk to still treat rate limits like an afterthought. They build a prototype on a developer key, the latency feels fine, the cost looks reasonable, and they move toward production. Then a real workload hits and the 429s start.

I wrote about this in a different shape when I covered Claude running on Amazon Trainium. The benchmark conversation distracts from the real production failure mode, which is capacity drift at peak hours. Rate limits are the same story from a different angle. Throughput is now part of the architecture, not a footnote.

Three things follow from this.

First, the gap between what a hobbyist API key can do and what a serious enterprise workload needs is widening fast. A single agent loop with tool calls, retries, and a few sub-agents can burn through a tier 2 limit in seconds. Multi-agent orchestration makes it worse. If you are running ten parallel agent invocations from a Power Automate flow, each with its own context window, you will hit the ceiling before you hit the budget. If you are thinking through whether your workload even needs that kind of parallelism, the honest answer on single-agent vs multi-agent design is worth reading before you scale out.

Second, procurement now needs to ask different questions. Not just price per million tokens. Sustained tokens per minute. Concurrency. Burst tolerance. Region. What happens when your traffic doubles next quarter. Most enterprise AI contracts I hear about from people at other organisations still get signed without these numbers nailed down. Anthropic is moving further in this direction across the board, and their enterprise AI services arm is exactly the context in which these bespoke capacity conversations are going to happen.

Third, the platform vendors are going to feel pressure here. If you are running Claude through Bedrock or through Copilot Studio, your effective rate limit is shaped by both Anthropic and the platform layer. The platform abstracts capacity, which is convenient until it is not. Knowing where the ceiling actually sits in your stack is going to matter more, not less.

What I would do with it this week

If you are running anything beyond a demo, instrument the throughput. Not just success and failure counts. Tokens per minute, requests per minute, p95 latency, and 429 rate, broken down by flow or agent. You cannot negotiate a number you have not measured.
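
As a rough sketch of that instrumentation, assume each model call is already logged as a record with an agent name, a timestamp, a token count, a latency, and an HTTP status. The `Call` shape and field names are assumptions, not any platform's log format.

```python
import math
from dataclasses import dataclass


@dataclass
class Call:
    agent: str
    timestamp: float   # seconds since epoch
    tokens: int        # input plus output tokens for the call
    latency_ms: float
    status: int        # HTTP status code


def p95(values):
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    return ordered[math.ceil(0.95 * len(ordered)) - 1]


def summarize(calls):
    """Per agent: peak requests/minute, peak tokens/minute, p95 latency, 429 rate."""
    by_agent = {}
    for call in calls:
        by_agent.setdefault(call.agent, []).append(call)
    summary = {}
    for agent, items in by_agent.items():
        per_minute = {}  # minute bucket -> (requests, tokens)
        for call in items:
            bucket = int(call.timestamp // 60)
            requests, tokens = per_minute.get(bucket, (0, 0))
            per_minute[bucket] = (requests + 1, tokens + call.tokens)
        summary[agent] = {
            "peak_rpm": max(r for r, _ in per_minute.values()),
            "peak_tpm": max(t for _, t in per_minute.values()),
            "p95_latency_ms": p95([c.latency_ms for c in items]),
            "rate_429": sum(c.status == 429 for c in items) / len(items),
        }
    return summary
```

The summary reports peaks per minute rather than averages, because a rate limit is hit in the worst minute, not the typical one.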

Then look at your system prompts. Token bloat is the cheapest capacity win available. A prompt that drifted from 800 to 4000 tokens over a few sprints is not just costing you money, it is eating your throughput ceiling on every single call. I have seen this kill production agents at peak hours when nobody changed the model or the workload.
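
The arithmetic is simple enough to write down. Assuming input tokens per minute is the binding limit, the cap divided by the input tokens per call gives a ceiling on calls per minute. The cap and the 1,200 tokens of other input below are invented for illustration, not published figures.

```python
def max_calls_per_minute(tokens_per_minute_cap, prompt_tokens, other_input_tokens=0):
    """Upper bound on calls per minute when input tokens per minute is the limit."""
    return tokens_per_minute_cap // (prompt_tokens + other_input_tokens)


CAP = 400_000  # hypothetical input-tokens-per-minute limit
lean = max_calls_per_minute(CAP, prompt_tokens=800, other_input_tokens=1_200)
bloated = max_calls_per_minute(CAP, prompt_tokens=4_000, other_input_tokens=1_200)
# lean -> 200 calls per minute, bloated -> 76 calls per minute
```

Under these assumptions the same limit supports well under half the calls once the prompt has drifted, with no change to the model or the workload.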

Then map your workload to a tier. If your steady state is comfortably inside a published tier, fine. If you are within forty percent of the ceiling on any axis, you are already in negotiation territory. Start the conversation before the incident, not after.
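
One way to make that check mechanical, reading "within forty percent of the ceiling" as steady-state usage at or above sixty percent of the limit. The axis names and numbers are placeholders; real limits come from your own tier or contract.

```python
def axes_needing_negotiation(usage, ceilings, headroom=0.40):
    """Return the axes where usage sits within `headroom` of its ceiling."""
    threshold = 1.0 - headroom
    return sorted(
        axis for axis, used in usage.items()
        if used / ceilings[axis] >= threshold
    )


# Hypothetical steady-state measurements against hypothetical limits.
usage = {"requests_per_minute": 700, "input_tokens_per_minute": 100_000}
ceilings = {"requests_per_minute": 1_000, "input_tokens_per_minute": 400_000}
flagged = axes_needing_negotiation(usage, ceilings)
# flagged -> ["requests_per_minute"]
```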

For teams building on Power Platform with Claude in the loop, the same logic applies through whatever connector or custom action you are using. Concurrency settings on a Power Automate flow can mask the real call pattern until they do not. Know what your worst minute looks like.

The SpaceX deal is a marker. Capacity has joined price and capability as a first-class procurement axis for enterprise AI, and the teams treating it that way now will have one less surprise next year. (My ongoing notes on this stuff live on my LinkedIn if you want to follow along.)

Source: Higher usage limits for Claude and a compute deal with SpaceX (Anthropic).

May 6, 2026
Anthropic Just Launched Claude Finance Agents and the Specialization Trend Is Real

Anthropic shipped Claude Finance Agents this week. A set of Claude-powered agents built specifically for financial analysts, with native connectors to LSEG, Moody’s, S&P Global, and Morningstar. This is not a generic chat assistant pointed at a finance prompt. The Claude finance agents are a packaged product with the data plumbing already done.

I have been watching this trend build for months. This release makes it impossible to ignore.

What Anthropic actually shipped

Three agents wrapped around specific analyst workflows. One handles modeling. One handles due diligence. One handles comparable company analysis. Each one is a Claude agent with a defined scope, a system prompt tuned to that workflow, and direct connectors into the data providers analysts already pay for.

The connector list is the part that matters. LSEG for market data. Moody’s for credit. S&P Global for company financials. Morningstar for funds. These are not scraped sources. They are licensed enterprise feeds. Anthropic did the integration work that every internal team building a finance copilot would otherwise have to do themselves.

You can read the full announcement on Anthropic’s site. The pricing structure is enterprise. The target user is clear. This is not a consumer move.

Why this release matters beyond finance

For two years the pitch from foundation model vendors was: here is the model, build whatever you want. That is over.

Now the pitch is: here is the model, here is the agent, here are the connectors, here is the workflow. The vendor is moving up the stack into the application layer. Anthropic is doing it for finance. Microsoft is doing it for general enterprise productivity. OpenAI is doing it for coding and research. The pattern is consistent. Anthropic launching an enterprise AI services arm was an early signal of exactly this direction.

This changes the build-vs-buy math in a real way. If you are an enterprise team that was about to spend six months building a Claude-based comparable company analysis agent on top of a generic platform, you now have to ask whether your custom version will actually beat what Anthropic ships out of the box. Most of the time, in the specific domains where these vertical agents land, the answer will be no.

That does not mean custom builds are dead. It means the line moves. Custom builds make sense where the vendor product does not exist or does not match your specific data and policies. Generic finance modeling? Probably not worth building. Your firm’s specific deal screening logic with your proprietary scoring model? Still custom.

The other thing this release confirms is that tool design is product design. I have written before that agentic workflows live or die on the quality of their tool layer. Anthropic clearly figured this out. Wrapping LSEG and S&P data with proper structured outputs that Claude can reason about is the actual hard work. Anyone who has tried to build this on top of raw connectors knows.

This specialization pattern also raises a real architectural question: when the vendor ships a domain-specific agent, does your orchestration layer treat it as a peer, a sub-agent, or a replacement? That is the same question I work through in when to build a multi-agent system instead of a single agent.

What I would do with it this week

I do not work in finance, so I am not deploying this in production. But here is what I would do if I were on a finance team, and what I am doing in adjacent domains.

First, audit any internal agent project that overlaps with what Anthropic just shipped. If a team has been building a comparable company analysis tool for four months and Anthropic just released one, that conversation has to happen now, not in Q3.

Second, look at the connector list and ask which of those data sources your team already licenses. The value of Claude Finance Agents drops fast if you do not have LSEG or S&P feeds. Vendor lock through data integration is the real moat here.

Third, think about what the equivalent looks like in your domain. If Anthropic shipped finance agents in May 2026, what does an HR agent product look like? A legal one? A procurement one? Someone is building each of these. Probably more than one someone. In my experience, the teams that win the build-vs-buy decision are the ones that ask the question early, not the ones that finish their custom build and then discover the vendor product. The same specialization logic is visible in Microsoft Discovery as the first real glimpse of domain-specific agent platforms.

For Power Platform builders, this is also a useful signal. Copilot Studio is Microsoft’s answer to the same trend, and the business skills work in Dataverse is the integration layer equivalent. The shape of the market is clear.

The era of generic agent platforms competing on model quality alone is closing. The next round is about who owns the workflow.

Source: Agents for financial services (Anthropic).

May 5, 2026
Microsoft Just Shipped Business Skills in Dataverse and This Is How You Teach Agents Your Org

Microsoft announced business skills in Dataverse on May 1, and this is the announcement I have been waiting for. Dataverse business skills for agents let you encode org processes, policies, and the tribal knowledge that lives in people’s heads as natural-language instructions. Agents discover them and follow them at runtime. No more cramming everything into a 4000-token system prompt and hoping the model remembers how your finance team handles approvals.

I have been reading the docs since Friday. Here is my honest take.

What business skills actually do

A business skill is a Dataverse record. It contains a natural-language description of when the skill applies, what the agent should do, and what data or actions it can use. Agents query Dataverse at runtime, find the skills that match the user’s intent, and follow the instructions inside.

The shape matters. You are not writing code. You are writing the kind of paragraph you would send to a new hire on day one. Things like: When someone asks about expense approvals over 5000 EUR, route to the regional finance lead, never to the team manager. The lookup table is in the Finance Approvers table. Always confirm the amount and the cost center before submitting.

That description is stored, versioned, and indexed. Multiple agents can use the same skill. You update the skill once and every agent that discovers it picks up the new behavior. There is also a permission layer, so a skill can be scoped to a security role, a team, or an environment.

Underneath, this is grounding. The agent does not memorize your org. It retrieves the relevant skill at runtime and follows it.
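
To make the retrieve-then-follow idea concrete, here is a deliberately naive sketch. It is not the Dataverse schema or Microsoft's matching logic: the `BusinessSkill` fields are assumptions, and word overlap stands in for whatever retrieval the platform really uses.

```python
from dataclasses import dataclass


@dataclass
class BusinessSkill:
    name: str
    applies_when: str   # natural-language description of when the skill applies
    instructions: str   # what the agent should do
    owner: str
    version: int = 1


def _words(text):
    return {w.strip(".,?!").lower() for w in text.split()}


def select_skill(question, skills):
    """Pick the skill whose applies_when text shares the most words with the question."""
    if not skills:
        return None
    scored = [(len(_words(question) & _words(s.applies_when)), s) for s in skills]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score > 0 else None
```

Even this toy version shows the property that matters: the skill text lives outside the prompt, so changing `instructions` changes behaviour for every agent that selects it.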

Why this changes how you build internal agents

The hardest part of deploying internal agents has never been the model. It has been getting the agent to behave like someone who actually works at your company. The model can reason. It cannot know that your procurement policy changed in March, or that the Madrid office handles APAC tickets on Wednesdays because of a coverage gap.

Until now, that context lived in three bad places. System prompts that grew until they hit the context window and started degrading. Power Automate flows with hardcoded business logic that nobody could find six months later. Or worse, it lived nowhere and the agent guessed.

I have written before that a focused 400-token instruction set produces more reliable behavior than a 4000-token one. Business skills make that practical. You stop stuffing the prompt and start composing skills. The agent picks the right ones for the job.

The other thing this fixes is ownership. A business skill in Dataverse has an owner, an audit trail, and a lifecycle. When the policy changes, the policy owner updates the skill. They do not need to find the agent maker, file a ticket, or wait for a release. That is a real architectural shift, not a feature flag. If your org is still working out who owns what when policies change, Power Platform governance that does not kill adoption covers how to structure that before it becomes a cleanup problem.

The risk I am watching: skill sprawl. If every team writes their own skills with overlapping scopes, the agent will face the same routing problem multi-agent setups face. Skill descriptions will start competing with each other and you will get silent misrouting. Governance has to come early, not as a cleanup project in month six.

What I would do with it this week

Pick one painful, well-bounded process. The kind where the answer is always "it depends on who you ask". Approval routing is a good candidate. Onboarding checklists work too.

Write three to five business skills that capture the rules. Keep each one short and specific. Connect them to a Copilot Studio agent that already has the right Dataverse and Power Automate connectors. Test with the messy questions, not the clean ones. Watch which skills get picked and which do not.

The thing to measure is not whether the agent answers correctly. It is whether the right skill was selected for the right phrasing. If selection is unreliable, your skill descriptions are too similar or too vague. Rewrite them and try again. If you are also wiring up custom connectors to extend what the agent can reach, How to Build a Custom Connector for Copilot Studio Step by Step is worth keeping open in a tab.
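
A small harness for that measurement might look like the sketch below. It assumes you can record which skill was chosen for each test phrasing; `select` is a stand-in for however you capture that, not a Copilot Studio API.

```python
from collections import Counter


def selection_report(cases, select):
    """Score skill selection.

    cases:  list of (phrasing, expected_skill_name) pairs
    select: callable mapping a phrasing to the chosen skill name, or None
    """
    confusions = Counter()
    hits = 0
    for phrasing, expected in cases:
        chosen = select(phrasing)
        if chosen == expected:
            hits += 1
        else:
            confusions[(expected, chosen)] += 1
    return {"accuracy": hits / len(cases), "confusions": confusions.most_common()}
```

The confusion pairs are the useful part. A pair that shows up repeatedly points at two skill descriptions that are too similar and need rewriting.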

I will be writing more about this once I have run a real internal pilot. Early signals are good. In my experience, the patterns that survive contact with production are the ones where context is stored where it belongs, not where it was convenient at build time.

This one belongs in Dataverse. Finally.

Source: Introducing business skills: Teach agents how your organization works (Microsoft Power Platform Blog).

May 5, 2026
When should I build a multi-agent system instead of a single agent?

Short answer: stay single-agent until you hit one of three specific failure modes. Tool overload past roughly 8 to 10 connectors. Conflicting system prompts that cannot be reconciled in one instruction set. Or genuinely parallel workstreams that need to run at the same time. If none of those apply, the single agent vs multi-agent question is already answered. Build one agent.

Most posts on this topic make multi-agent sound like the natural next step. It is not. It is a tax you pay when a single agent can no longer do the job, not an upgrade you take because it sounds more sophisticated.

The longer answer

I read a good piece on Towards Data Science walking through ReAct workflows and when scaling to multi-agent makes sense. The framing matched what I keep running into when I talk to people building on Copilot Studio and similar platforms.

A single agent is a loop. It reads, picks a tool, calls the tool, reads the result, picks again, until it decides it is done. That loop works well when the tool list is small enough for the model to reason over cleanly and the instructions do not pull the model in two directions.
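
In code, that loop is short. The sketch below is schematic: `choose_action` stands in for the model call, and the action dictionary shape is an assumption made for this example rather than any framework's format.

```python
def run_agent(task, choose_action, tools, max_steps=10):
    """Sequential loop: choose a tool, call it, read the result, repeat until done."""
    observations = []
    for _ in range(max_steps):
        action = choose_action(task, observations)  # stand-in for the model call
        if action["type"] == "finish":
            return action["answer"]
        result = tools[action["tool"]](**action["args"])
        observations.append((action["tool"], result))
    raise RuntimeError("agent did not finish within max_steps")
```

Every iteration waits for the previous tool result before choosing again, which is why the loop is sequential by construction.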

It starts breaking in predictable places.

The first is tool overload. I have written before about how a model that hits 95 percent accuracy on tool selection with two connectors can drop to 70 percent with five, because tool descriptions start competing with each other. By the time you have ten or twelve tools, the agent picks the wrong one regularly and you cannot fix it with prompt tweaks.

The second is prompt conflict. If your agent needs to behave like a strict policy checker for one task and a friendly explainer for another, those two personas fight inside one system prompt. You can feel it in the outputs. The model compromises in the wrong direction.

The third is parallelism. A single agent loop is sequential by design. If you have three independent workstreams that must run at the same time, no amount of prompt engineering will make a single ReAct loop parallel. This is also where thinking through whether AI automation is even the right fit versus a simpler RPA approach becomes worth the time.

Everything else (latency, observability, prompt size) can usually be solved without splitting agents.

How to decide in practice

I use a short checklist when someone asks about single agent vs multi-agent for a Copilot Studio build.

Count the tools. If you are under eight connectors and the descriptions do not overlap, one agent is fine. Past ten with overlap, start thinking about splitting.

Read the system prompt out loud. If it contains contradictory instructions for different scenarios, that is a real signal. Splitting reduces prompt size per agent, and a focused 400-token instruction set produces more reliable behavior than a 4000-token one. I covered this in more detail in my post on multi-agent orchestration patterns in Copilot Studio.

Map the workstreams. Are they actually independent, or are they sequential steps you are calling parallel because it sounds nicer? Most automation work is sequential. Real parallelism is rarer than people think.

Budget the latency. Every hop between agents adds round-trip overhead. If you split a single agent into three, you have just added two more model calls and two more HTTP boundaries to every request. I have written about how accumulated round-trip overhead kills perceived performance long before any single call gets slow.
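The checklist can be written down as a few explicit checks. This is a sketch under my own assumptions: the `AgentDesign` fields and function names are invented for illustration, the "past ten with overlap" threshold is taken from the rule of thumb above, and the hop count uses the same simple model as the latency point (splitting into N agents adds N - 1 hops).

```python
from dataclasses import dataclass


@dataclass
class AgentDesign:
    tool_count: int
    overlapping_descriptions: bool
    contradictory_instructions: bool
    independent_parallel_streams: int


def split_signals(design):
    """Return the checklist items that point toward splitting the agent."""
    signals = []
    if design.tool_count > 10 and design.overlapping_descriptions:
        signals.append("tool overload")
    if design.contradictory_instructions:
        signals.append("prompt conflict")
    if design.independent_parallel_streams > 1:
        signals.append("parallel workstreams")
    return signals


def extra_hops(agent_count):
    """Extra model calls per request, assuming N agents add N - 1 hops."""
    return max(agent_count - 1, 0)


small_build = split_signals(AgentDesign(6, False, False, 1))
large_build = split_signals(AgentDesign(12, True, True, 3))
```

An empty list means the answer is still one agent. A non-empty list tells you which problem you are actually paying the multi-agent tax to solve.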

If the checklist points to multi-agent, default to a supervisor pattern. One parent agent routes to focused child agents. Skip the peer network where agents call each other freely. It looks elegant in diagrams and is painful to debug in production.
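As a rough illustration of the supervisor shape, here is a sketch in plain Python. It is not Copilot Studio's orchestration API: the `Supervisor` class, the keyword router, and the child names are hypothetical, and a real router would be a model call driven by the child descriptions.

```python
class Supervisor:
    """Parent agent that routes each request to exactly one child agent.

    Children never call each other; every hop goes through the parent,
    which keeps the call graph a star instead of a mesh.
    """

    def __init__(self, route, children):
        self.route = route          # hypothetical: request -> child name
        self.children = children    # child name -> callable(request)
        self.trace = []             # (request, child name) pairs for debugging

    def handle(self, request):
        name = self.route(request)
        if name not in self.children:
            raise KeyError(f"router picked unknown child: {name!r}")
        self.trace.append((request, name))
        return self.children[name](request)


# Toy router and children, for illustration only.
def keyword_route(request):
    return "policy_checker" if "allowed" in request.lower() else "explainer"


supervisor = Supervisor(
    keyword_route,
    {
        "policy_checker": lambda r: "policy: needs manager approval",
        "explainer": lambda r: "explanation: here is how the process works",
    },
)
reply = supervisor.handle("Is a first-class ticket allowed?")
```

Because every routing decision passes through one place, the trace gives you a single list to read when something goes to the wrong child. A peer network has no equivalent.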

Microsoft has shipped real multi-agent orchestration in Copilot Studio, so the platform support is there. The question is whether your problem actually needs it.

Related gotchas

Routing in Copilot Studio multi-agent setups depends on the description you write for each connected agent, not on trigger phrases. A vague description causes silent misrouting that is harder to debug than a broken trigger. Write descriptions like API contracts, not marketing copy.

When a parent agent picks the wrong child confidently, you get the same failure mode as a single overloaded agent, just one layer deeper. Splitting agents does not eliminate misrouting. It moves it.

Token costs multiply faster than you expect. Each agent in the chain re-processes context. Three agents in a sequence is not three times the cost. It is often closer to five or six times once you count the context each one needs to reason properly.
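A back-of-the-envelope model shows why the multiplier runs ahead of the agent count. The sketch below counts input tokens only and assumes each agent in the chain reads its own instructions, the original request, and every earlier agent's output. The token figures are made up for illustration, not measurements.

```python
def chain_input_tokens(agent_count, system_tokens, request_tokens, output_tokens):
    """Input tokens read across a sequential chain where each agent sees
    its own instructions, the original request, and every earlier output."""
    total = 0
    context = request_tokens
    for _ in range(agent_count):
        total += system_tokens + context
        context += output_tokens
    return total


# Illustrative numbers only, not measurements.
single = chain_input_tokens(1, system_tokens=400, request_tokens=600, output_tokens=800)
chain = chain_input_tokens(3, system_tokens=400, request_tokens=600, output_tokens=800)
ratio = chain / single  # 5400 / 1000 = 5.4
```

With these assumed numbers, three agents read 5.4 times the input of one pass through a single agent. The exact figure moves with your prompt and output sizes, but the growth comes from the accumulated context, not from the agent count alone.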

If you are still on the fence, build single first. You can always split later. Going from multi-agent back to single, on the other hand, almost never happens once the architecture is in place. That is the trade-off worth keeping in mind.

Frequently Asked Questions

When should I use a single agent vs multi-agent system?

Stick with a single agent until you run into one of three problems: too many tools causing the model to pick the wrong one, conflicting instructions that cannot coexist in one system prompt, or parallel workstreams that need to run at the same time. If none of those apply, a single agent is the right choice.

How do I know if my agent has too many tools?

A good rule of thumb is to keep your tool count under eight to ten connectors. Beyond that, tool descriptions start competing with each other and the model’s ability to select the right one drops noticeably, even if prompt tweaks do not seem to help.

Why does a single agent struggle with conflicting instructions?

When a system prompt asks the model to take on two opposing behaviours, such as acting as a strict policy checker and a friendly explainer, those personas create tension within a single instruction set. The model tends to compromise in ways that produce unreliable outputs rather than handling each mode correctly.

What are the main reasons to build a multi-agent system?

Multi-agent systems are worth the added complexity when you need genuinely parallel workstreams, have irreconcilable prompt conflicts, or are dealing with tool overload that cannot be resolved through better descriptions. Outside of those scenarios, the extra coordination overhead rarely pays off.

Source: Towards Data Science.

May 4, 2026
Anthropic Is Launching an Enterprise AI Services Arm and That Changes the Vendor Conversation

Anthropic announced it is standing up a dedicated enterprise AI services company to help large organizations deploy Claude in production. This is the kind of move that does not look loud on a Monday but reshapes how conversations about Anthropic enterprise AI services go inside large orgs for the next two years.

Until now, if you wanted hands-on Anthropic help inside your organisation, you went through a partner, hired a boutique, or figured it out yourself. That is changing.

What Anthropic actually announced

Anthropic is launching a services arm focused on enterprise deployment. Not just API access. Not just Claude for individuals. A real services org built to sit next to enterprise teams and help them stand up Claude inside production environments.

That means architecture work, integration help, deployment patterns, and the kind of hands-on engagement that used to belong exclusively to Microsoft Consulting Services, Accenture, Deloitte, and the big SI bench. Anthropic is now in that conversation directly.

It is worth being precise here. This is not Anthropic becoming a generic consultancy. The framing is narrower: help enterprises actually deploy Claude in production for high-value use cases. That is a much sharper offer than “AI transformation,” and it is the kind of focus that tends to ship working systems instead of slide decks.

Why this matters for enterprise AI buyers

For most of 2024 and 2025, if you were building AI automation inside a large enterprise, your reference architecture defaulted to the Microsoft stack. Azure OpenAI, Copilot Studio, Power Platform, the whole vertical. That is not because Microsoft is always the best fit. It is because Microsoft has the procurement story, the EA discount, the field engineers, and the reference architectures already on the shelf.

Anthropic just moved to close that gap.

A credible second source for enterprise AI deployment work changes three things in real procurement conversations. First, you can now run a genuine bake-off where the non-Microsoft option has hands-on support, not just a model endpoint. Second, reference architectures stop being Microsoft-shaped by default. Third, the negotiation leverage shifts. When the only credible vendor is also your cloud provider, your CRM, your collaboration suite, and your AI copilot, you are not really negotiating.

I have written before about why the Bedrock vs direct Anthropic API question is a governance decision, not a model decision. This announcement is the next step on that same line. Anthropic is acknowledging that getting Claude into a regulated enterprise is not a model problem. It is a deployment problem. They are now staffing for that reality.

The honest part: services orgs at model companies are hard. The talent market for AI engineers who can also navigate enterprise IT is brutal. Anthropic will get pulled into pre-sales work, RFP responses, and stakeholder meetings that consume engineering capacity. Whether they keep the focus tight or drift into generic consultancy is the open question.

What I would do with this news this week

If you are anywhere near AI procurement or architecture decisions, three concrete things.

First, put Anthropic on your shortlist for the next AI deployment review. Not as a model. As a deployment partner. The conversation is now legitimately different from “call the AWS rep about Bedrock.”

Second, revisit your reference architectures. If yours quietly assumes the Microsoft stack from end to end, write down why. Some of those reasons will hold up. Some will turn out to be “because that is what the last project did.” Those are the ones to challenge.

Third, if you are a Power Platform shop, this does not mean ripping anything out. Copilot Studio, Power Automate, and the Microsoft surface area are still where citizen development happens. But the heavy orchestration brain behind your agents does not have to be Azure OpenAI by default. I have been thinking about this for months, and as I covered in “Claude vs ChatGPT Is the Wrong Question When You Are Building Automations,” the model choice sitting behind your flows is less important than the deployment and governance story around it. Claude is now a real option with real support behind it.

The interesting enterprise AI conversations for the rest of this year are not going to be about which model wins a benchmark. They are going to be about who shows up when you need to ship. From what I see in the community, that is exactly the conversation Anthropic just inserted itself into.

And if the services arm delivers on its framing, it will also change how multi-agent orchestration patterns get designed in the first place, because you will finally have an Anthropic-native team in the room when those architecture decisions get made.

The vendor field just got more interesting.

Source: Building a new enterprise AI services company with Blackstone, Hellman & Friedman, and Goldman Sachs (Anthropic).

May 4, 2026
Microsoft Open Sourced the Azure Integrated HSM Design and That is a Bigger Deal Than It Sounds

Microsoft open-sourced the design of the Azure Integrated HSM. The Azure Integrated HSM open-source release is not a marketing move dressed up as transparency. It is the hardware security module that sits in Azure silicon and anchors key protection for workloads running on top, and the design is now public for anyone to read, audit, and pick apart.

I have been reading through the announcement and the surrounding material for a couple of days. My honest first take: this matters more than the headline suggests, especially if you are building anything agentic that touches sensitive data.

What it actually does

Azure Integrated HSM is a hardware security module Microsoft designed in-house and integrates into Azure servers. The job of an HSM is narrow and important. It generates, stores, and uses cryptographic keys inside tamper-resistant hardware so that the keys never leave the chip in plaintext. Encryption, signing, and key wrapping happen inside the module. The application above it gets the result, not the key.

What shipped this week is the design itself. Schematics, firmware interfaces, the cryptographic boundary, the attestation flow. Open-sourced for review. Not the silicon, the design.

This sits underneath services people actually use. Azure Key Vault Managed HSM, confidential computing workloads, the key material protecting storage and databases, and increasingly the trust roots for AI inference where prompts and outputs cannot be exposed to the host. If you have ever clicked “customer-managed key” on an Azure resource, something like this was already in the path. The shift is that you can now read how it works.

Why it matters

Cloud trust has been a faith-based exercise for a long time. You read the compliance certifications, you trust the vendor, you move on. That worked when the workloads were a SQL database and a web app. It works less well when the workload is an agent making autonomous decisions over sensitive data, calling tools, and producing outputs that have to be cryptographically attributable.

Open-sourcing the HSM design changes the trust model from “Microsoft says it is secure” to “here is the design, run it past your own cryptographers.” That is a real shift. Apple did something similar with Private Cloud Compute in 2024, publishing the design and inviting external researchers in. The pattern is becoming the bar for any infrastructure provider that wants to host AI workloads with sensitive data.

The other reason it matters: agentic workloads will multiply the number of cryptographic operations per user request by an order of magnitude. Every tool call that needs a signed token, every cross-service hop that needs an attestation, every model output that needs to be tied back to a verified context. The HSM is no longer a sleepy compliance box. It is in the hot path.

I have written before about latency in agentic workflows. Cryptographic operations are part of that budget. Knowing how the hardware actually works, and being able to reason about what it costs per call, stops being academic.

What I would do with it this week

I am not going to pretend I will sit down and audit silicon firmware this week. I will not. But there are concrete things worth doing if you build on Azure and you care about where your keys live.

First, read the design document end to end. Even at a surface level, understanding the attestation flow, the key hierarchy, and the boundary between firmware and host gives you a much better mental model when you are reasoning about Key Vault, Managed HSM, and confidential computing. The Managed HSM docs become much more useful once you can picture what is underneath.

Second, look at where in your current architecture you are accepting hardware-rooted trust on faith. If you are building Power Platform solutions that pull from sensitive data sources, the keys protecting that data sit in this stack. Decisions about who owns and governs that data access matter too, something I covered in “Power Platform Governance That Does Not Kill Adoption.” If you are building Copilot Studio agents that call into systems holding regulated content, your trust chain runs through here. Knowing the chain is the first step to defending it in a design review.

Third, watch how the community responds. Open-sourcing a design only matters if people actually look. The interesting signal over the next few months will be what independent researchers find, what they push back on, and how Microsoft responds. That conversation is more informative than any vendor whitepaper.

For a deeper dive into the rationale, the Azure blog post is the place to start. My own running notes on infrastructure shifts like this end up on my LinkedIn as I work through them.

Inspectable infrastructure is becoming the floor for serious AI workloads, and this release nudges that floor higher. The broader question of who owns the decision when agents act autonomously over that infrastructure is the next thing worth thinking through.

Source: Enforcing trust and transparency: Open-sourcing the Azure Integrated HSM (Azure Blog).

May 4, 2026
Inside a Power Platform Center of Excellence: Why Most Setups Stall in Month Three

Most people think a Power Platform Center of Excellence setup works like installing a product. You import the CoE Starter Kit solution, run the setup wizard, point it at your tenant, and the dashboards fill up. Job done.

That is the surface behaviour. The actual mechanism underneath is a chain of dependencies, sync jobs, and admin connector calls that quietly degrade if any one link breaks. I keep seeing teams hit this on LinkedIn and in conversations with people at other organisations. The kit looks healthy for six weeks, then the inventory stops matching reality and nobody knows why.

Let me walk through what is actually happening underneath.

What you see on the surface

You install the CoE Starter Kit, the wizard provisions a Dataverse environment, and a set of cloud flows starts populating tables like Environments, Apps, Flows, and Makers. The Power BI dashboard lights up. You see a maker count, an app count, an orphaned resource list.

From the outside, it looks like the kit is scanning your tenant. It is not scanning anything in real time. Every number you see is the result of scheduled flows that ran sometime in the last 24 hours, hit admin connectors, paginated through results, and wrote rows into Dataverse. The dashboard is just a read on that table.

This matters because the moment those flows stop succeeding, your dashboard stops being true. And it does not tell you it stopped being true.

The underlying mechanism

The CoE kit runs on a stack of sync flows. The most important ones are Admin Sync Template v3 (environments), Admin Sync Template v4 (apps and flows), and the maker activity flows. Each one authenticates as the service account you set up during install and calls the Power Platform for Admins, Power Apps for Admins, and Power Automate Management connectors.

Three things have to be true for those flows to keep working. The service account needs an active Power Platform Administrator or Global Administrator role. The account needs a per-user Power Automate licence with the right premium entitlements, because the admin connectors are premium. And the account needs to not be hitting throttling limits while paginating through a tenant with thousands of resources.

The CoE sync flows are exactly the kind of workload that hits both platform-level and connector-level throttling, because they loop through every environment and every app in the tenant in one run. Getting your Power Automate error handling patterns right matters here: transient throttling errors need to be caught and retried differently from terminal failures, or the sync silently drops data.
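The split between transient and terminal errors is easier to see in code than in a flow designer. The sketch below is language-neutral logic written in Python, not Power Automate syntax: `ConnectorError`, the status set, and the backoff values are assumptions chosen for illustration. In a real flow the equivalent lives in retry policies and run-after branches.

```python
import time

# Assumed set of retryable statuses; adjust to what your connectors return.
TRANSIENT_STATUS = {429, 502, 503, 504}


class ConnectorError(Exception):
    def __init__(self, status, retry_after=None):
        super().__init__(f"connector returned {status}")
        self.status = status
        self.retry_after = retry_after


def call_with_retry(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry throttling-style errors with backoff; re-raise everything else."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ConnectorError as err:
            if err.status not in TRANSIENT_STATUS or attempt == max_attempts:
                raise  # terminal failure, or out of attempts: fail loudly
            delay = err.retry_after or base_delay * 2 ** (attempt - 1)
            sleep(delay)


# Demo: a call that is throttled twice, then succeeds. Delays are recorded
# instead of slept so the example runs instantly.
delays = []
attempts = {"count": 0}


def throttled_twice():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectorError(429)
    return "page of results"


result = call_with_retry(throttled_twice, sleep=delays.append)


def forbidden():
    raise ConnectorError(403)


try:
    call_with_retry(forbidden, sleep=delays.append)
    terminal_status = None
except ConnectorError as err:
    terminal_status = err.status
```

The 429 is retried with growing delays and eventually succeeds. The 403 is raised on the first attempt, which is what you want: a licensing or permission failure should stop the run and alert someone, not disappear into retries.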

Where it breaks

The most common failure mode is not the install. It is month three.

The service account password expires, or MFA gets enforced tenant-wide, or someone removes the admin role because of a security review. The flows start failing silently. Default retry logic masks it for a week or two. Then the runs hit timeout and stop entirely. The dashboard freezes on stale data, but the numbers still look plausible, so nobody notices.

The second failure mode is scale. The kit was designed for small to medium tenants. If you have 40,000 apps and 80,000 flows across hundreds of environments, the sync flows do not finish inside the 30-day Dataverse retention window for run history. You lose visibility into your own automation.

The third one is the licensing trap. Teams install the kit on a trial, then move to production without giving the service account a proper premium licence. The flows technically run, but premium connectors throw 403s on specific calls, and only some tables populate. Half the dashboard works. The other half lies.

What this means for how you build it

Treat the CoE as a product you operate, not a kit you install. That changes a few decisions.

Use a dedicated service principal with certificate auth where the connectors support it, instead of a user account with a password. A service principal has no user password to expire, is not prompted for MFA, and does not get caught in a leaver process, although its certificate still has an expiry date that needs an owner. Where you must use a user account, document it, monitor it, and put the password rotation in a runbook owned by a real team.

Build a health check flow that runs daily and alerts when the last successful sync timestamp on each core table is older than 48 hours. Do not trust the dashboard to tell you the dashboard is broken.
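The core of that health check is a single comparison per table. Here is a minimal sketch of the logic, assuming you can already read a last-successful-sync timestamp for each core table. The table names and timestamps are illustrative, and in practice this would be a cloud flow rather than a script.

```python
from datetime import datetime, timedelta, timezone


def stale_tables(last_success, now, max_age=timedelta(hours=48)):
    """Return table names whose last successful sync is older than max_age.

    last_success maps a table name to a timezone-aware datetime, or None
    if the table has never synced successfully.
    """
    stale = []
    for table, synced_at in sorted(last_success.items()):
        if synced_at is None or now - synced_at > max_age:
            stale.append(table)
    return stale


now = datetime(2026, 5, 4, 8, 0, tzinfo=timezone.utc)
last_success = {
    "Environments": now - timedelta(hours=6),
    "Apps": now - timedelta(hours=72),
    "Flows": None,
}
alerts = stale_tables(last_success, now)  # ["Apps", "Flows"]
```

Treating a missing timestamp as stale matters. A table that never populated is the licensing-trap case, and it should alert just as loudly as one that stopped.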

For larger tenants, split the sync flows by environment group instead of running them tenant-wide. The kit supports filtering, and partial visibility refreshed daily beats full visibility refreshed never.

Decide what governance question the CoE is actually answering for you before you build dashboards on top of it. Inventory is not governance. A list of 12,000 apps with no owner attached is just a longer problem. The broader challenge of Power Platform governance that does not kill adoption is worth thinking through before you design your DLP and ownership policies around what the CoE surfaces, because the data is only useful if makers trust the system enough to stay inside it.

The CoE Starter Kit is genuinely good engineering. It just is not magic. If you are starting to build out more automation on top of your tenant inventory, the question of why Power Automate is still worth learning in 2026 is a good framing for where to focus the team’s time once the CoE is stable. If you want to compare notes on how other teams are running theirs, I am always up for that conversation.

Frequently Asked Questions

Why does my Power Platform center of excellence setup stop working after a few weeks?

The CoE Starter Kit relies on scheduled sync flows that call admin connectors on a recurring basis. If the service account loses its licence, hits throttling limits, or has a permission issue, those flows fail silently and your dashboards show stale data without any obvious warning.

What licences and permissions does the CoE Starter Kit service account need?

The service account requires either a Power Platform Administrator or Global Administrator role, plus a per-user Power Automate licence that covers premium connectors. Without the premium entitlement, the admin connector calls used by the sync flows will not run.

How do I know if my CoE sync flows have stopped running correctly?

The dashboards will not alert you automatically when sync flows fail, so you need to monitor flow run history directly. Comparing your app and environment counts against known tenant activity over time is a practical way to spot when the inventory has drifted from reality.

Why does the CoE Starter Kit struggle with throttling on large tenants?

The sync flows paginate through every environment and every app in a single run, which generates a high volume of connector calls in a short period. This makes them prone to both platform-level and connector-level throttling, so transient errors need to be handled with retries rather than treated as permanent failures.

May 3, 2026

RPA vs AI Automation for Enterprise Workflows

The decision I keep watching teams get wrong: should this workflow be built with RPA or with an AI agent. The RPA vs AI automation debate gets framed as old tech versus new tech, which is the wrong frame entirely. They solve different problems. Picking the wrong one is how you end up with a fragile bot that needs babysitting or an agent that hallucinates its way through invoice approvals.

I have built both inside a large org. Here is how I actually decide.

Determinism and predictability

RPA assumes the screen, the field, and the click path are the same every time. If the SAP transaction code is VA01 today and VA01 tomorrow, RPA wins. It will execute that path 10,000 times with zero variance.

AI automation assumes variance is the input. The email phrasing changes, the PDF layout changes, the customer asks the same thing five different ways. An agent reasons over that variance. It is non-deterministic by design, which is a feature for unstructured input and a liability for structured execution.

Rule of thumb I use: if I can write the decision tree on a whiteboard in 15 minutes, it is RPA work. If the decision tree has more than 30 branches and half of them are “it depends on the wording,” it is agent work.

Cost per execution

Dimension	RPA (Power Automate Desktop)	AI Agent (Copilot Studio)
Per-run cost	Near zero after license	Roughly 1 message credit per turn, often 5 to 15 turns per task
License model	Per-bot or per-user attended/unattended	Message packs, 25,000 messages per pack
Scaling cost	Linear with bot count	Linear with conversation volume and tool calls
Failure cost	Bot stops, you fix it	Agent confidently completes the wrong task

RPA at 100,000 runs a month is basically free compute after the license. An agent at 100,000 runs is not. I have seen teams underestimate this by an order of magnitude because they tested with 50 runs and extrapolated linearly without counting tool calls and orchestration turns.
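The extrapolation mistake is easy to reproduce with the figures from the table above: roughly 1 message credit per turn, 5 to 15 turns per task, 25,000 messages per pack. The function name is mine and no prices are assumed. The sketch only counts messages and packs.

```python
import math


def monthly_message_packs(runs_per_month, turns_per_run, credits_per_turn=1,
                          pack_size=25_000):
    """Estimate messages and message packs needed for a month of agent runs."""
    messages = runs_per_month * turns_per_run * credits_per_turn
    return messages, math.ceil(messages / pack_size)


pilot = monthly_message_packs(50, 5)        # (250, 1)
low = monthly_message_packs(100_000, 5)     # (500000, 20)
high = monthly_message_packs(100_000, 15)   # (1500000, 60)
```

A 50-run pilot fits inside a single pack with room to spare, so it tells you almost nothing about cost. At 100,000 runs the same workload needs somewhere between 20 and 60 packs depending on turns per task, which is the spread worth measuring before you commit.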

Maintenance and brittleness

RPA breaks when the UI changes. A vendor pushes a new SAP Fiori update, three selectors shift, your bot fails at 3am. I have lived this. The fix is usually 30 minutes, but you need someone on call who knows the bot.

AI agents break differently. They do not fail loudly. They drift. The model provider updates, your prompt that worked last month now produces a slightly different output format, and downstream parsing silently fails. I wrote about this in my agentic workflow post. The failure mode is worse because users find out three days later when the wrong invoice gets paid. If you are building flows that sit underneath an agent, Power Automate error handling patterns that actually work will save you from the silent failures that surface weeks after go-live.

RPA maintenance is reactive and obvious. Agent maintenance is proactive and requires evaluation infrastructure most teams do not build.
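The smallest piece of that evaluation infrastructure is a strict check on the agent's output shape before anything downstream parses it. The sketch below assumes the agent returns JSON with three fields. The field names and types are hypothetical, chosen to match the invoice example.

```python
import json

# Hypothetical contract for the agent's reply.
REQUIRED_FIELDS = {"po_number": str, "amount": float, "currency": str}


def parse_agent_output(raw):
    """Parse the agent's JSON reply and fail loudly if the shape has drifted."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(f"agent output is not valid JSON: {err}") from err
    if not isinstance(data, dict):
        raise ValueError("agent output is not a JSON object")
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            problems.append(f"wrong type for {field}: {type(data[field]).__name__}")
    if problems:
        raise ValueError("; ".join(problems))
    return data


good = parse_agent_output('{"po_number": "PO-88", "amount": 1250.5, "currency": "EUR"}')

# A drifted reply: renamed key, amount returned as a formatted string.
try:
    parse_agent_output('{"po": "PO-88", "amount": "1,250.50", "currency": "EUR"}')
    drift_error = None
except ValueError as err:
    drift_error = str(err)
```

This turns silent drift into a loud failure at the boundary, which is the property RPA gives you for free and agents do not.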

What the work actually looks like

This is the dimension nobody compares on. Look at the input.

Structured input, structured output, no judgment needed: RPA. Copying 200 rows from a legacy system into a SharePoint list, kicking off a daily report, screen-scraping a vendor portal that has no API. Boring, repetitive, deterministic. Power Automate Desktop handles this all day. If you are still deciding whether to invest time in the broader platform, “RPA is not the right tool for every repetitive task” is worth reading before you commit to a build.

Unstructured input, structured output, judgment needed: AI. Reading 500 supplier emails and extracting the PO number, classifying tickets by intent, summarizing a 40-page contract into five bullet points. This is where Copilot Studio or a custom agent earns its cost.

The hybrid case is the most common one and the one most teams miss. The agent reads the email, extracts the structured fields, then hands off to an RPA bot or a cloud flow that executes the deterministic part. The agent is the reasoning layer. RPA is the execution layer. They are not competitors. They are stacked.
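The stacking is simple to sketch. In the example below, `extract_fields` stands in for the agent and uses a regex purely so the code runs without a model. `execute_update` stands in for the RPA bot or cloud flow. All names and the ledger structure are hypothetical.

```python
import re


def extract_fields(email_body):
    """Stand-in for the agent step. A real build would call the model here;
    a regex is used only so the sketch runs without external services."""
    match = re.search(r"\bPO[- ]?(\d+)\b", email_body, flags=re.IGNORECASE)
    return {"po_number": f"PO-{match.group(1)}"} if match else None


def execute_update(fields, ledger):
    """Deterministic step: the same input always produces the same write."""
    ledger[fields["po_number"]] = "acknowledged"
    return ledger


def handle_email(email_body, ledger):
    fields = extract_fields(email_body)
    if fields is None:
        return "needs_human_review"
    execute_update(fields, ledger)
    return "done"


ledger = {}
status_ok = handle_email("Hi, any update on po 4471? Thanks", ledger)
status_unclear = handle_email("Following up on my last order", ledger)
```

The boundary between the two functions is the design decision. Everything above it may vary with wording, everything below it must not, and an email the reasoning layer cannot resolve goes to a person instead of into the execution layer.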

Governance and auditability

RPA logs are simple. Action ran, action succeeded, here is the screenshot. Auditors love this.

AI agents need decision logs, not just execution logs. You need to capture why the agent picked tool A over tool B. Most teams I talk to are not logging this and will get caught when the first compliance review hits. I covered this in “The Real Shift Is Not Faster Work It Is Who Owns the Decision.” Based on what I have built, this is the gap that bites you 6 months in, not on day one.
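A decision log does not need to be elaborate. The sketch below writes one JSON line per tool choice. The `DecisionRecord` fields are my assumption about what an auditor would ask for, not a standard schema, and the rationale would come from whatever explanation your agent platform exposes.

```python
import io
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    request_id: str
    chosen_tool: str
    candidate_tools: list
    rationale: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def log_decision(record, sink):
    """Append one decision as a JSON line; sink is any object with write()."""
    sink.write(json.dumps(asdict(record)) + "\n")


# In-memory sink for the demo; a real build would write to durable storage.
buffer = io.StringIO()
log_decision(
    DecisionRecord(
        request_id="req-001",
        chosen_tool="lookup_invoice",
        candidate_tools=["lookup_invoice", "lookup_order"],
        rationale="request mentions an invoice number",
        timestamp="2026-05-03T09:00:00+00:00",
    ),
    buffer,
)
logged = json.loads(buffer.getvalue())
```

Recording the candidates alongside the choice is what separates this from an execution log: it lets a reviewer see what the agent could have done, not just what it did.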

Choose RPA if / Choose AI if

Choose RPA if: the input is structured, the path is deterministic, the volume is high, the cost per run needs to be near zero, and the system has no API. This is most legacy integration work.

Choose AI automation if: the input is unstructured, the work requires classification or extraction or summarization, variance is the norm, and you have the evaluation discipline to catch silent drift.

Choose both if: you have a real workflow. Most enterprise automation is hybrid. The line is not RPA versus AI. It is figuring out which layer does what.

Frequently Asked Questions

What is the difference between RPA vs AI automation for enterprise workflows?

RPA is built for repetitive, predictable tasks where the process follows the same steps every time, while AI automation handles unstructured or variable inputs that require reasoning. They are not competing technologies but tools suited to different problems. Choosing the wrong one leads to either a fragile bot or an agent making confident mistakes.

When should I use RPA instead of an AI agent?

Use RPA when your process is consistent, rule-based, and can be mapped out as a clear decision tree. If the same fields, screens, or steps repeat thousands of times without variation, RPA will be faster, cheaper, and more reliable than an AI agent.

How do I know if AI automation is worth the cost for my workflow?

AI agents consume message credits per turn and most tasks require multiple turns, so costs scale quickly at high volumes. Before committing, calculate expected monthly runs and multiply by average turns per task, not just per conversation. Teams often underestimate this significantly when testing at small scale.

Why does RPA break so often in enterprise environments?

RPA relies on fixed UI selectors, so any interface update from a vendor can shift elements and cause the bot to fail. These failures are usually quick to fix but require someone familiar with the bot to be available when issues occur. Unlike AI agents, RPA fails loudly and immediately rather than silently producing wrong results.

May 1, 2026

Category: Artificial Intelligence in Business
