Tag: Agentic AI

  • Microsoft Discovery Is the First Real Glimpse of Domain-Specific Agent Platforms

    [Image: Microsoft Discovery's agentic R&D platform sitting above Copilot Studio in the enterprise agent stack]

    I came across the Azure Blog post about Microsoft Discovery expanding its preview, and it crystallised something I have been chewing on for months. Most enterprise AI conversation right now is stuck on horizontal agents. Generic copilots doing generic things across generic data. Microsoft Discovery, an agentic R&D platform, goes the other direction, and that direction is where the interesting architectural decisions are about to happen.

    What Microsoft Discovery Actually Is If You Skip the Marketing

    Strip the announcement language away and Discovery is a vertical agent platform shaped specifically for research and development workflows. It is not a chatbot. It is not Copilot Studio with a science skin. It is a purpose-built layer with domain primitives baked in: scientific data structures, simulation orchestration, multi-agent coordination tuned for R&D problems instead of generic enterprise tasks.

    The important word is shape. A horizontal agent platform gives you a blank canvas and a set of generic tools. A domain-shaped platform gives you a canvas where the grid lines already match the work. You give up flexibility. In return, when the shape fits, the build takes roughly a tenth of the time.

    Why Domain-Shaped Agent Platforms Beat Generic Copilots for R&D Workflows

    I have written before about how most agentic workflows are just fancy if/then logic in a trench coat. The reason is almost always the same. Teams use a general-purpose tool to model a domain it does not understand, then spend weeks bolting domain logic on top through prompts and tool definitions.

    R&D is the perfect example. A real research workflow involves hypothesis tracking, simulation runs, candidate scoring, lineage of why a decision was made three steps ago. None of that is native to a generic Copilot Studio agent. You can build it. I have seen people try. It ends up as a fragile stack of topics, variables, and Power Automate flows pretending to be a state machine.

    A domain-shaped platform encodes those primitives directly. The agent does not need a 4000-token system prompt explaining what a candidate molecule is, because the platform already knows. That is the productivity unlock, and it is also why I think we are about to see a lot more of these.
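
    To make "primitives" concrete, here is a minimal Python sketch of hypotheses, simulation runs, and decision lineage as first-class records instead of prompt text. Every class and field name is hypothetical and chosen for illustration. This is not Discovery's actual data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    id: str
    statement: str


@dataclass
class SimulationRun:
    id: str
    hypothesis_id: str  # which hypothesis this run was testing
    candidate: str
    score: float


@dataclass
class Decision:
    id: str
    summary: str
    based_on_runs: List[str]             # ids of the runs that justified it
    parent: Optional["Decision"] = None  # the earlier decision this one follows from

    def lineage(self) -> List[str]:
        """Summaries from the earliest ancestor down to this decision."""
        chain, node = [], self
        while node is not None:
            chain.append(node.summary)
            node = node.parent
        return list(reversed(chain))
```

    With records like these, "why was this decision made three steps ago" becomes a call to `lineage()` on the latest decision. On a generic platform, the same answer has to be reconstructed from variables and flow history.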

    How This Changes the Build vs Buy Decision for Power Platform Teams

    Here is the part Power Platform people should pay attention to. The skill that matters going forward is not how well you can build in Power Automate or Copilot Studio. It is picking the right altitude for the automation in front of you.

    I keep seeing teams default to building everything in Copilot Studio because that is the tool they know. Someone wants a research assistant. Someone wants a contract review agent. Someone wants a finance close helper. All of it gets crammed into Copilot Studio topics and custom connectors, and six months later the build is brittle, slow, and three people deep in technical debt. If you are just getting started with Copilot Studio in 2026, skip the chatbot tutorials entirely and learn to think in terms of orchestration first.

    The decision tree is going to look more like this:

    • Is there a domain-shaped platform that already models this work? Use it. Customise on top.
    • Is the workflow generic but cross-system? Copilot Studio agent with deterministic Power Automate flows underneath.
    • Is the workflow narrow, predictable, high volume? Raw Power Automate. No agent. No reasoning layer. Just a flow.
    • Is the workflow heavy on judgment with messy unstructured inputs? Reasoning model in the orchestration layer, not the response layer. I covered this in my post on Claude as an orchestration brain.
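
    The four questions above can be written down as a small function. This is a rough sketch: the `Workflow` fields and the `Altitude` names are hypothetical labels for this list, and treating the list order as precedence (first match wins) is an assumption rather than a rule stated here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Altitude(Enum):
    DOMAIN_PLATFORM = "domain-shaped platform, customised on top"
    COPILOT_STUDIO = "Copilot Studio agent over deterministic Power Automate flows"
    RAW_FLOW = "raw Power Automate flow, no agent"
    REASONING_ORCHESTRATOR = "reasoning model in the orchestration layer"


@dataclass(frozen=True)
class Workflow:
    # Hypothetical yes/no answers to the four questions in the list.
    has_domain_platform: bool = False
    generic_cross_system: bool = False
    narrow_predictable_high_volume: bool = False
    judgment_heavy_unstructured: bool = False


def pick_altitude(w: Workflow) -> Optional[Altitude]:
    """Check the questions in list order and return the first match."""
    if w.has_domain_platform:
        return Altitude.DOMAIN_PLATFORM
    if w.generic_cross_system:
        return Altitude.COPILOT_STUDIO
    if w.narrow_predictable_high_volume:
        return Altitude.RAW_FLOW
    if w.judgment_heavy_unstructured:
        return Altitude.REASONING_ORCHESTRATOR
    return None  # none of the four questions matched; the list does not cover this case
```

    The value of writing it down is less the code than the forced ordering: a team has to answer "is there a platform that already models this?" before reaching for the tool it knows.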

    Picking the wrong altitude is the most expensive mistake I see. Discovery is interesting precisely because it adds a new altitude that did not exist in the Microsoft stack before. R&D teams who would have been forced into Copilot Studio now have a layer that fits their work natively.

    What I Would Watch For Next in the Microsoft Agent Stack

    Discovery is the canary. R&D is just the first vertical because Microsoft has obvious customers there and the workflows are well-understood. The pattern will repeat. I would expect domain-shaped agent layers for clinical workflows, manufacturing operations, financial close, regulatory review. Each one will sit above the general-purpose Copilot stack and offer the same trade: less flexibility, much faster time to a working system.

    The thing I am watching is interoperability. Can a domain platform like Discovery call out to a Copilot Studio agent for a side task? Can a Power Automate flow trigger a Discovery workflow? If yes, the stack becomes composable and the architectural decisions get genuinely interesting. If no, we end up with another round of silos with their own latency problems and integration debt.

    For now, the practical move is to stop treating Copilot Studio as the universal hammer. In my experience, the teams who consistently ship working automations are the ones who match the tool to the shape of the work. Discovery just made that decision a little more interesting.

    Frequently Asked Questions

    What is Microsoft Discovery and how does it differ from Copilot Studio?

    Microsoft Discovery is a purpose-built agent platform designed specifically for research and development workflows, not a general-purpose copilot tool. Unlike Copilot Studio, it comes with domain-specific primitives like scientific data structures and simulation orchestration built in, so teams spend far less time engineering workarounds for R&D-specific tasks.

    How does Microsoft Discovery's agentic R&D platform improve research and development workflows?

    Because the platform already understands R&D concepts like hypothesis tracking, candidate scoring, and simulation runs, agents do not need lengthy prompts or custom-built logic to handle them. This reduces build time significantly compared to trying to model the same workflows on a generic agent platform.

    When should I choose a domain-specific agent platform over a generic one like Copilot Studio?

    A domain-specific platform makes sense when your workflows map closely to the vertical it was designed for, since the built-in primitives cut build time and reduce fragility. If your use case is too broad or does not fit the platform shape, a general-purpose tool with custom configuration will give you more flexibility.

    Why do generic agentic workflows often fail for complex enterprise use cases?

    General-purpose platforms require teams to manually encode domain logic through prompts, tool definitions, and automation flows, which produces brittle systems that are hard to maintain. When the platform has no native understanding of the domain, complexity accumulates quickly and the resulting agent is difficult to scale or debug.

    This post was inspired by Microsoft Discovery: Advancing agentic R&D at scale via Azure Blog.

  • Claude as an Orchestration Brain Is the Most Interesting Thing Happening in Enterprise AI Right Now

    Most of the conversation around Claude in enterprise automation circles is stuck on the wrong question. People are comparing it to GPT-4o or Gemini as a text generator, debating which one writes better emails or summarises documents more accurately. That framing completely misses what makes Claude genuinely interesting for enterprise automation orchestration right now.

    The practitioners I talk to who are getting real results are not using Claude as a chatbot. They are using it as the reasoning layer that decides what to do next in a multi-step, stateful workflow. That is a different problem than answering a question, and it changes everything about where Claude fits in your architecture.

    The chatbot framing is getting in the way

    When a team says they want to “add Claude” to something, the default mental model is a chat interface. User sends message, model replies. Maybe it calls a tool or two. That is not orchestration. That is a smarter input box.

    Orchestration is what happens when you need a model to receive a complex goal, break it into sequenced steps, call different tools at different points, evaluate intermediate results, and decide whether to continue, retry, or escalate. The model is not answering a question. It is managing execution across a process that has state, has branching conditions, and has consequences if it goes wrong.
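
    The loop described above can be sketched in a few lines of Python. Treat this as an illustration under assumptions: `decide` is a stand-in for the reasoning model, the four action names are invented for the example, and `scripted_decide` is a hard-coded stub so the sketch runs without any model or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# (action, tool name). Hypothetical actions: "call", "retry", "escalate", "done".
Decision = Tuple[str, Optional[str]]


@dataclass
class RunState:
    goal: str
    history: List[Tuple[str, str]] = field(default_factory=list)  # (tool, result) so far


def orchestrate(
    goal: str,
    decide: Callable[[RunState], Decision],
    tools: Dict[str, Callable[[RunState], str]],
    max_steps: int = 10,
) -> Tuple[str, RunState]:
    """Ask `decide` what to do next until it finishes, escalates, or the step budget runs out."""
    state = RunState(goal)
    for _ in range(max_steps):
        action, tool_name = decide(state)
        if action == "done":
            return "done", state
        if action == "retry" and state.history:
            tool_name = state.history[-1][0]  # repeat the most recent tool
        if action not in ("call", "retry") or tool_name not in tools:
            return "escalated", state
        state.history.append((tool_name, tools[tool_name](state)))
    return "escalated", state  # budget spent without reaching "done"


def scripted_decide(state: RunState) -> Decision:
    """Stub for the model: fetch, then route, retry one failure, escalate on the second."""
    if not state.history:
        return ("call", "fetch_request")
    last_tool, last_result = state.history[-1]
    if last_result == "error":
        failures = sum(1 for _, result in state.history if result == "error")
        return ("retry", None) if failures < 2 else ("escalate", None)
    if last_tool == "fetch_request":
        return ("call", "route_request")
    return ("done", None)
```

    The point of the shape is that state, branching, and the stop conditions live in the loop, not in a single prompt and reply. Swapping the stub for a real model call changes `decide` and nothing else.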

    I wrote about this problem directly in the post on agentic workflows. The LLM is not the agent. The LLM is the reasoning layer. If you treat them as the same thing, you end up bolting a model onto the response step of what is really just a structured flow. That is not orchestration. That is decoration.

    What makes Claude specifically interesting for orchestration logic

    Two things stand out when I look at how Claude behaves in multi-step contexts compared to other models at similar capability levels.

    First, instruction following under load. When you give Claude a detailed system prompt with conditional logic, constraints, tool-use rules, and output format requirements, it holds those instructions across a long session more reliably than most alternatives. With other models I have tested, instruction drift starts showing up once you push past a few thousand tokens of context. Claude handles longer, more complex prompts without silently dropping constraints mid-execution. For orchestration, where the system prompt is essentially your process logic written in natural language, that matters a lot.

    Second, the extended context window is not just about volume. It is about statefulness. A workflow that processes a contract, then a set of approval records, then a policy document, then makes a decision that references all three needs a model that can hold all of that in scope simultaneously. Losing context partway through an orchestration run means the model makes decisions with incomplete information. It does not know it has incomplete information. It proceeds confidently anyway. I have seen exactly this failure mode in Copilot Studio agents, where silent context loss leads to confident-sounding responses for tasks that were never properly evaluated.

    Where I would actually slot this into a Power Platform architecture

    I would not replace the existing orchestration layer in a Power Automate flow with a Claude prompt. That is not the use case. Power Automate is still the right place for deterministic, sequential steps with connectors, triggers, and error handling you can inspect.

    Where Claude earns its place is in the decision layer that sits above or between those steps. Think of a workflow that processes incoming requests, where each request has variable structure, ambiguous intent, and routing logic that depends on context that changes week to week. A hard-coded set of conditions in Power Automate will break the moment the business logic shifts. A Claude orchestration layer that reads the request, evaluates the current context loaded from Dataverse, and decides which downstream flow to invoke handles that variability without you rewriting conditions every time.

    In practice, I would build it as a Copilot Studio agent backed by Claude through a custom connector or direct API call, where Claude handles the reasoning and routing logic and Power Automate handles execution of the discrete steps. The agent decides. The flows act. The separation matters because it keeps your execution logic testable and your reasoning logic flexible. Before wiring any of this together, it is also worth auditing what adding Copilot to an existing app actually changes versus what it just surfaces differently.
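
    Here is a minimal sketch of "the agent decides, the flows act", assuming a registry of named flows. The flow names, the `dispatch` function, and `stub_chooser` are all hypothetical. The chooser is a keyword match standing in for the model call, and no real Copilot Studio, Power Automate, or Anthropic API is used.

```python
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]

# Hypothetical registry: each name maps to a deterministic handler standing in
# for a Power Automate flow. The handlers can be tested without any model.
FLOWS: Dict[str, Callable[[Request], str]] = {
    "refund_flow": lambda req: f"refund case opened for {req['id']}",
    "access_flow": lambda req: f"access ticket opened for {req['id']}",
}
FALLBACK = "human_review"


def dispatch(
    request: Request,
    context: Dict[str, str],
    choose_flow: Callable[[Request, Dict[str, str], List[str]], str],
) -> Tuple[str, str]:
    """Let the reasoning layer pick a flow, but only execute names in the registry."""
    choice = choose_flow(request, context, sorted(FLOWS))
    if choice not in FLOWS:
        # The model's answer is treated as untrusted input: unknown names go to a person.
        return FALLBACK, f"{request['id']} queued for human review"
    return choice, FLOWS[choice](request)


def stub_chooser(request: Request, context: Dict[str, str], allowed: List[str]) -> str:
    """Stand-in for the reasoning model: a keyword match, purely for the demo."""
    text = request["text"].lower()
    if "refund" in text:
        return "refund_flow"
    if "access" in text:
        return "access_flow"
    return "something_the_model_made_up"
```

    Because `dispatch` only runs registered flows, a bad routing decision degrades to a human review queue instead of an arbitrary action, and each flow stays testable on its own.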

    The governance piece from the post on enterprise Power Platform applies here too. Calling an external Anthropic API endpoint means your orchestration reasoning is leaving the tenant. That is an audit trail split and a DLP conversation you need to have before you build, not after.

    The honest constraints before you redesign anything

    Claude is not a free variable. Longer context windows mean higher token costs per run, and orchestration workflows that run hundreds of times a day will surface that quickly in billing. Model latency at high context volumes is also real. If your process requires sub-second decisions, this is not your tool.
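
    The cost side is simple arithmetic, and it is worth running before you commit. In the sketch below, the function name and every number in the example calls are placeholders, not real pricing or measured token counts. Substitute your provider's current per-token rates and your own logs.

```python
def daily_cost_usd(
    runs_per_day: int,
    input_tokens: int,
    output_tokens: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Daily spend = runs per day x (input cost + output cost) for one run."""
    per_run = (
        input_tokens * usd_per_million_input
        + output_tokens * usd_per_million_output
    ) / 1_000_000
    return runs_per_day * per_run


# Placeholder inputs: same run volume and prices, short context versus long context.
short_context = daily_cost_usd(500, 4_000, 500, 1.0, 5.0)
long_context = daily_cost_usd(500, 60_000, 500, 1.0, 5.0)
print(f"short: {short_context:.2f} USD/day, long: {long_context:.2f} USD/day")
```

    Input tokens are paid on every run, so the context you load per decision, multiplied by run volume, usually dominates the bill. That is the number to estimate first.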

    The other constraint is testability. When your orchestration logic lives in a system prompt rather than a flow diagram, reproducing a failure is harder. The model made a bad routing decision on Tuesday afternoon. Why? You need logging at the prompt level, not just at the action level. Most teams I see building this way have not set that up, and they hit the same silent failure problem I described in the Copilot Studio testing post: everything looks fine until a real user finds the edge case.
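
    Prompt-level logging can be as small as one JSON line per decision. The sketch below assumes you control the code that calls the model. The field names are illustrative, and hashing the system prompt is one option for telling which prompt version produced a given decision.

```python
import hashlib
import io
import json
from datetime import datetime, timezone
from typing import TextIO


def log_decision(
    sink: TextIO,
    system_prompt: str,
    model_input: str,
    raw_output: str,
    decision: str,
) -> dict:
    """Append one JSON line capturing what the model saw, what it said, and what was done."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        # A hash identifies the prompt version without storing the full prompt each time.
        "prompt_sha256": hashlib.sha256(system_prompt.encode("utf-8")).hexdigest(),
        "input": model_input,
        "raw_output": raw_output,
        "decision": decision,
    }
    sink.write(json.dumps(record) + "\n")
    return record


# Demo with an in-memory sink; in practice this would be a file or a log pipeline.
sink = io.StringIO()
log_decision(
    sink,
    system_prompt="Route each request to exactly one flow.",
    model_input="Please refund my order",
    raw_output='{"flow": "refund_flow"}',
    decision="refund_flow",
)
```

    With records like this, "why did it route that way on Tuesday afternoon" becomes a lookup: find the line, replay the same input against the same prompt version, and compare.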

    Claude as an orchestration brain is a genuinely different capability than what most teams are building with today. The question is not whether it is smarter than the last model. The question is whether your architecture is designed to use a reasoning layer at all, or whether you are still just looking for a better chatbot to put at the front of a process that was never designed to be orchestrated.

    Frequently Asked Questions

    What does using Claude for enterprise automation orchestration mean, and how is it different from using Claude as a chatbot?

    Using Claude for enterprise automation orchestration means treating it as the reasoning layer that manages multi-step workflows, rather than as a simple question-and-answer interface. Instead of responding to single prompts, Claude receives a complex goal, breaks it into steps, calls tools, evaluates results, and decides how to proceed. This requires stateful, branching logic that goes well beyond what a chat interface is designed to handle.

    Why does instruction drift matter when using an LLM for workflow orchestration?

    In orchestration, your system prompt acts as the process logic for the entire workflow, so if the model quietly forgets constraints or rules mid-execution, the whole process can break or produce incorrect outcomes. Some models begin losing adherence to instructions as context grows, which is a serious problem in long-running enterprise workflows. Consistency across extended sessions is one of the key reasons practitioners favour certain models for this use case.

    When should I use an LLM as an orchestration layer instead of a traditional workflow tool?

    An LLM-based orchestration layer becomes valuable when your workflow involves conditional reasoning, ambiguous inputs, or decisions that depend on synthesising information from multiple sources rather than following a fixed rule set. If your process logic can be fully mapped in advance and never changes based on context, a traditional workflow tool is likely simpler and more reliable. The LLM adds value where judgment and adaptability are required at execution time.

    How does a large context window improve multi-step enterprise workflows?

    A large context window allows the model to hold all relevant documents, intermediate results, and prior decisions in scope at once, rather than losing earlier information as the workflow progresses. This matters in processes that require a final decision to reference multiple earlier inputs, such as reviewing a contract alongside approval records and a policy document. Losing that context mid-run can lead to decisions that are inconsistent with earlier steps in the same workflow.

  • Agentic Workflows Are Not Just Fancy Automation

    The mistake I keep seeing

    A client comes in. They’ve heard about AI agents. They want to ‘add AI’ to their approval workflow. So they take the existing 10-step Power Automate flow, stick a Copilot Studio agent somewhere in the middle, and call it an agentic workflow.

    It isn’t. It’s just the old process with a chatbot attached.

    This is the most common mistake I see right now, and it’s costing teams time and credibility. The agent becomes a fancy input form. The process stays broken. And when it fails — and it does — everyone blames the AI.

    What actually makes a workflow agentic

    An agentic workflow is not about adding a language model to a flow. It’s about giving the system the ability to reason about what to do next, not just execute a predefined sequence.

    The difference matters. In a traditional flow, you define every branch. Every condition. Every outcome. The machine follows instructions. In an agentic workflow, the agent interprets a goal, decides which tools or actions to use, and adjusts based on what it gets back.

    That requires a fundamentally different design approach. You’re not mapping steps — you’re defining boundaries, tools, and acceptable outcomes.

    Three things that have to change in your process design

    • Stop thinking in sequences. Agentic workflows are goal-driven, not step-driven. Define what done looks like, not every micro-step to get there. If your flow diagram looks like a subway map, you’re still in traditional automation mode.
    • Give the agent real tools, not just data. An agent that can only read a SharePoint list and send an email is not doing much reasoning. It needs to call APIs, query systems, write back to records, trigger sub-flows. Tool design is where most implementations fall apart — people give agents access to everything or nothing. Neither works.
    • Build in failure handling at the goal level. Traditional flows handle errors at the step level — if this action fails, go here. Agentic workflows need you to think about what happens when the agent reaches a dead end, produces a low-confidence result, or loops without resolution. I’ve seen agents spin for 40 iterations on a task that should have escalated to a human after three.
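
    Goal-level failure handling from the last point can be sketched as a guard around the agent loop. This is a minimal sketch with assumed values: the three-iteration budget echoes the example above, the 0.6 confidence threshold is arbitrary, and `StepResult` and `next_step` are hypothetical stand-ins for whatever your agent returns per step.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class StepResult:
    action: str        # what the agent did on this iteration
    confidence: float  # the agent's own estimate, 0.0 to 1.0
    resolved: bool     # whether the goal is now met


def run_with_guardrails(
    next_step: Callable[[int], StepResult],
    max_iterations: int = 3,
    min_confidence: float = 0.6,
) -> Tuple[str, int]:
    """Run the agent, escalating on low confidence, a repeated action, or a spent budget."""
    seen_actions = set()
    for i in range(1, max_iterations + 1):
        step = next_step(i)
        if step.confidence < min_confidence:
            return "escalate: low confidence", i
        if step.resolved:
            return "resolved", i
        if step.action in seen_actions:
            return "escalate: repeated action", i  # the agent is looping
        seen_actions.add(step.action)
    return "escalate: iteration budget spent", max_iterations
```

    Each exit names a reason, so the human who picks up the escalation knows whether the agent was unsure, stuck in a loop, or simply out of attempts.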

    Where this actually works in business processes

    Not everywhere. I want to be direct about that.

    Agentic design makes sense when the process has variability that you cannot fully predict upfront. Invoice exceptions. Complex customer complaints. Multi-system data reconciliation where the right answer depends on context you only know at runtime.

    It does not make sense for processes that are well-defined and stable. If your purchase order approval follows the same 6 steps every time, a standard Power Automate flow is the right tool. Don’t add an agent to it just because you can.

    The teams that get the most out of agentic workflows are the ones who identify a process where exceptions are eating their staff’s time, then let the agent handle the exceptions rather than replacing the whole flow.

    The orchestration layer nobody talks about

    When you start running multiple agents — one for document processing, one for customer communication, one for system updates — you need something coordinating them. This is where I see projects go sideways fast.

    In Copilot Studio and Power Platform, you can build orchestrating agents that hand off to specialist agents. But the handoff logic, context passing, and failure recovery across agents is not something the platform handles automatically. You have to design it. Most tutorials skip this. Then your multi-agent setup breaks in production because Agent B has no idea what Agent A already tried.

    Document your agent boundaries explicitly. What does each agent know? What can it do? What should it never do? Treat it like designing a team of junior staff who are fast and tireless but have no common sense unless you’ve given them the right context.
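
    One way to make those boundaries explicit is to write them as data and pass a shared handoff record between agents. The sketch below is illustrative only: `AgentBoundary` and `Handoff` are hypothetical names, not Copilot Studio constructs, and the platform does not provide this for you.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class AgentBoundary:
    name: str
    knows: FrozenSet[str]     # context the agent is given
    can_do: FrozenSet[str]    # actions it may take
    never_do: FrozenSet[str]  # actions it must not take, even if asked

    def allows(self, action: str) -> bool:
        return action in self.can_do and action not in self.never_do


@dataclass
class Handoff:
    goal: str
    attempts: List[Tuple[str, str, str]] = field(default_factory=list)  # (agent, action, outcome)

    def record(self, agent: AgentBoundary, action: str, outcome: str) -> None:
        """Log an attempt, refusing actions outside the agent's boundary."""
        if not agent.allows(action):
            raise PermissionError(f"{agent.name} may not perform {action}")
        self.attempts.append((agent.name, action, outcome))

    def already_tried(self, action: str) -> bool:
        """Lets the next agent check what earlier agents attempted."""
        return any(tried == action for _, tried, _ in self.attempts)
```

    Passing the `Handoff` along with the task is what stops Agent B from repeating what Agent A already tried, and the boundary check turns "what should it never do" from a document into something enforced.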

    Start smaller than you think you should

    Pick one process. One that has clear exceptions, high manual effort, and a measurable outcome. Build the agent, give it two or three tools, test it against real historical cases before you deploy it anywhere near live data.
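
    Testing against real historical cases can start as a simple replay harness. The sketch below is minimal and its names are hypothetical: `agent` is any callable that maps a case description to an outcome, and each case pairs an input with what actually happened.

```python
from typing import Callable, Dict, List


def replay(agent: Callable[[str], str], cases: List[Dict[str, str]]) -> Dict[str, object]:
    """Run the agent over past cases and report where it disagrees with what actually happened."""
    mismatches = []
    for case in cases:
        got = agent(case["input"])
        if got != case["expected"]:
            mismatches.append(
                {"input": case["input"], "expected": case["expected"], "got": got}
            )
    total = len(cases)
    return {
        "total": total,
        "agreement": (total - len(mismatches)) / total if total else 0.0,
        "mismatches": mismatches,
    }
```

    The agreement rate gives you the measurable outcome, and the mismatch list tells you which exceptions the agent cannot yet be trusted with.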

    The teams that succeed with agentic workflows in 2026 are not the ones with the biggest ambitions. They’re the ones who are rigorous about scope, honest about where the agent is making decisions versus guessing, and fast to pull the agent out of the loop when something looks wrong.

    Agentic is a design philosophy. Apply it where it earns its complexity.