Conversational Flows in Agentic AI: Lessons from Running This at Scale

Originally published on www.linkedin.com

CrewAI recently announced the experimental feature Conversational Flows as a way to balance determinism with AI-driven reasoning in agentic systems (announcement). It’s a design problem worth taking seriously, and I want to share what we’ve learned building and running exactly this kind of system in production with enterprise customers across banking, telco, and other regulated industries.

The actual problem

A conversational AI agent needs to handle open-ended input: a customer message that could mean a dozen different things, requiring context from multiple systems, potentially triggering compliance-sensitive actions. You can’t pre-enumerate every branch. But you also can’t hand full autonomy to an LLM for a process that touches things like approving a loan or issuing an insurance policy.

The instinct to introduce flow control into agent architectures is right. The question is what that structure should look like for a production-grade agent, meaning one that runs millions of cases in regulated environments.

How BPMN fits into this

BPMN (Business Process Model and Notation) is an ISO standard (ISO/IEC 19510) for process models that are executable, auditable, and readable by both engineers and business stakeholders. If you associate it with heavyweight enterprise BPM suites from the 2000s, that’s a fair reaction to the tooling of that era, not to the notation itself. The tooling has caught up.

Camunda has been running BPMN-based process execution since 2013. That history matters here: we didn’t build a new agent runtime and retrofit process concepts onto it. The agent layer sits on top of infrastructure that has been handling millions of concurrent instances in the world’s most critical core business processes — including durable state and failure recovery in production for over a decade.

Agentic orchestration: adding conversational agents to BPMN

The way we model conversational agents in Camunda is through a BPMN construct called the ad-hoc subprocess. This is the boundary between deterministic process logic and AI reasoning.

Here’s an example loan support agent, opened in the Camunda Modeler:

The loan support agent in Camunda Modeler. The ad-hoc subprocess (purple star) contains the agent’s tools. The LLM, system prompt, and model configuration are visible in the properties panel.

It helps to name the parts, because “agent” means different things to different people:

  • The agent is the ad-hoc subprocess (the purple star). It’s the reasoning space.
  • The LLM is configured in the properties panel: model provider, token limits, temperature. Here it’s Claude Sonnet.
  • The system prompt is plain text, visible right in the modeler. Nothing hidden behind the scenes.
  • The tools are the boxes inside: pre-configured BPMN activities the agent can invoke, such as asking the customer, querying a knowledge base, consulting a loan specialist, loading available products, calculating repayments and affordability, or kicking off a full consumer loan application as its own subprocess.

Inside the subprocess, the LLM reasons over these tools and decides what to invoke and in what order. Outside it, the process is deterministic. The agent reasons, the tools act, and the process orchestrates both.
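
The division of labor described here can be sketched in a few lines of Python. This is a toy model, not Camunda's API: `AdHocSubprocess`, `scripted_chooser`, and the two tool functions are hypothetical names, and the chooser is a scripted stand-in for the LLM. What it shows is that the chooser only returns a tool name, while the surrounding loop decides whether that tool exists and runs it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical tool signature: takes the case context, returns an updated copy.
Tool = Callable[[dict], dict]


@dataclass
class AdHocSubprocess:
    """Toy stand-in for the reasoning space: a fixed tool set plus a chooser."""
    tools: Dict[str, Tool]
    # Stand-in for the LLM: returns the next tool name, or None when finished.
    choose: Callable[[dict, List[str]], Optional[str]]
    max_steps: int = 10
    trace: List[str] = field(default_factory=list)

    def run(self, context: dict) -> dict:
        for _ in range(self.max_steps):
            name = self.choose(context, sorted(self.tools))
            if name is None:
                break
            if name not in self.tools:
                # The chooser can only select; it cannot invent new tools.
                raise ValueError(f"tool not available: {name}")
            context = self.tools[name](context)
            self.trace.append(name)
        return context


def load_products(ctx: dict) -> dict:
    return {**ctx, "products": ["personal loan", "car loan"]}


def calculate_repayment(ctx: dict) -> dict:
    # Interest-free split, for illustration only.
    return {**ctx, "monthly": round(ctx["amount"] / ctx["months"], 2)}


def scripted_chooser(ctx: dict, available: List[str]) -> Optional[str]:
    # A real agent would ask an LLM here; this script keeps the example runnable.
    if "products" not in ctx:
        return "load_products"
    if "monthly" not in ctx:
        return "calculate_repayment"
    return None


agent = AdHocSubprocess(
    tools={"load_products": load_products, "calculate_repayment": calculate_repayment},
    choose=scripted_chooser,
)
result = agent.run({"amount": 12000, "months": 24})
print(agent.trace, result["monthly"])
```

Removing an entry from `tools` takes that capability away from the agent without touching the chooser, which mirrors removing a tool from the ad-hoc subprocess in the model.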

Some tools carry their own deterministic logic. Look at the “Send message” tool: before anything reaches the customer, it routes through an “Approve content” step with an explicit approved-or-rejected decision. That approval is not a polite request in the prompt. It’s a structural step the agent cannot skip.

The “Send message” tool contains a deterministic approval flow. The agent cannot bypass this step — it is enforced by the process, not by the prompt.

This is the critical architectural point: every tool the agent can invoke is itself a BPMN activity. The LLM selects the tools; the process executes them. Compliance checkpoints, approval requirements, and audit logging are not instructions passed to the model; they are structural constraints in the process definition. You can remove a tool from the agent’s reach entirely, or wrap it in an approval gate, without changing a single word of the agent’s instructions.
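
To make "structural, not prompt-based" concrete, here is a minimal Python sketch of an approval gate. It is an illustration under assumptions, not how Camunda implements the step: `with_approval`, `deliver`, and the `no_guarantees` rule are hypothetical, and in the real model the approval would be a BPMN activity rather than a function.

```python
from typing import Callable, List

sent: List[str] = []  # stands in for the customer channel


def deliver(text: str) -> str:
    sent.append(text)
    return "sent"


def with_approval(
    action: Callable[[str], str], approve: Callable[[str], bool]
) -> Callable[[str], str]:
    """Return a tool that runs `action` only if `approve` accepts the content."""
    def gated(text: str) -> str:
        if not approve(text):
            return "rejected"
        return action(text)
    return gated


def no_guarantees(text: str) -> bool:
    # Hypothetical content rule; a real gate might be a human task or a policy check.
    return "guaranteed" not in text.lower()


# The agent is only ever handed the gated tool, never `deliver` itself.
agent_tools = {"send_message": with_approval(deliver, no_guarantees)}

outcomes = [
    agent_tools["send_message"]("Your loan is guaranteed!"),
    agent_tools["send_message"]("We received your application."),
]
print(outcomes, sent)
```

Because the agent never holds a reference to `deliver`, no wording in a prompt or a customer message can route around the check.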

For a working, runnable version of this pattern, see the Tech Helper Agent tutorial on GitHub and the accompanying step-by-step blog post.

What changes at production scale

A few things become hard problems at scale:

State persistence across waiting periods. A banking support case might involve waiting three days for a bureau reply or a human reviewer to become available. The agent can’t hold that state in memory. In Camunda’s engine, state is persisted at every step. When the external reply arrives, the process resumes exactly where it stopped, with full context intact. You can run millions of instances concurrently without the waiting ones consuming compute resources.
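
The idea of parking a case and resuming it later can be sketched as follows. This is a simplified illustration, not Camunda's engine: the JSON-file store, the `advance` function, and the step names are all assumptions made for the example. The point is that state is written after every step, so the second call needs nothing from the first call's memory.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

STEPS = ["collect_documents", "request_bureau_report", "assess_affordability"]
WAIT_AFTER = "request_bureau_report"  # hypothetical step that waits on an external reply


def advance(store: Path, case_id: str, reply: Optional[dict] = None) -> dict:
    """Run steps until the next wait point, saving state after every step."""
    path = store / f"{case_id}.json"
    if path.exists():
        state = json.loads(path.read_text())
    else:
        state = {"next": 0, "data": {}, "waiting": False}
    if state["waiting"]:
        if reply is None:
            return state  # still parked; no work to do
        state["data"].update(reply)
        state["waiting"] = False
    while state["next"] < len(STEPS):
        step = STEPS[state["next"]]
        state["data"][step] = "done"
        state["next"] += 1
        state["waiting"] = step == WAIT_AFTER
        path.write_text(json.dumps(state))  # persist before moving on
        if state["waiting"]:
            break
    return state


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    first = advance(store, "case-42")  # runs two steps, then parks
    # Days later, possibly on a different worker, the bureau reply arrives:
    resumed = advance(store, "case-42", {"bureau_score": 710})

print(first["waiting"], resumed["next"], resumed["data"]["bureau_score"])
```

While the case is parked, nothing is running on its behalf; only the stored record exists.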

Failure recovery. If a tool call fails, the engine retries automatically. If the retries are exhausted, it compensates, unwinding completed steps cleanly rather than leaving the case half-done. This is table stakes for production use, and most agent frameworks don’t provide it.
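
A minimal sketch of the retry-then-compensate pattern, assuming each step is paired with an undo action. The function `run_with_recovery` and the step names are hypothetical, and details a real engine would handle (backoff between attempts, persisted retry counters) are left out.

```python
from typing import Callable, List, Tuple

# (name, action, compensation) - the compensation undoes a completed action.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_recovery(steps: List[Step], max_attempts: int = 3) -> List[str]:
    """Run steps in order; retry failures, then undo completed steps in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        for attempt in range(1, max_attempts + 1):
            try:
                action()
                log.append(f"{name}: ok (attempt {attempt})")
                completed.append(step)
                break
            except Exception:
                log.append(f"{name}: failed (attempt {attempt})")
        else:
            # Retries exhausted: unwind everything that already succeeded.
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"{done_name}: compensated")
            return log
    return log


ledger: List[str] = []


def reserve_funds() -> None:
    ledger.append("reserved")


def release_funds() -> None:
    ledger.remove("reserved")


def notify_bureau() -> None:
    raise ConnectionError("bureau unavailable")


def noop() -> None:
    pass


log = run_with_recovery(
    [
        ("reserve_funds", reserve_funds, release_funds),
        ("notify_bureau", notify_bureau, noop),
    ],
    max_attempts=2,
)
print(log, ledger)
```

After the second step exhausts its attempts, the reservation from the first step is released, so the case is not left half-done.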

Durable state and failure recovery in practice. The process persists state reliably across waiting periods of any length, and can retry or compensate tool calls that fail.

Observability. Because every step is a process activity, you get a complete execution record automatically: which tools were called, in what order, what the LLM’s context was, where a human intervened. When something goes wrong, or when a regulator asks, the answer is already in the log.
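
As a rough illustration of why the record comes "for free" when every step goes through one execution path, here is a small Python sketch. The `audited` wrapper, the in-memory `audit_log`, and the actor labels are assumptions for the example; they do not reflect the schema of Camunda's log.

```python
import itertools
from typing import Any, Callable, Dict, List

audit_log: List[Dict[str, Any]] = []
_seq = itertools.count(1)


def audited(actor: str, name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap an activity so every call is recorded, whether it succeeds or fails."""
    def wrapper(**kwargs: Any) -> Any:
        entry = {"seq": next(_seq), "actor": actor, "activity": name, "input": kwargs}
        try:
            entry["output"] = fn(**kwargs)
            entry["status"] = "completed"
            return entry["output"]
        except Exception as exc:
            entry["status"] = f"failed: {exc}"
            raise
        finally:
            audit_log.append(entry)
    return wrapper


query_kb = audited("agent", "query_knowledge_base", lambda question: "See section 4.2")
approve = audited("human:reviewer", "approve_content", lambda text: True)

query_kb(question="early repayment fee?")
approve(text="There is no early repayment fee.")

# Example query: where did a human intervene?
human_steps = [e["activity"] for e in audit_log if e["actor"].startswith("human")]
print(human_steps)
```

Nothing in the tools themselves has to remember to log; the record is a property of how they are invoked.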

The Camunda Operate Operations Log — a centralized, queryable audit trail of every user and system action across the process. When a regulator asks what happened, the answer is already here.

None of these are easy to bolt onto an agent framework after the fact. They’re the infrastructure you want to inherit, not build!

Numbers from production

NatWest uses this architecture for fraud investigation agents that handle cases end to end — querying customer data, cross-referencing transaction patterns, escalating to human investigators when warranted. 21 minutes saved per case, with a full audit trail.

Halkbank processes free-format customer orders through document agents: OCR, confidence scoring, routing for human review when confidence is low, automatic execution when it’s high. 50,000 transactions per day, processing time down from 54 seconds to 9 seconds.

EY runs trade exception management agents for discrepancies that no rule-set could anticipate. 7x faster than their previous process.

These aren’t pilot numbers. They’re from agents that have been running in regulated environments for over a year.

On the BPMN scepticism

If you’re building agents with CrewAI or LangChain and thinking “I don’t want to drag heavyweight BPMN into this”, I can understand that. But the design problem Conversational Flows is addressing (how do you enforce structure around an LLM’s reasoning without constraining it entirely) is exactly what BPMN and the ad-hoc subprocess pattern solve. The pattern is battle-tested, available today, and can guarantee governance that survives adversarial inputs, edge cases, and operational incidents at scale.

It also works with CrewAI

Best of all, none of this requires abandoning what you’ve already built. Camunda is not a competing agent framework. Via the MCP and A2A protocols, you can run CrewAI agents inside a Camunda process as sub-agents. The agents you’re building now can be wrapped in an orchestration layer later, without replacing them.

If you’re evaluating what Conversational Flows looks like in a production-hardened form, we’re happy to show you the running system.

camunda.com/orchestrate/agents

Bernd Ruecker
Thoughts on all things orchestration, long-running processes, and developer-friendly automation technology.