What 'agentic' should mean to a buyer

An agentic AI system in an enterprise acts on your behalf, on its own initiative, without waiting for you to ask. It reads continuously across the systems you run, finds work that needs doing, and brings it to you already drafted. That is the definition. Every vendor now claims it, across a dozen different architectures, and most of those architectures do not meet the definition.

So the word still carries a meaning, and the buyer's job is to test the product against it. Two questions do that work. Does this system meet the definition? And is a system that meets the definition the thing you want, or do you want one that only sounds like it does?

Senior people leaders at regulated enterprises are direct about this. Agentic AI should be clear. Theatrically naming agents, branding them individually, giving them personas: that reads as complexity, not capability. The actual question behind the word is whether the system does work or waits to be told to do work.

That distinction is testable. Three properties, and you can check for all three in a demo.

The three properties

A system that earns the word "agentic" proposes work without being asked, carries a signed trace on every action, and waits for human approval before executing anything.

Each property is load-bearing. The absence of any one of them is not a design tradeoff. It is a different category of product.

It proposes. A reactive system answers questions. A chatbot answers questions. A search engine answers questions. An agentic system reads continuously across the data you produce and arrives with something already drafted: a proposed workflow, a flagged decision, a recommended action, with the reasoning attached and the cost of acting versus the cost of waiting both stated. You did not ask. The system found work that needed doing and brought it to you.

This is the proactive-versus-reactive axis, and it is the most important of the three. Proactive means the system is running in the background, continuously, reasoning over the current state of your data, surfacing work that needs human attention. Reactive means it waits for a prompt. The product can be impressive in either mode. The word "agentic" belongs to the first.

In practice, this is the first thing to check in a demo. Ask the vendor to show you something the system surfaced without being prompted. A proposed workflow, a flagged anomaly, a drafted recommendation that arrived on its own. If the demo requires the buyer to initiate each request, the product is a reactive system with a good interface. The word does not fit.

It traces. An agentic system that proposes something without showing its reasoning is asking for trust. In a regulated enterprise, trust is not a governance model.

A trace is not a log. A log records what happened: the timestamp, the action taken, the output produced. A trace records what the system was reading and reasoning over when it decided to act. What data it weighed. What it considered. What it proposed. What the human did with the proposal. Every step, in order, queryable after the fact.

The difference matters to an auditor. A log answers whether the system took an action. A trace answers why the system took the action, on what evidence, and who reviewed it. A governance model that can only answer the first question will not survive the second year of a regulated deployment. The vendor who can pull any decision from the last twelve months and show the full trace in one session is running an architecture built for inspection from day one. The vendor who needs a follow-up call to produce the same record is running governance as a feature layer.
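The gap between the two records is easier to see as data structures. What follows is a minimal sketch, not any vendor's schema: the class names `LogEntry` and `TraceRecord` and every field name are hypothetical, chosen to mirror the questions above (what was read, what was considered, what was proposed, what the human did).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class LogEntry:
    """A log: records that something happened (hypothetical shape)."""
    timestamp: str
    action: str
    output: str


@dataclass
class TraceRecord:
    """A trace: records why it happened and who reviewed it (hypothetical shape)."""
    decision_id: str
    data_read: List[str]       # what the system was reading
    considered: List[str]      # what it weighed before proposing
    proposal: str              # what it proposed
    human_decision: Optional[str] = None   # approved / edited / declined
    reviewer: Optional[str] = None
    steps: List[str] = field(default_factory=list)  # every step, in order

    def add_step(self, description: str) -> None:
        self.steps.append(description)

    def review(self, reviewer: str, decision: str) -> None:
        """Attach the human outcome to the same record as the reasoning."""
        self.reviewer = reviewer
        self.human_decision = decision
        self.add_step(f"{reviewer} {decision} the proposal")

    def audit_answer(self) -> dict:
        """The three things an auditor asks for: why, on what evidence, reviewed by whom."""
        return {
            "why": self.considered,
            "evidence": self.data_read,
            "reviewed_by": self.reviewer,
            "outcome": self.human_decision,
        }
```

A `LogEntry` has no field that could answer the auditor's second question. A `TraceRecord` answers it from the record itself, without a follow-up call.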

It waits. This is the property vendors most often soften. An agentic system that acts before a human approves is an autonomous system, and regulated enterprises are not ready to operate autonomous AI, for structural reasons that will not change this quarter.

The approval gate is what separates a fast governed system from a risky one. The system drafts the workflow. The system prices the action and prices the inaction. A human reads both. Then the human approves, edits, or declines. Then, and only then, the system acts. The speed advantage is in the drafting, not in removing the review.

For regulated workflows, one approval is often not enough. Two humans, two signatures, before the workflow executes in any downstream system. Banks have run payments on dual authorization for decades. AI systems making consequential decisions in regulated environments should run the same control. The second-signer mechanism is the architecture that makes "human in the loop" auditable rather than ceremonial.
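The gate and the second signer can be sketched as a small state machine. This is an illustration under stated assumptions, not a description of any shipping product: the `Proposal` class, the `Status` values, and the default of two signers are all hypothetical names and choices made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Status(Enum):
    DRAFTED = "drafted"
    APPROVED = "approved"
    DECLINED = "declined"
    EXECUTED = "executed"


@dataclass
class Proposal:
    summary: str
    cost_of_acting: float      # the system prices the action...
    cost_of_waiting: float     # ...and prices the inaction
    required_signers: int = 2  # assumption: dual authorization by default
    approvers: Set[str] = field(default_factory=set)
    status: Status = Status.DRAFTED

    def _require_drafted(self) -> None:
        if self.status is not Status.DRAFTED:
            raise ValueError(f"proposal is already {self.status.value}")

    def approve(self, reviewer: str) -> None:
        self._require_drafted()
        self.approvers.add(reviewer)  # a set: one person cannot sign twice
        if len(self.approvers) >= self.required_signers:
            self.status = Status.APPROVED

    def edit(self, new_summary: str) -> None:
        self._require_drafted()
        self.summary = new_summary
        self.approvers.clear()  # signatures on the old text do not carry over

    def decline(self) -> None:
        self._require_drafted()
        self.status = Status.DECLINED

    def execute(self) -> str:
        # The gate: nothing runs downstream until every signature is in.
        if self.status is not Status.APPROVED:
            raise PermissionError("approval gate not satisfied")
        self.status = Status.EXECUTED
        return f"executed: {self.summary}"
```

The design choice worth noticing is that `execute` checks the status itself rather than trusting the caller. The review cannot be skipped by calling the action directly.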

Where the word gets stretched

The version of "agentic" that most vendors sell has agents with names and individual brands. A sourcing agent. A screening agent. A scheduling agent. Six of them, each with a distinct identity, collectively described as an agentic workforce.

The naming is decoration. It tells a buyer nothing about whether the system proposes, traces, or waits. Those properties live in the architecture. The marketing layer cannot produce them.

Senior people leaders say this plainly: what they want is a system that does work inside the workflows they already run, without requiring them to manage a cast of named AI characters. The buyers who run the hardest procurement reviews are looking for mechanisms. They have been through enough vendor presentations to know the difference between a brand and an answer.

The same three properties survive the decoration. A named agent that cannot produce a spontaneous proposal is still reactive. A branded agent whose trace lives in a follow-up call is still ungoverned. None of this means the product is bad. It means the word does not fit.

The architecture behind the test

When all three properties are present, the loop looks the same regardless of industry.

Agents read continuously across the systems of record: the HRIS, the ATS, the CRM. They reason over the current state of the data. They draft workflows, each one carrying the cost of acting and the cost of waiting. A human reads the proposal. Edits it, declines it, or approves it. Then the system acts across the systems it read from, and every action ships with its signed record.
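What "signed record" means in practice depends on the vendor, and the text above does not specify a scheme. One common way to make a record tamper-evident is an HMAC over a canonical serialization of it. The sketch below assumes that approach; the function names `sign_record` and `verify_record`, the record fields, and the shared-key setup are illustrative assumptions, and a production system might use asymmetric signatures instead.

```python
import hashlib
import hmac
import json


def sign_record(record: dict, key: bytes) -> str:
    """Return an HMAC-SHA256 signature over a canonical JSON form of the record."""
    # sort_keys gives a stable byte string, so the same record always signs the same way
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_record(record: dict, signature: str, key: bytes) -> bool:
    """Check a record against its signature using a constant-time comparison."""
    return hmac.compare_digest(sign_record(record, key), signature)


# Usage: any change to the record after signing fails verification.
key = b"demo-key-not-for-production"
record = {"decision_id": "d-1", "action": "update requisition", "approved_by": ["alice", "bob"]}
signature = sign_record(record, key)
tampered = dict(record, approved_by=["alice"])
```

Here `verify_record(record, signature, key)` holds for the original record and fails for `tampered`, which is the property an auditor relies on when a record is pulled months later.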

The orchestration layer is what makes this sustainable at scale. The agents do not form an open population. Thirteen agents, three pillars, one calibrated model, and the interaction graph between them is designed, finite, and inspectable. There is no emergent behavior to discover, because the design leaves no room for it to emerge.

At a Fortune 500 insurance carrier that had spent eighteen months rejecting six prior vendors on architecture, this architecture cleared legal review in 17 days and reached production in 34 days. A sandbox evaluation produces different numbers, because it never meets a real security and compliance review. The governance that produces the 17-day approval and the governance that produces the trace, the approval gate, and the spontaneous proposals are the same thing. One architecture.

The full governance argument is in "What control model lets you move that fast". The council questions that surface architecture rather than governance theater are in "What an AI council should ask".

The test as a procurement tool

Three questions, asked in the room, reveal the architecture.

Ask the vendor to show you something the system proposed without a prompt. If they can, the product is proactive. If they need to set up a demonstration where a human instructs the system to analyze something, the product is reactive, and the label does not fit.

Ask to see the trace behind the proposal. The actual record, queryable by your team, showing the reasoning chain from data to recommendation. If the vendor can produce it in the demo, the trace exists in the architecture. If it requires a follow-up session, it exists in the monitoring stack. Those are not the same architecture, and the difference matters when an auditor shows up with a specific decision to explain.

Ask to see a declined recommendation. An approval gate that never shows a decline is not being used. An approval gate that shows a pattern of edits, declines, and resubmissions is the gate a regulated environment needs. The vendor who cannot show a declined recommendation from production has either built a system where declining is difficult, or built a system where the proposals are not being read.
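If the vendor exports the gate's decision history, the check above takes a few lines. This is a sketch that assumes the history arrives as a flat list of outcome strings; the function name `gate_usage` and the labels "approved", "edited", and "declined" are hypothetical, not a real export format.

```python
from collections import Counter
from typing import Dict, List, Union


def gate_usage(decisions: List[str]) -> Dict[str, Union[int, bool]]:
    """Summarize how an approval gate has actually been used."""
    counts = Counter(decisions)
    total = len(decisions)
    return {
        "total": total,
        "approved": counts["approved"],
        "edited": counts["edited"],
        "declined": counts["declined"],
        # A gate with history but no declines is the warning sign described above.
        "never_declined": total > 0 and counts["declined"] == 0,
    }
```

A result with `never_declined` set does not prove the proposals go unread. It is the prompt for the follow-up question in the room.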

The word "agentic" will continue to mean whatever vendors need it to mean for as long as buyers do not have a test. The three properties are the test. Any vendor willing to be held to all three in a live demo is worth continuing the conversation with. Any vendor who hedges on any one of them is telling you something the pitch deck left out.

Saad Bin Shafiq is the founder of Nodes. Anchor pilot: Fortune 500 insurance carrier, four years of production data, 10,765 agents. Methodology: Decision Traces.
