What is a context graph?

A context graph connects the entities in your company's systems (people, accounts, claims, decisions) and the relationships between them, so an AI can walk a path from a question to an answer and show the path it walked.

The shorthand is points, paths, and predictions. Entities are the points, relationships are the paths, and a prediction is what comes back when an agent walks the paths and reports what it found.

The term is young. It shows up in investor essays and vendor decks wearing a different shape each time, which usually means something real is forming and nobody has pinned it down yet.

What a context graph is made of

Start with the nodes. Every entity your systems of record already track becomes one: a candidate in the ATS, an employee in the HRIS, a customer account in the CRM, a claim, a policy, a screening rule, a decision someone made last March. Entities that appear in several systems collapse into a single node. The person who applied in 2021, was hired in 2022, and now leads her territory is one node with one history, even though three databases each hold a third of her story.
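A minimal sketch of that collapse, assuming a normalized email address is a trustworthy shared key. Real entity resolution is usually harder than this, and the record layout, identifiers, and `collapse` helper are all hypothetical:

```python
from collections import defaultdict

# Hypothetical records: the same person seen by three systems, plus one other.
RECORDS = [
    {"system": "ATS", "id": "cand-881", "email": "ana@example.com", "event": "applied 2021"},
    {"system": "HRIS", "id": "emp-4410", "email": "Ana@Example.com", "event": "hired 2022"},
    {"system": "CRM", "id": "owner-77", "email": "ana@example.com", "event": "leads territory"},
    {"system": "ATS", "id": "cand-902", "email": "raj@example.com", "event": "applied 2023"},
]


def collapse(records):
    """Merge records that share a normalized email into one node each."""
    nodes = defaultdict(lambda: {"source_ids": [], "history": []})
    for r in records:
        key = r["email"].strip().lower()  # assumption: email identifies a person
        nodes[key]["source_ids"].append((r["system"], r["id"]))
        nodes[key]["history"].append(r["event"])
    return dict(nodes)


nodes = collapse(RECORDS)
```

Each node keeps the identifiers it was built from, so the single history can still be traced back to the three databases that each held a part of it.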

Then the edges. An edge is a typed, time-stamped relationship: applied to, was interviewed by, reported to, called, churned, approved, overrode. Edges are where the graph earns its keep, because relationships are the one thing enterprise systems refuse to store across their own boundaries. The ATS knows who applied. The HRIS knows who performed. Neither holds the edge between those two facts, and that edge is where every interesting question lives.

Last, provenance. Every edge carries its source: which system, which record, what timestamp, what confidence. Provenance is what separates a context graph from a clever join. When an agent walks from a question to an answer, the provenance on each edge it crossed is the receipt.
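One way to picture an edge and its receipt, as a sketch only. The class names, field names, and sample values below are illustrative assumptions, not a published schema:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Provenance:
    system: str            # which system the fact came from
    record_id: str         # which record inside that system
    observed_at: datetime  # when the source recorded it
    confidence: float      # 0.0 to 1.0


@dataclass(frozen=True)
class Edge:
    source: str      # node id
    relation: str    # typed relationship, e.g. "applied_to"
    target: str      # node id
    provenance: Provenance


def receipt(path):
    """Render the provenance of every edge crossed, in walking order."""
    return [
        f"{e.source} -{e.relation}-> {e.target} "
        f"[{e.provenance.system}:{e.provenance.record_id} "
        f"@ {e.provenance.observed_at.date()} conf={e.provenance.confidence:.2f}]"
        for e in path
    ]


# Hypothetical two-hop path crossing the ATS and the HRIS.
PATH = [
    Edge("person:ana", "applied_to", "req:114",
         Provenance("ATS", "cand-881", datetime(2021, 3, 2), 0.99)),
    Edge("person:ana", "reported_to", "person:lee",
         Provenance("HRIS", "emp-4410", datetime(2022, 1, 10), 0.95)),
]

for line in receipt(PATH):
    print(line)
```

The edges are frozen so that a receipt, once issued, cannot be quietly rewritten.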

Time runs through all of it. Because every edge is stamped, the graph can be queried as of a date: what was known on the day a decision was made, as opposed to what is known now. Audits turn on that distinction. A system that can only answer from the present tense cannot explain a decision from last spring.
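An as-of query can be sketched as a filter on the stamp. The tuple layout and sample edges here are hypothetical, and a production graph would likely track validity intervals as well as a single recorded date:

```python
from datetime import date

# Hypothetical edges: (source, relation, target, recorded_on)
EDGES = [
    ("person:ana", "applied_to", "req:114", date(2021, 3, 2)),
    ("person:ana", "hired_into", "role:agent", date(2022, 1, 10)),
    ("person:ana", "leads", "territory:west", date(2024, 6, 1)),
]


def edges_as_of(edges, as_of):
    """Return only the edges that had been recorded on or before `as_of`."""
    return [e for e in edges if e[3] <= as_of]


# What the graph knew in mid-2022, before the territory lead edge existed.
known_then = edges_as_of(EDGES, date(2022, 6, 1))
```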

How one gets built

The graph database is the easy part. The hard part is agreeing on what the fields mean. Our approach is a canonical glossary of schema definitions and templated connectors for the systems enterprises already run: Workday, Salesforce, SAP SuccessFactors, Oracle. An AI layer matches the fields it discovers against the glossary and attaches a confidence band to every match. Anything uncertain goes to a human to verify and accept. Nothing uncertain maps silently, and a field that gets renamed or drifts stops flowing instead of being guessed at.
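The routing rule can be sketched as follows. Plain string similarity from Python's `difflib` stands in for the AI matcher here, which is a deliberate simplification, and the glossary entries, the 0.9 threshold, and the function names are assumptions made for illustration:

```python
from difflib import SequenceMatcher

GLOSSARY = ["employee_id", "hire_date", "termination_date"]  # hypothetical canonical fields
ACCEPT_AT = 0.9  # hypothetical cutoff for the high-confidence band


def normalize(name):
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def match_field(raw_name, glossary=GLOSSARY):
    """Score a discovered field against the glossary and assign a band."""
    cleaned = normalize(raw_name)
    scored = [(SequenceMatcher(None, cleaned, g).ratio(), g) for g in glossary]
    score, best = max(scored)
    band = "accept" if score >= ACCEPT_AT else "review"
    return {"field": raw_name, "candidate": best, "score": round(score, 2), "band": band}


def build_mapping(raw_fields):
    """Map confident matches; queue everything else for a human."""
    mapping, review_queue = {}, []
    for raw in raw_fields:
        m = match_field(raw)
        if m["band"] == "accept":
            mapping[raw] = m["candidate"]
        else:
            review_queue.append(m)  # never mapped silently
    return mapping, review_queue


mapping, review_queue = build_mapping(["Hire Date", "dt_of_hire"])
```

Only fields present in `mapping` would flow. A renamed column no longer matches its old entry, so it lands in the review queue and stops flowing until a person accepts it.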

That discipline is what keeps the provenance honest. A graph whose edges were inferred by an unsupervised matcher is a graph whose receipts you cannot trust.

Context graph vs vector database

A vector database retrieves lookalike text. You embed your documents, embed the question, and get back the chunks that sit closest in embedding space. For plenty of work that is enough. Ask a model to summarize a policy document and similarity search finds the right pages.

It breaks on questions whose facts do not resemble each other. "Which screening rule cost us the most revenue last year?" has no chunk that contains it. The answer lives in a chain: a rule in one system rejected candidates, some of them were hired anyway through exceptions, the exceptions out-produced the rule's survivors, and the gap has a dollar value sitting in a third system. No two links in that chain sound alike, so no similarity search will assemble it.

A context graph answers by traversal. Start at the rule, walk to the candidates it rejected, walk to the exceptions, walk to their production numbers, aggregate. The result returns with the chain of hops that produced it, and every hop names its source.
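A toy version of that walk, under stated assumptions: the edge list, relation names, and dollar figures are invented, and the sketch totals only what the exception hires produced, leaving out the comparison against the rule's survivors:

```python
# Hypothetical edges: (source, relation, target, source_system)
EDGES = [
    ("rule:R7", "rejected", "cand:1", "ATS"),
    ("rule:R7", "rejected", "cand:2", "ATS"),
    ("rule:R7", "rejected", "cand:3", "ATS"),
    ("cand:1", "hired_by_exception", "emp:11", "HRIS"),
    ("cand:2", "hired_by_exception", "emp:12", "HRIS"),
    ("emp:11", "produced", 120000, "CRM"),
    ("emp:12", "produced", 80000, "CRM"),
]


def follow(edges, start_nodes, relation):
    """All edges of one relation type leaving any node in start_nodes."""
    return [e for e in edges if e[0] in start_nodes and e[1] == relation]


def exception_revenue(edges, rule):
    """Walk rule -> rejected -> exception hires -> production, then aggregate."""
    trail = []
    frontier = {rule}
    for relation in ("rejected", "hired_by_exception", "produced"):
        hops = follow(edges, frontier, relation)
        trail.extend(hops)                 # keep every hop, with its source
        frontier = {h[2] for h in hops}
    total = sum(h[2] for h in trail if h[1] == "produced")
    return total, trail


total, trail = exception_revenue(EDGES, "rule:R7")
```

The function returns the number together with the hops that produced it, so the answer and its derivation never travel separately.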

In practice the two are complements. An embedding is a good way to find your entry point into the graph. The reasoning happens on the edges.

Context graph vs knowledge graph

Knowledge graphs are old technology in the best sense. Google shipped one in 2012, and pharma companies and intelligence agencies have run entity-relationship ontologies for decades. If you have built one, the bones of a context graph will look familiar. Three things are different.

What goes in. A knowledge graph holds an organization's curated facts. A context graph also holds its operational exhaust: decisions, overrides, exceptions, and outcomes enter as first-class nodes alongside the entities. What the company decided, and what happened next, become structure you can traverse. That is what lets the graph answer questions about its own judgment, which no ontology of facts can do.

Who reads it. Knowledge graphs were built for analysts running queries on demand. A context graph is built for agents reasoning over it in the background, and that forces governance into the structure itself. Which agent may see which edge is a property of the edge. Every traversal can be logged. Microsoft Research's GraphRAG made the retrieval half of this case, showing that a model retrieving over graph structure outperforms flat-chunk retrieval on questions whose facts are dispersed across a corpus. The governance half is what regulated enterprises require on top, because the reader is no longer an analyst with intuition but an agent with authority.
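Edge-level governance might look like the sketch below, where the permission lives on the edge and every attempted read is logged. The agent names, edge fields, and `visible_edges` helper are hypothetical:

```python
# Hypothetical edges, each carrying the set of agents allowed to read it.
EDGES = [
    {"src": "emp:11", "rel": "reported_to", "dst": "emp:2",
     "allowed": {"hr_agent", "sales_agent"}},
    {"src": "emp:11", "rel": "received_rating", "dst": "rating:4",
     "allowed": {"hr_agent"}},
]

ACCESS_LOG = []


def visible_edges(agent, node, edges=EDGES, log=ACCESS_LOG):
    """Return the edges leaving `node` that `agent` may see, logging each check."""
    out = []
    for e in edges:
        if e["src"] != node:
            continue
        allowed = agent in e["allowed"]
        log.append((agent, e["src"], e["rel"], "allowed" if allowed else "denied"))
        if allowed:
            out.append(e)
    return out
```

Denied reads are logged as well as allowed ones, on the assumption that an auditor will want to see what an agent tried to reach, not only what it reached.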

How fresh it stays. An ontology gets curated on a schedule. A context graph updates as the operational systems change, because an agent reasoning over last quarter's graph is reasoning about a company that no longer exists.

The path is the evidence

When an AI system recommends something consequential, a hire or an escalated claim, the first question a risk officer asks is some version of "show me how you got there." A bare probability fails that conversation. A citation list fails it too, because citations show what the model read and say nothing about how it reasoned.

A traversal survives the conversation. The path through the graph is the derivation: these records, joined by these relationships, from these systems, at these timestamps, produced this recommendation. Log the traversal and you have a Decision Trace, a queryable record of what happened, where, why, what the reasoning was, and what input any human gave. Regulated workflows require a second signer before anything executes.
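As a rough illustration of the record and the second-signer gate, and not the published Decision Traces format, a trace could be modeled like this. The field names and the one-approval default for unregulated workflows are assumptions:

```python
from dataclasses import dataclass, field


@dataclass
class DecisionTrace:
    recommendation: str
    path: list            # hops crossed, each naming its source
    reasoning: str
    regulated: bool = False
    approvals: list = field(default_factory=list)  # (signer, note) pairs

    def sign(self, signer, note=""):
        """Record a human approval; the same person cannot sign twice."""
        if signer in {s for s, _ in self.approvals}:
            raise ValueError("each signer may approve only once")
        self.approvals.append((signer, note))

    def can_execute(self):
        """Regulated traces need two distinct signers, others need one."""
        required = 2 if self.regulated else 1
        return len(self.approvals) >= required


# Hypothetical usage.
trace = DecisionTrace(
    recommendation="escalate claim 5521",
    path=[("claim:5521", "filed_under", "policy:88", "claims-system")],
    reasoning="policy has three prior escalations",
    regulated=True,
)
trace.sign("adjuster.kim", "agree")
ready_after_one = trace.can_execute()
trace.sign("supervisor.ortiz", "confirmed")
ready_after_two = trace.can_execute()
```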

Compliance teams have been asking vendors for this property for years, mostly without a word for it. They do not want a better model. They want an answer that still holds up in front of an auditor years after the decision was made, and a score gives them nothing to re-walk.

A context graph over talent systems

Talent is a clean test of the idea because the relationships are scattered across more systems than almost any other function. The CRM holds call transcripts, the richest record of how producers perform in the field. Performance data, what happened after the hire, sits in the HRIS. The ATS holds candidate records, who applied and what survived screening. Connect the three into one graph and you hold the causal chain from application to revenue, a chain no single system has ever been able to see.

A Talent Context Graph is a context graph built over talent systems. Same structure, narrower domain. The full node and edge taxonomy, what becomes a node, what becomes an edge, what a year of accumulated decisions lets you query, is laid out in the applied piece, and I will not repeat it here.

The structure holds in production. At a Fortune 500 insurance carrier, the graph spans four years of production data covering 10,765 agents, with 850,000+ applicants scored over that period. Every recommendation that surfaces from it ships with its trail. The methodology behind the trail is published on arXiv: Decision Traces.

Where the graph sits in the stack

The Nodes stack has three parts: context graph, context retrieval, proactive agents. The graph is the bottom layer, the one this piece defined. Retrieval assembles the slice of the graph an agent needs for a given decision, in the right structure and order. The agents reason over that slice continuously, and when one finds something worth doing, it drafts a cross-system workflow with the cost of action and the cost of inaction attached. A human approves, edits, or declines it. Then the system acts.

The strategic argument lives one level up: model quality stopped being the bottleneck, and the durable advantage has moved to the context layer. That case is made in the flagship piece. This one had a smaller job. When someone asks what a context graph is, the answer is points, paths, and predictions, with a receipt on every edge.

Saad Bin Shafiq is the founder of Nodes, serving data-sensitive enterprises. Methodology: Decision Traces.