The Roval AI agent governance framework: 8 pillars for making AI agents knowable, governable and accountable

Traditional AI governance was built for models. Agents need something different.

Traditional AI governance governs models that produce outputs for humans to review. AI agents don’t wait for review. They act, chain decisions, use tools and drift over time. That difference breaks the review-before-action assumption that existing compliance frameworks are built on.

This article lays out eight governance pillars built specifically for agents. They extend what you already have (NIST AI RMF, ISO 42001, EU AI Act compliance) with the controls that agents require: inventory, risk classification, policy enforcement, continuous certification, runtime observability, human-in-the-loop escalation, compliance mapping and lifecycle management.

But first, the scale of the problem.

According to Gravitee’s State of AI Agent Security 2026 report, more than 3 million AI agents are operating within corporations. Only 47.1% are actively monitored. That’s an estimated 1.5 million agents running without oversight, accessing sensitive data and connecting to critical systems with no audit trail.

Microsoft’s Cyber Pulse report puts it differently: 80% of Fortune 500 companies deploy active AI agents, but only 6% have what Microsoft classifies as “advanced” AI security strategies. Three in ten employees admitted to using unsanctioned agents at work.

Gartner predicts 40% of enterprise applications will feature task-specific AI agents by end of 2026, up from less than 5% in 2025. And by 2030, 50% of agent deployment failures will be due to insufficient governance enforcement.

The agents are here. The governance isn’t.

Why existing governance fails for agents

Existing AI governance sits between the model’s output and the action. The model recommends; the human decides. That assumption breaks with agents in four ways.

Agents act autonomously. A traditional LLM generates a response. An agent generates a response, calls an API, writes to a database, sends an email and triggers a downstream workflow, all before a human knows it happened. A World Economic Forum analysis from March 2026 notes that AI agents involve model providers, orchestration platforms, extension developers, enterprises and end users. Accountability is diffuse unless roles are clearly defined.

The World Economic Forum argues that visibility into agent behavior is becoming critical for governance at scale. (Source: World Economic Forum)

Agents chain decisions. A single agent interaction can involve dozens of sequential decisions, each building on the last. When something goes wrong at step 14, tracing accountability back through the decision chain requires a level of observability that most organizations simply don’t have.

Agents use tools. They invoke APIs, query databases, browse the web, execute code and interact with other agents. Each tool invocation is a new risk surface. The OWASP Top 10 for Agentic Applications (2026) identifies tool misuse, goal hijacking and supply chain vulnerabilities as top-tier risks specific to agents.

Agents drift. Model updates, data distribution shifts and evolving business contexts mean agent behavior changes over time, even without anyone modifying the agent. A governance certification from January that isn’t revisited until June is, as Zenity’s Chris Hughes has argued, not governing anything by month three.

Forrester’s 2026 survey of 500 enterprises found that 71% lack a formal governance framework for autonomous agents, even as 64% plan to increase agent autonomy within 12 months.

The 8 pillars of AI agent governance

Governance starts with visibility. Classification drives security. Compliance requires mapping. These three capabilities form the foundation of the Roval AI agent governance framework.

This framework layers on top of your existing AI governance program. If you’ve invested in enterprise GRC and built on NIST AI RMF or ISO 42001, that foundation still matters. The eight pillars extend it to address what agents introduce: autonomy, tool use, multi-agent interaction, dynamic data access and real-time decision-making.

Pillar 1: Agent inventory and discovery

You can’t govern agents you don’t know exist.

Every enterprise has a CMDB for servers. A subscription manager for SaaS. An asset inventory for endpoints. But when it comes to AI agents, most enterprises have a Slack channel and a spreadsheet. Maybe.

A useful agent registry records, for each agent:

Agent identity (unique ID, name, description, version)
Owner and accountability chain
Creation and last modification date
Deployment model (homegrown, endpoint, SaaS-embedded)
Tools and APIs the agent can access
Data sources and classification
Operational status (active, paused, retired, under review)
Framework (LangGraph, CrewAI, Copilot Studio, etc.)
Upstream and downstream dependencies

Agent inventory is the foundation of everything that follows. It means building a comprehensive registry of every agent operating within the enterprise, whether it was sanctioned by IT or spun up by a marketing team on a Friday afternoon using a no-code tool.

This is the “system of record” concept. Just as Vanta became the system of record for compliance and ServiceNow became the system of record for IT service management, enterprises need a centralized registry for AI agents: the CMDB equivalent for AI agents.

Without it, every subsequent governance pillar is built on guesswork.
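
As a rough illustration of what a registry entry could look like in code, the sketch below models the inventory fields as a Python dataclass and adds a lookup for orphaned agents. The class names, field names and the in-memory store are assumptions made for this example, not a Roval schema or API.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    RETIRED = "retired"
    UNDER_REVIEW = "under review"


@dataclass
class AgentRecord:
    """One registry entry. Field names are illustrative, not a published schema."""
    agent_id: str
    name: str
    description: str
    version: str
    owner: Optional[str]        # None marks an agent with no accountable owner
    deployment_model: str       # "homegrown", "endpoint" or "saas-embedded"
    framework: str              # e.g. "LangGraph", "CrewAI", "Copilot Studio"
    status: Status
    created: date
    last_modified: date
    tools: list[str] = field(default_factory=list)
    data_sources: list[str] = field(default_factory=list)
    dependencies: list[str] = field(default_factory=list)


class AgentRegistry:
    """In-memory stand-in for a central system of record (hypothetical)."""

    def __init__(self) -> None:
        self._records: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        # Reject duplicates so each agent has exactly one entry.
        if record.agent_id in self._records:
            raise ValueError(f"duplicate agent_id: {record.agent_id}")
        self._records[record.agent_id] = record

    def orphaned(self) -> list[AgentRecord]:
        """Agents that are not retired but have no owner."""
        return [r for r in self._records.values()
                if r.owner is None and r.status is not Status.RETIRED]
```

A production registry would sit behind a database and a discovery process; the point here is only that every field in the inventory becomes a typed, queryable attribute.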

Video: What's New in Zurich: AI Control Tower Demo (ServiceNow)

Pillar 2: Risk classification and tiering

Not all agents carry the same risk, but most enterprises treat them as if they do.

Once you have an inventory, every agent needs a risk classification. The classification drives everything downstream: what level of oversight it requires, what compliance obligations apply, what monitoring cadence is appropriate.

We recommend a three-dimensional risk model:

Dimension 1: Data sensitivity. What data does the agent access? Public data, internal business data, PII, PHI, financial records or classified information? An agent summarizing public blog posts carries different risk than one processing customer medical records.

Dimension 2: Decision authority. What can the agent do? Read-only agents have limited blast radius. Agents that can execute financial transactions, modify production infrastructure or send external communications carry higher risk. Researchers Engin and Hand call this “process autonomy” in their dimensional governance model. Autonomy isn’t binary but a spectrum that shifts based on context.

Traditional categorical governance frameworks (based on fixed risk tiers, levels of autonomy or human oversight models) are increasingly insufficient on their own.

Dimension 3: Blast radius. If this agent fails catastrophically, what’s the scope of impact? A single user? A team? An entire customer base? A regulated population?

These three dimensions map naturally to four risk tiers:

Tier	Data sensitivity	Decision authority	Blast radius	Example
Critical	Regulated data (PII, PHI, financial)	Autonomous execution of high-impact actions	Organization-wide or customer-facing	Trading agent, clinical decision support
High	Sensitive internal data	Autonomous execution with guardrails	Department or business unit	HR screening agent, procurement automation
Medium	Internal business data	Recommendations with human approval	Team-level impact	Content drafting agent, meeting summarizer
Low	Public data only	Read-only or informational	Individual user	Knowledge base Q&A, internal search

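The EU AI Act’s risk classification (unacceptable, high, limited, minimal) can be mapped to this model, with one caveat: the Act’s “unacceptable” category covers prohibited practices, so it has no equivalent tier here. The compliance mapping section below covers the Act’s classification rules.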
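
The article does not say how the three dimensions combine when they disagree (for example, public data but autonomous high-impact actions). One conservative option, assumed here purely for illustration, is to let the most severe dimension set the tier. The `Level` enum and `classify` function are hypothetical names.

```python
from enum import IntEnum


class Level(IntEnum):
    """Severity scale shared by all three dimensions and by the final tier."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def classify(data_sensitivity: Level,
             decision_authority: Level,
             blast_radius: Level) -> Level:
    # Assumption: the worst dimension wins, so a single critical
    # attribute is enough to place the agent in the critical tier.
    return max(data_sensitivity, decision_authority, blast_radius)


# A Q&A bot over public data, read-only, affecting one user at a time.
kb_bot = classify(Level.LOW, Level.LOW, Level.LOW)

# An agent on internal data that can execute trades affecting customers.
trader = classify(Level.MEDIUM, Level.CRITICAL, Level.CRITICAL)
```

A weighted score is an alternative design, but a "worst dimension wins" rule is easier to explain to auditors and cannot be gamed by averaging a dangerous capability against two harmless ones.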

Pillar 3: Policy definition and enforcement

Policies written in documents don’t govern agents running in production.

Every agent needs a defined set of behavioral boundaries: what it can do, what it must never do and what requires human approval. These policies must be technically enforced, not just documented.

Policies must be enforced at three layers:

Build-time: Before an agent goes live, it must pass a set of pre-deployment checks. Does it have an owner? Has it been risk-classified? Are its tool permissions scoped to least-privilege? Has it been tested against adversarial inputs? Build-time governance is the “shift-left” equivalent for agents.

Deploy-time: At the moment of deployment, the agent must be certified against its risk tier. Critical and high-risk agents require explicit sign-off. Deployment configuration must match the approved policy. This is the “certification gate.”

Runtime: Once live, the agent’s actions are validated against its policy in real time. If an agent attempts an action outside its approved boundaries (accessing a data source it shouldn’t, invoking a tool it’s not permitted to use or exceeding a financial threshold), the enforcement layer blocks the action and escalates.

The principle is borrowed from infrastructure security: policy as code. If your governance policies can’t be expressed as executable rules, they’re aspirational, not operational.
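
To make "policy as code" concrete, here is a minimal sketch of a runtime check covering the three boundary types named above: tool permissions, data source access and a financial threshold. The `Policy`, `Action` and `evaluate` names are invented for this example; a real deployment would more likely use a dedicated policy engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Approved boundaries for one agent (illustrative structure)."""
    allowed_tools: frozenset[str]
    allowed_data_sources: frozenset[str]
    max_transaction_eur: float


@dataclass(frozen=True)
class Action:
    """An action the agent is about to take."""
    tool: str
    data_source: Optional[str] = None
    amount_eur: float = 0.0


def evaluate(policy: Policy, action: Action) -> tuple[bool, str]:
    """Return (allowed, reason). A False result should block and escalate."""
    if action.tool not in policy.allowed_tools:
        return False, f"tool not permitted: {action.tool}"
    if (action.data_source is not None
            and action.data_source not in policy.allowed_data_sources):
        return False, f"data source not permitted: {action.data_source}"
    if action.amount_eur > policy.max_transaction_eur:
        return False, "financial threshold exceeded"
    return True, "allowed"


policy = Policy(
    allowed_tools=frozenset({"crm_read", "payments"}),
    allowed_data_sources=frozenset({"crm"}),
    max_transaction_eur=500.0,
)
decision = evaluate(policy, Action(tool="payments", amount_eur=900.0))
```

Because the policy is data, the same object can be checked at build time (is it least-privilege?), at deploy time (does the configuration match it?) and at runtime (does this action comply?).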

Pillar 4: Continuous certification

Point-in-time assessments don’t work for systems whose behavior changes continuously.

Traditional governance relies on periodic reviews: quarterly risk assessments, annual audits, point-in-time compliance checks. For AI agents, this cadence is wildly inadequate.

Agents change because models update, data distributions shift, business processes evolve and new tools get added to their toolkit. An agent certified as low-risk in March might be medium-risk by May if someone added a new API integration that gives it access to customer data.

Continuous certification means:

Automated re-assessment when an agent’s configuration changes
Drift detection that flags when agent behavior deviates from its certified baseline
Recertification triggers tied to material changes (new data sources, new tool access, model updates, ownership changes)
Expiration-based certification that automatically expires after a defined period, forcing re-review

This aligns with what Gartner predicts: by 2030, 50% of organizations will use autonomous AI agents to interpret governance policies into machine-verifiable contracts, automating compliance enforcement. The trajectory is clear: continuous, automated certification is the future.
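
Two of the four mechanisms listed above, re-assessment on configuration change and expiration-based certification, are simple enough to sketch. The example fingerprints an agent's configuration with a hash and compares it to the fingerprint stored at certification time. The 90-day validity period, the function names and the configuration shape are assumptions; behavioral drift detection needs runtime data and is out of scope here.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import date, timedelta


def config_fingerprint(config: dict) -> str:
    """Stable hash of an agent's configuration (model, tools, data sources)."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Certification:
    fingerprint: str
    issued: date
    valid_days: int = 90  # assumed review period, not from the article


def needs_recertification(cert: Certification,
                          current_config: dict,
                          today: date) -> list[str]:
    """Return the reasons a re-review is due; an empty list means still valid."""
    reasons = []
    if config_fingerprint(current_config) != cert.fingerprint:
        reasons.append("configuration changed")
    if today > cert.issued + timedelta(days=cert.valid_days):
        reasons.append("certification expired")
    return reasons


certified_config = {"model": "m-1", "tools": ["search"]}
cert = Certification(config_fingerprint(certified_config), date(2026, 3, 1))

# Someone later adds an API integration without re-review.
changed_config = {"model": "m-1", "tools": ["search", "crm_api"]}
reasons = needs_recertification(cert, changed_config, date(2026, 3, 15))
```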

Pillar 5: Runtime observability

Static assessments don’t work for systems whose behavior is emergent and context-dependent. Runtime observability means logging what agents do, not just what they were designed to do.

The observability layer captures:

Action logs: Every tool invocation, API call, database query and external communication
Decision traces: The reasoning chain that led to each action, enabling post-hoc assessment
Data access logs: What data the agent accessed, when and what it did with it
Inter-agent communications: For multi-agent systems, the messages and handoffs between agents
Performance metrics: Latency, error rates, retry counts, escalation frequency
Cost tracking: Token usage, API call costs, compute consumption per agent

Article 12 of the EU AI Act requires high-risk AI systems to support automatic recording of events (logs) over the lifetime of the system, in enough detail to support traceability and post-hoc assessment. Building observability into agent architecture from the start is dramatically cheaper than retrofitting.

Audit logs must be queryable by agent, by workflow, by time range, by action type and by escalation status. They should be readable by governance and compliance teams but not modifiable by engineering teams.
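
The query dimensions and the "not modifiable" requirement can be sketched together. In the example below each log entry is hashed together with the previous entry's hash, so any later edit breaks the chain and is detectable. Hash chaining is one possible technique chosen for illustration, not something the framework prescribes, and the `AuditEvent` and `AuditLog` names are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    agent_id: str
    workflow_id: str
    timestamp: datetime
    action_type: str        # e.g. "tool_call", "db_query", "external_email"
    detail: str
    escalated: bool = False


def _digest(prev_hash: str, event: AuditEvent) -> str:
    body = asdict(event)
    body["timestamp"] = event.timestamp.isoformat()
    payload = prev_hash + json.dumps(body, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log; each entry's hash covers the previous entry's hash."""

    def __init__(self) -> None:
        self._entries: list[tuple[AuditEvent, str]] = []

    def append(self, event: AuditEvent) -> None:
        prev = self._entries[-1][1] if self._entries else ""
        self._entries.append((event, _digest(prev, event)))

    def verify(self) -> bool:
        """Recompute the chain; False means an entry was altered."""
        prev = ""
        for event, stored in self._entries:
            if _digest(prev, event) != stored:
                return False
            prev = stored
        return True

    def query(self, *, agent_id: Optional[str] = None,
              workflow_id: Optional[str] = None,
              action_type: Optional[str] = None,
              escalated: Optional[bool] = None,
              start: Optional[datetime] = None,
              end: Optional[datetime] = None) -> list[AuditEvent]:
        def keep(e: AuditEvent) -> bool:
            return ((agent_id is None or e.agent_id == agent_id)
                    and (workflow_id is None or e.workflow_id == workflow_id)
                    and (action_type is None or e.action_type == action_type)
                    and (escalated is None or e.escalated == escalated)
                    and (start is None or e.timestamp >= start)
                    and (end is None or e.timestamp <= end))
        return [e for e, _ in self._entries if keep(e)]
```

In practice the same property is usually achieved with write-once storage or a managed logging service; the chain simply shows that tamper evidence is a testable property rather than a policy statement.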

Pillar 6: Human-in-the-loop escalation

Human oversight at every step doesn’t scale. No human oversight at all is reckless. Blanket “human-in-the-loop” policies are unsustainable. Anthropic’s research on measuring agent autonomy found that 40% of experienced Claude Code users opt for full-auto approval mode, a signal that humans will inevitably reduce oversight when it becomes a bottleneck.

Anthropic's research found that 40% of experienced users opt for full-auto approval mode, a signal that blanket human oversight is unsustainable. (Source: Anthropic, https://www.anthropic.com/research/measuring-agent-autonomy)

The answer isn’t more human oversight or less. It’s adaptive oversight that matches the risk of the action.

Tier	Oversight model	Example actions
Autonomous	No human approval required	Reading internal docs, generating summaries, searching knowledge bases
Human-on-the-loop	Human is notified, can intervene but doesn’t have to approve	Sending internal emails, updating CRM records, scheduling meetings
Human-in-the-loop	Action queued until human approves	External communications, data exports, financial transactions above threshold
Human-only	Agent recommends, human executes	Legal commitments, regulatory filings, system-wide configuration changes

These tiers should be dynamic, not static. The same agent might operate autonomously for routine tasks but require explicit approval when it encounters an edge case, accesses sensitive data or approaches a cost threshold.
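
A dynamic tier can be expressed as a baseline per action type plus escalation rules that raise, but never lower, the required oversight. The sketch below assumes a small baseline table drawn from the example actions above, a hypothetical cost threshold of EUR 1,000, and a default of human approval for unknown actions; all of these are illustrative choices.

```python
from enum import IntEnum


class Oversight(IntEnum):
    AUTONOMOUS = 0
    HUMAN_ON_THE_LOOP = 1
    HUMAN_IN_THE_LOOP = 2
    HUMAN_ONLY = 3


# Baseline per action type, taken from the example column of the table.
BASELINE = {
    "read_internal_doc": Oversight.AUTONOMOUS,
    "update_crm_record": Oversight.HUMAN_ON_THE_LOOP,
    "send_external_email": Oversight.HUMAN_IN_THE_LOOP,
    "regulatory_filing": Oversight.HUMAN_ONLY,
}


def required_oversight(action: str, *,
                       touches_sensitive_data: bool = False,
                       cost_eur: float = 0.0,
                       cost_threshold_eur: float = 1000.0) -> Oversight:
    # Assumption: unknown actions queue for approval rather than run freely.
    level = BASELINE.get(action, Oversight.HUMAN_IN_THE_LOOP)
    # Context can only tighten oversight, never loosen it.
    if touches_sensitive_data or cost_eur >= cost_threshold_eur:
        level = max(level, Oversight.HUMAN_IN_THE_LOOP)
    return level


routine = required_oversight("read_internal_doc")
sensitive = required_oversight("read_internal_doc", touches_sensitive_data=True)
```

The same read action is autonomous in the routine case and queued for approval once it touches sensitive data, which is the behavior the paragraph above describes.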

Pillar 7: Compliance mapping and reporting

Compliance requirements are multiplying, and each framework speaks a different language. Three frameworks matter most for enterprise AI agent governance:

EU AI Act (most obligations apply from August 2026): The first comprehensive AI regulation. Penalties reach EUR 35 million or 7% of global annual turnover under Article 99 for the most severe violations. For AI agents, the critical requirements include risk classification (Article 6), human oversight (Article 14), technical documentation (Article 11), automatic logging (Article 12), transparency (Article 13) and accuracy/robustness testing (Article 15).

NIST AI Risk Management Framework (AI RMF): The voluntary US framework built around four functions: Govern, Map, Measure, Manage. NIST issued an RFI specifically on securing agentic AI systems in January 2026, signaling that agent-specific guidance is coming.

The NIST AI Risk Management Framework provides the Govern, Map, Measure, Manage structure that maps directly to agent governance pillars. (Source: NIST)

ISO/IEC 42001: The first international standard for AI management systems, specifying requirements for establishing, implementing, maintaining and continually improving an AI management system (AIMS).

Compliance mapping matrix:

Governance requirement	EU AI Act	NIST AI RMF	ISO 42001
Agent inventory	Art. 11 (Technical documentation)	MAP 1.1 (Context mapping)	6.1 (Planning - risk assessment)
Risk classification	Art. 6 (Classification rules)	MAP 2.1 (Risk identification)	6.1.2 (AI risk assessment)
Policy enforcement	Art. 9 (Risk management system)	MANAGE 1.1 (Risk prioritization)	8.1 (Operational planning and control)
Continuous certification	Art. 9(2) (Regular systematic review)	MEASURE 2.1 (Evaluation)	9.1 (Monitoring, measurement, analysis)
Runtime observability	Art. 12 (Record-keeping)	MEASURE 1.1 (Monitoring)	A.6.2.6 (Data quality for AI)
Human oversight	Art. 14 (Human oversight)	GOVERN 1.2 (Accountability)	A.8.3 (Responsible AI)
Audit and reporting	Art. 72 (Post-market monitoring)	MANAGE 4.1 (Documentation)	9.2 (Internal audit)
Lifecycle management	Art. 9(2) (Continuous iterative process)	MANAGE 3.1 (Risk response)	10.1 (Continual improvement)

The key insight: no single framework covers everything. Enterprises need a unified governance layer that maps controls to all applicable frameworks simultaneously and generates compliance evidence automatically.
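
A unified mapping layer can start as a lookup table from internal controls to framework references, from which a per-framework gap report is derived. The sketch below copies four rows from the matrix above; the control keys and the `coverage` function are hypothetical, and the clause references are only as accurate as the matrix they come from.

```python
# Four rows copied from the compliance mapping matrix (clause IDs only).
CONTROL_MAP: dict[str, dict[str, str]] = {
    "agent_inventory": {
        "EU AI Act": "Art. 11", "NIST AI RMF": "MAP 1.1", "ISO 42001": "6.1"},
    "risk_classification": {
        "EU AI Act": "Art. 6", "NIST AI RMF": "MAP 2.1", "ISO 42001": "6.1.2"},
    "policy_enforcement": {
        "EU AI Act": "Art. 9", "NIST AI RMF": "MANAGE 1.1", "ISO 42001": "8.1"},
    "human_oversight": {
        "EU AI Act": "Art. 14", "NIST AI RMF": "GOVERN 1.2", "ISO 42001": "A.8.3"},
}


def coverage(implemented: set[str], framework: str) -> dict[str, list[str]]:
    """Split a framework's mapped references into covered and missing."""
    report: dict[str, list[str]] = {"covered": [], "gaps": []}
    for control, refs in CONTROL_MAP.items():
        bucket = "covered" if control in implemented else "gaps"
        report[bucket].append(f"{refs[framework]} ({control})")
    return report


# An organization that has an inventory and oversight tiers, nothing else yet.
eu_report = coverage({"agent_inventory", "human_oversight"}, "EU AI Act")
```

Running the same call with `"NIST AI RMF"` or `"ISO 42001"` produces the equivalent report for those frameworks from the same set of implemented controls, which is the "map once, report many" idea in the paragraph above.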

Pillar 8: Lifecycle management

Agents are born, but they’re rarely retired.

Every IT asset has a lifecycle: provision, deploy, operate, decommission. AI agents should be no different, but in practice most enterprises have no process for agent retirement. The operational discipline required is different from MLOps. Agents don’t just predict. They act.

The agent lifecycle has five stages:

1. Ideate: Business case definition, risk pre-assessment and scope definition. Before an agent is built, someone must answer: what problem does this solve, what authority does it need and what’s the fallback if it fails?

2. Build: Development, testing, red-teaming and pre-deployment certification. Build-time governance ensures the agent meets baseline requirements before it touches production data.

3. Deploy: Certification gate, production configuration, monitoring setup and notification to the agent owner, security team and compliance lead. The agent enters the registry as “active.”

4. Operate: Runtime monitoring, continuous certification, performance tracking, cost management and periodic governance reviews. The bulk of an agent’s life is spent here.

5. Retire: Decommissioning, access revocation, data cleanup, dependency notification and archival. When an agent is no longer needed, it must be cleanly removed from the estate.

The retire stage is where most enterprises fail. Agents accumulate. Orphaned agents (where the original developer has left) continue running. Deprecated agents with outdated model versions drift in behavior. The agent estate grows, but it never shrinks.

A governance framework without lifecycle management guarantees agent sprawl. The hidden costs of agent sprawl, from unoptimized token spend to compliance exposure, compound with every orphaned agent left running.
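
The five stages can be modeled as a small state machine so that an agent cannot skip the certification gate or come back from retirement. The transition table below is an assumption: the article lists the stages but not the permitted moves between them. The retirement checklist mirrors the five retire activities named above.

```python
from enum import Enum


class Stage(Enum):
    IDEATE = "ideate"
    BUILD = "build"
    DEPLOY = "deploy"
    OPERATE = "operate"
    RETIRE = "retire"


# Assumed transitions: forward one stage at a time, retire from anywhere,
# and return from operate to build when a material change needs rework.
ALLOWED: dict[Stage, set[Stage]] = {
    Stage.IDEATE: {Stage.BUILD, Stage.RETIRE},
    Stage.BUILD: {Stage.DEPLOY, Stage.RETIRE},
    Stage.DEPLOY: {Stage.OPERATE, Stage.RETIRE},
    Stage.OPERATE: {Stage.BUILD, Stage.RETIRE},
    Stage.RETIRE: set(),
}

RETIREMENT_STEPS = frozenset({
    "decommission", "revoke_access", "clean_up_data",
    "notify_dependents", "archive",
})


def advance(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise if the move is not permitted."""
    if target not in ALLOWED[current]:
        raise ValueError(
            f"illegal transition: {current.value} -> {target.value}")
    return target


def retirement_gaps(completed: set[str]) -> set[str]:
    """Retirement steps still outstanding for an agent being retired."""
    return set(RETIREMENT_STEPS - completed)


stage = advance(Stage.DEPLOY, Stage.OPERATE)
outstanding = retirement_gaps({"decommission", "revoke_access"})
```

Tracking outstanding retirement steps per agent is what turns "retire" from a status label into a process that can be audited.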

Prompt orchestration governance: where it fits in the eight pillars

Most agent failures don’t start inside a single agent. They start in the orchestration layer: the router, planner or prompt chain that decides which agent or tool handles a request. That layer makes decisions, holds credentials and passes context between agents, but most enterprises never register it as a governed component.

Treat the orchestrator like any other agent in the estate. Pillar 1 registers it with an owner, a risk tier and a list of the agents and tools it can dispatch to. Pillar 3 enforces policy at the dispatch point, so an orchestrator can’t hand a high-risk action to an agent that isn’t certified for it. Pillar 5 logs every routing decision (which agent was called, with what context and why), so a failure at step 14 is traceable back through the chain.

This is the same discipline you apply to a load balancer or an API gateway. The component that routes traffic needs its own access controls and its own audit trail, not a free pass because it “just” coordinates. For the agent-to-agent risks the orchestrator introduces, see multi-agent governance. For the dispatch-time controls, see policy as code.
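
A dispatch-time control can be as small as one comparison plus one log entry. The sketch below assumes each target agent carries the highest risk tier it is certified to handle, and that the orchestrator records every routing decision whether or not it was allowed. The `Tier`, `Target` and `dispatch_allowed` names are invented for this example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Target:
    agent_id: str
    certified_tier: Tier  # highest action tier this agent is certified for


def dispatch_allowed(orchestrator_id: str, target: Target,
                     action_tier: Tier, reason: str,
                     routing_log: list[dict]) -> bool:
    """Check certification at the dispatch point and log the decision."""
    allowed = action_tier <= target.certified_tier
    # Log both allowed and blocked hand-offs so the chain stays traceable.
    routing_log.append({
        "orchestrator": orchestrator_id,
        "target": target.agent_id,
        "action_tier": action_tier.name,
        "allowed": allowed,
        "reason": reason,
    })
    return allowed


routing_log: list[dict] = []
drafter = Target("content-drafter", Tier.MEDIUM)
ok = dispatch_allowed("router-1", drafter, Tier.HIGH,
                      "user asked to email a customer", routing_log)
```

Here the hand-off is refused because the drafting agent is only certified up to the medium tier, and the refusal itself is recorded with the routing reason.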

Implementation roadmap

The framework is designed for phased implementation. You don’t need all eight pillars on day one. Scale the scope to your agent estate. A team with four agents doesn’t need the same governance infrastructure as an enterprise running hundreds. Start with inventory and classification, then add controls as your estate grows.

Phase 1: See it (weeks 1-3)

Focus: Pillar 1 (Inventory) + Pillar 2 (Classification)

Conduct an enterprise-wide agent discovery scan
Build the initial agent registry
Classify every known agent by risk tier
Identify orphaned agents and assign temporary owners
Establish the governance working group (cross-functional: engineering, security, legal, compliance, business)

Deliverable: Complete agent registry with risk classifications.

Phase 2: Control it (weeks 4-6)

Focus: Pillar 3 (Policies) + Pillar 6 (Human oversight)

Define behavioral boundaries for each risk tier
Implement authorization tiers (autonomous through human-only)
Build the policy enforcement layer for critical and high-risk agents
Establish escalation paths and incident response procedures
Define the RACI model for agent accountability

Deliverable: Policy-as-code enforcement for critical agents.

Research on AI accountability structures found that enterprises with clearly defined RACI models for AI agents resolve incidents 54% faster and face 41% lower regulatory scrutiny than those with ambiguous accountability.

Phase 3: Monitor it (weeks 7-10)

Focus: Pillar 4 (Certification) + Pillar 5 (Observability)

Deploy runtime monitoring for all high-risk and critical agents
Implement audit logging with the granularity required by EU AI Act Article 12
Build continuous certification pipelines that re-assess agents on configuration change
Set up drift detection and alerting
Create governance dashboards for leadership visibility

Deliverable: Real-time observability and continuous certification for high-risk agents.

Phase 4: Sustain it (ongoing)

Focus: Pillar 7 (Compliance) + Pillar 8 (Lifecycle)

Map governance controls to EU AI Act, NIST AI RMF and ISO 42001
Implement lifecycle management processes (including retirement criteria)
Conduct monthly governance reviews
Run quarterly authorization reassessments
Perform annual framework audit against new regulations
Publish internal “State of the Agent Estate” report

Deliverable: Compliance-mapped, lifecycle-managed governance program.

The EU AI Act takes effect in five months

The regulatory timeline is concrete. The EU AI Act applies from August 2026. Gartner predicts “death by AI” legal claims will exceed 2,000 by end of 2026 due to insufficient risk guardrails (Gartner Strategic Predictions 2026). The World Economic Forum wrote in March 2026 that visibility into agent behavior is now critical for accountability as deployment expands.

If you can’t answer how many agents are running in your organization, start with Pillar 1. Build the registry. Classify by risk. Everything else builds on that foundation.

The enterprises that deploy agents with confidence will be the ones that know which agents they have, what those agents are doing and who’s accountable when something goes wrong. That requires a system of record.

Frequently asked questions about AI agent governance frameworks

What is an AI agent governance framework?

An AI agent governance framework is the set of controls that make every agent in an enterprise knowable, governable and accountable. It extends existing programs like NIST AI RMF and ISO 42001 with agent-specific pillars: inventory, risk classification, policy enforcement, continuous certification, runtime observability, human oversight, compliance mapping and lifecycle management. The framework governs what agents do at runtime, not just what they were designed to do.

How is this different from existing AI governance frameworks?

Existing frameworks (NIST AI RMF, ISO 42001, EU AI Act) were designed for traditional AI systems: models that produce outputs for humans to act on. This framework extends those standards to address the properties unique to agents: autonomy, tool use, chained decisions and behavioral drift.

Do I need to replace my existing governance program?

No. The eight pillars are an extension layer, not a replacement. Your existing investment in enterprise GRC, AI risk management and compliance infrastructure stays valuable. The pillars add the agent-specific controls those programs lack.

How does this relate to the EU AI Act?

The EU AI Act applies to AI agents, particularly those classified as high-risk. Article 6 defines classification criteria, Article 14 mandates human oversight and Article 12 requires automatic event logging. Penalties can reach EUR 35 million or 7% of global turnover. Pillar 7 maps your controls to these requirements.

What about agents I don’t control (SaaS-embedded agents)?

SaaS-embedded agents are the hardest to govern because you have the least visibility. Update your vendor risk process to ask specific questions: which actions the embedded agent can take, what data it accesses, what permission boundaries exist and whether you can disable or constrain it.

How many agents does the average enterprise have?

According to Gravitee’s 2026 report, more than 3 million agents are operating across corporations globally. A reasonable estimate for a mid-to-large enterprise is hundreds to low thousands of agents across sanctioned and unsanctioned deployments.

What is agent sprawl and why does it matter?

Agent sprawl is the uncontrolled proliferation of AI agents across the enterprise, analogous to shadow IT. It happens when teams deploy agents without central oversight. Gravitee reports an estimated 1.5 million corporate agents currently run without monitoring. The hidden costs of agent sprawl compound with every orphaned agent.

What should I do first?

Start with inventory and classification. You can’t make governance decisions without knowing what you’re governing. Run an enterprise-wide discovery scan, build your agent registry and classify every known agent by risk tier.

How does the framework govern prompt orchestration and multi-agent systems?

The orchestration layer, the router or prompt chain that decides which agent or tool handles a request, is itself a governed component. Pillar 1 registers it. Pillar 3 enforces policy at the dispatch point. Pillar 5 logs every routing decision. For the agent-to-agent risks it introduces, see multi-agent governance.

Where does Roval fit?

Roval is the system of record for the eight pillars. It registers every agent with identity, owner, risk tier and compliance status. It classifies each one across data sensitivity, decision authority and blast radius. It enforces policy at build, deploy and runtime, then logs every action for audit. The framework defines the controls. Roval is where they run.

Sources and further reading

Source	Date	URL
Gravitee, State of AI Agent Security 2026	Mar 2026	gravitee.io
Microsoft Cyber Pulse Report	Feb 2026	microsoft.com
Gartner, 40% of Enterprise Apps Will Feature AI Agents by 2026	Aug 2025	gartner.com
Gartner, Top Predictions for Data and Analytics 2026	Mar 2026	gartner.com
Gartner, Strategic Predictions for 2026	Jan 2026	gartner.com
World Economic Forum, Governance for AI Agents	Mar 2026	weforum.org
EU AI Act, Article 99 (Penalties)	Jun 2024	artificialintelligenceact.eu
EU AI Act, Article 12 (Record-keeping)	Jun 2024	artificialintelligenceact.eu
NIST AI Risk Management Framework	Jan 2023	nist.gov
NIST RFI on Securing Agentic AI Systems	Jan 2026	nist.gov
ISO/IEC 42001 AI Management System Standard	Dec 2023	iso.org
OWASP Top 10 for Agentic Applications (2026)	Jan 2026	owasp.org
Anthropic, Measuring AI Agent Autonomy in Practice	Mar 2026	anthropic.com
Zenity (Chris Hughes), Governing Agentic AI	Feb 2026	zenity.io
Engin & Hand, Toward Adaptive Categories: Dimensional Governance	May 2025	arxiv.org
Forrester/Thinking.inc, AI Agent Governance Gap 2026	Mar 2026	thinking.inc
ServiceNow, What’s New in Zurich: AI Control Tower Demo (video)	Aug 2025	youtube.com