SOC 2 for AI agents: what your auditor will actually ask
Your next SOC 2 audit will include AI
SOC 2 audits are changing. The cause is not a rewrite of the Trust Service Criteria by the AICPA; the 2017 TSC with the 2022 revised points of focus remain the governing standard. The cause is that auditors are now applying those criteria to a category of system the framework was never designed for: autonomous AI agents.
The shift is already measurable. SOC 2 benchmark data shows that reports with more than 150 security controls rose from 16% to 23% in the past year. Confidentiality is now included in 64.4% of SOC 2 reports, up from 34% in 2023. Availability appears in 75.3%. The scope is expanding because the systems in scope are expanding and AI agents are the newest addition.
As one practitioner-focused analysis put it: “Your auditor is going to ask about it. They will ask how you version your models, how you test for bias, what happens when the model hallucinates and how you prove that a probabilistic system processes data with ‘integrity.’”

If your organization deploys AI agents that access customer data, execute autonomous actions or connect to external systems, those agents are in your SOC 2 scope. The five Trust Service Criteria apply to them in ways that require agent-specific controls your existing program almost certainly doesn’t cover.
This article maps each Trust Service Criterion to the specific controls AI agents require, identifies the questions your auditor will ask and provides a 25-control checklist you can use to prepare. The downloadable PDF version (below) is designed as a pre-audit worksheet.
Why SOC 2 breaks for agents
SOC 2 was designed for a world of deterministic systems with static controls and human-mediated access. AI agents violate all three assumptions.
Assumption 1: Controls can be documented and tested at a point in time. Traditional controls (a firewall rule, an access policy, an encryption configuration) remain effective until someone intentionally changes them. Agent behavior changes without anyone changing the agent. The underlying model updates. The data distribution shifts. A tool the agent depends on modifies its API. An agent that was compliant last month may not be compliant today and nobody touched it.
Assumption 2: Systems process information “as intended” based on documented logic. Traditional software follows deterministic logic: given input X, produce output Y. AI agents produce non-deterministic outputs that vary with context and they exhibit emergent behavior, actions arising from complex interactions rather than explicit programming. You can’t document the agent’s “intended” processing logic the way you document a database query, because the agent’s behavior is shaped by its model, its prompt, its tools and the specific data it encounters at runtime.
Assumption 3: Access is human-mediated. Traditional SOC 2 controls assume humans access systems and controls govern that human access. Agents access systems autonomously (calling APIs, querying databases, executing code and interacting with other agents) without a human initiating each action. The access control model must extend to non-human autonomous identities and the monitoring must cover actions that happen at machine speed.
These aren’t theoretical gaps. 97% of organizations that suffered AI-related breaches lacked proper access controls. 33% of organizations lack audit trails entirely and 61% have fragmented logs across systems, meaning the evidence your auditor needs doesn’t exist in a queryable form.
The OWASP Top 10 for Agentic Applications identifies excessive agency, improper output handling and insecure tool integration as top-tier risks, all of which map directly to SOC 2 Trust Service Criteria. If your agents can take autonomous actions, those actions need the same level of access control, monitoring and audit logging you apply to human users.
The five trust service criteria applied to AI agents
Security (Common Criteria, mandatory)
Security is the only mandatory TSC in every SOC 2 audit. For AI agents, the security criteria extend beyond traditional infrastructure controls to cover agent-specific access patterns.
What your auditor will ask:
- How do you authenticate agents to LLM APIs and external services? (CC6.1)
- How do you enforce least-privilege access for each agent’s tool permissions? (CC6.3)
- How do you manage API keys (rotation schedule, storage, scoping)? (CC6.1)
- How do you monitor for prompt injection and data exfiltration attempts? (CC7.2)
- How do you manage LLM providers as sub-processors? (CC6.6)
The controls you need:

Every agent must have a unique, verifiable identity, not a shared service account, not a developer’s personal API key. CC6.1 requires logical access control procedures that limit system entry to authorized individuals; for AI environments, this covers model repositories, training data and inference APIs.
Tool access must follow least-privilege. CC6.3 requires access to be authorized, modified and removed with least privilege in mind; applied to agents, that means each agent operates with only the permissions required for each specific task. An agent that needs read access to a customer database should not have write access. An agent that calls one API should not have credentials for ten. Permissions must be reviewed regularly, and elevated rights should be time-bound and logged.
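A deny-by-default permission check makes the least-privilege rule concrete. The sketch below is illustrative only: the `Grant` and `AgentIdentity` classes, the agent name and the tool names are hypothetical, not part of any specific product or of the SOC 2 criteria themselves.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Grant:
    """One action on one tool, optionally time-bound."""
    tool: str
    action: str                            # e.g. "read" or "write"
    expires_at: Optional[datetime] = None  # None means a standing grant


@dataclass
class AgentIdentity:
    agent_id: str   # unique per agent, never a shared service account
    owner: str
    grants: List[Grant] = field(default_factory=list)

    def is_allowed(self, tool: str, action: str, now: datetime) -> bool:
        """Deny by default; allow only an unexpired, exactly matching grant."""
        return any(
            g.tool == tool
            and g.action == action
            and (g.expires_at is None or now < g.expires_at)
            for g in self.grants
        )


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
agent = AgentIdentity(
    agent_id="billing-summarizer-01",   # hypothetical agent
    owner="finance-platform",
    grants=[
        Grant("customer_db", "read"),
        # Elevated right, valid for one hour only.
        Grant("customer_db", "write", expires_at=now + timedelta(hours=1)),
    ],
)
```

In practice each `is_allowed` decision would also be written to the audit log, so that the quarterly access review has evidence of both grants and denials.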
LLM API providers are sub-processors. CC6.6 requires documenting sub-processor security requirements, reviewing their SOC 2 reports annually and maintaining contractual security obligations. If your agents call Anthropic, OpenAI or any other model provider, that relationship needs to be in your vendor risk management program.
Prompt and completion logging is now a CC6.1 and CC7.2 expectation. Every prompt sent to a model and every response received (particularly when prompts contain PII or when outputs may contain sensitive information) must be logged with timestamps, user identity and model version.
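One way to picture the logging requirement is as a fixed record schema written as one JSON line per request. This is a minimal sketch under stated assumptions: the field names, the `LLMRequestRecord` class and the example values are hypothetical, and a real deployment would add access controls and retention handling around the log.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LLMRequestRecord:
    timestamp: str      # ISO 8601, UTC
    agent_id: str
    user_id: str
    model: str
    model_version: str
    prompt: str
    completion: str
    contains_pii: bool


def to_log_line(record: LLMRequestRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = LLMRequestRecord(
    timestamp=datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc).isoformat(),
    agent_id="billing-summarizer-01",   # hypothetical values throughout
    user_id="u-1042",
    model="example-model",
    model_version="2025-01",
    prompt="Summarize the open invoices.",
    completion="There are three open invoices.",
    contains_pii=False,
)
line = to_log_line(record)
```

Keeping the schema fixed means an auditor's sample request ("show me every prompt this agent sent on a given date, and which model version answered") becomes a filter over structured fields rather than a text search.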
Roval implementation: The LLM request monitor captures every prompt through a transparent proxy with under 1ms of overhead. Every request is logged with full text, model, token counts, user identity and timestamp. Policy rules evaluate violations within 30 seconds. The agent registry assigns unique identities and tracks tool access permissions per agent.
Availability (A series)
Availability is included in 75.3% of SOC 2 reports and is critical for any agent that supports business operations or customer-facing services.
What your auditor will ask:
- What happens when an agent fails? Is there a fallback?
- Can you immediately halt an agent that’s behaving abnormally?
- What are your uptime commitments for agent-dependent services?
- How do you handle capacity planning for variable token consumption?
The controls you need:
Agents must be designed with fail-safe or fail-open behavior. If the agent can’t reach its LLM provider, what happens? If the monitoring layer goes down, does the agent stop or continue unmonitored? The architecture must define failure modes explicitly.
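Defining the failure mode explicitly can be as simple as making it a declared parameter rather than an accident of error handling. The sketch below is an assumption-laden illustration: `run_monitored`, `FailureMode` and the exception classes are hypothetical names, and the "monitor" is a stand-in callable.

```python
from enum import Enum
from typing import Callable


class FailureMode(Enum):
    FAIL_OPEN = "fail_open"      # keep operating if monitoring is unavailable
    FAIL_CLOSED = "fail_closed"  # stop the agent if monitoring is unavailable


class MonitoringUnavailable(Exception):
    """Raised by the (hypothetical) monitoring client when it cannot record."""


class ActionBlocked(Exception):
    """Raised when the failure policy refuses to run an unmonitored action."""


def run_monitored(
    action: Callable[[], str],
    record: Callable[[str], None],
    mode: FailureMode,
) -> str:
    """Record the action, then run it, applying the declared failure mode."""
    try:
        record("action_started")
    except MonitoringUnavailable:
        if mode is FailureMode.FAIL_CLOSED:
            raise ActionBlocked("monitoring unavailable; action not executed")
        # FAIL_OPEN: proceed unmonitored. This gap should itself raise an alert.
    return action()


def broken_monitor(event: str) -> None:
    """Simulates a monitoring outage."""
    raise MonitoringUnavailable(event)
```

Which mode is right depends on the agent: a customer-facing assistant may reasonably fail open, while an agent that moves money probably should not. What matters for the audit is that the choice is documented and testable.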
A kill switch is essential. Article 14 of the EU AI Act requires it and SOC 2 availability criteria (A1.3) demand recovery procedures that include the ability to halt problematic systems. The kill switch must be tested quarterly: trigger, access revocation, state preservation, notification chain.
Circuit breakers prevent cascading failures. When an agent exceeds a violation threshold or error rate, the circuit breaker trips automatically, blocking further actions until an administrator reviews and resets. This is availability governance at the agent level.
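The circuit-breaker behavior described here can be sketched in a few lines. This is a simplified illustration, not any vendor's implementation: the class name, the counting rule (trip once violations exceed the threshold) and the reviewer argument are assumptions.

```python
class AgentCircuitBreaker:
    """Blocks an agent's actions once violations exceed a threshold."""

    def __init__(self, max_violations: int) -> None:
        self.max_violations = max_violations
        self.violations = 0
        self.tripped = False
        self.last_reset_by = None  # who reviewed and reset the breaker

    def record_violation(self) -> None:
        self.violations += 1
        if self.violations > self.max_violations:
            self.tripped = True  # stays tripped until a human resets it

    def allow_action(self) -> bool:
        return not self.tripped

    def reset(self, reviewer: str) -> None:
        """Manual reset; requires a named administrator for the audit trail."""
        if not reviewer:
            raise ValueError("a reviewer identity is required to reset")
        self.violations = 0
        self.tripped = False
        self.last_reset_by = reviewer


breaker = AgentCircuitBreaker(max_violations=2)
```

The important property is that the breaker never resets itself: recovery requires a named reviewer, which gives the auditor a record of who accepted the risk of resuming the agent.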
Roval implementation: The Observer’s circuit breaker auto-stops agents exceeding violation thresholds. The LLM proxy is fail-open by design: if monitoring goes down, agents continue operating without telemetry being dropped and developer sessions never break. Kill switch capability is built into the lifecycle management layer.
Processing integrity (PI series)
This is where AI agents create the most novel challenges for SOC 2. Processing Integrity (PI1) requires that system processing is “complete, valid, accurate, timely and authorized.” For a deterministic system, these terms have clear definitions. For an AI agent, they require reinterpretation.
What your auditor will ask:
- How do you define “accurate” for a non-deterministic system?
- How do you detect hallucinations?
- How do you monitor for behavioral drift?
- What guardrails prevent the agent from taking actions outside its defined scope?
- How do you validate that outputs meet quality thresholds?
The controls you need:
Define accuracy in probabilistic terms. For classification tasks, this might mean “at least 95% accuracy on validation data, measured weekly.” For generative tasks, “outputs are factually grounded in provided context at least 98% of the time.” The key is that the threshold is documented, measurable and monitored, not aspirational.
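A documented threshold only becomes a control once something checks it on a schedule. The sketch below shows the arithmetic, using the 95% and 98% figures from the examples above; the function name and the sample counts are hypothetical.

```python
def meets_threshold(passed: int, total: int, threshold: float) -> bool:
    """Return True when the observed pass rate is at or above the threshold."""
    if total <= 0:
        # No samples means no evidence; treat that as a control failure.
        raise ValueError("no samples evaluated; the control produced no evidence")
    return passed / total >= threshold


CLASSIFICATION_THRESHOLD = 0.95   # "at least 95% accuracy on validation data"
GROUNDEDNESS_THRESHOLD = 0.98     # "factually grounded at least 98% of the time"

# Hypothetical weekly evaluation results.
weekly_classification_ok = meets_threshold(957, 1000, CLASSIFICATION_THRESHOLD)
weekly_groundedness_ok = meets_threshold(975, 1000, GROUNDEDNESS_THRESHOLD)
```

With these sample numbers the classification check passes (95.7%) and the groundedness check fails (97.5%), which is the kind of result that should open a review ticket rather than sit in a dashboard.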
Document acceptable output boundaries. What outputs are never acceptable? An agent processing financial data should never fabricate transaction records. An agent communicating with customers should never disclose internal pricing formulas. These boundaries must be enforced through guardrails, not just documented in a policy.
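Enforcing a boundary means checking outputs before they leave the agent, not after. The following sketch checks two of the boundaries named above under simplifying assumptions: the `TXN-` identifier format and the "pricing formula" phrase are hypothetical markers, and real guardrails would need far richer detection than pattern matching.

```python
import re
from typing import List, Set

TXN_ID = re.compile(r"\bTXN-\d{6}\b")                             # hypothetical ID format
PRICING_FORMULA = re.compile(r"pricing formula", re.IGNORECASE)   # hypothetical marker


class GuardrailViolation(Exception):
    """Raised when an output crosses a documented boundary."""


def violations(output: str, known_txn_ids: Set[str]) -> List[str]:
    """List every boundary the output crosses; an empty list means it passes."""
    found = []
    # Any transaction ID not present in the source records is treated as fabricated.
    invented = sorted(set(TXN_ID.findall(output)) - known_txn_ids)
    if invented:
        found.append("unknown transaction ids: " + ", ".join(invented))
    if PRICING_FORMULA.search(output):
        found.append("internal pricing formula mentioned")
    return found


def enforce(output: str, known_txn_ids: Set[str]) -> str:
    """Return the output unchanged, or raise if it violates a boundary."""
    problems = violations(output, known_txn_ids)
    if problems:
        raise GuardrailViolation("; ".join(problems))
    return output
```

Calling `enforce` on the path between the agent and its recipient turns the documented boundary into a preventive control; the list returned by `violations` doubles as audit evidence.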
Monitor for drift continuously. Agent drift detection should include tracking output distributions, flagging when behavior deviates from the certified baseline and triggering review when thresholds are exceeded. A point-in-time assessment is inadequate. The auditor will test whether your drift detection operated effectively throughout the observation period.
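One simple way to track output distributions is to compare the mix of output categories in a recent window against the baseline. The sketch below uses total variation distance as the comparison; that choice, the function names and the example labels are assumptions for illustration, and other distance measures would work equally well.

```python
from collections import Counter
from typing import Dict, Iterable


def distribution(labels: Iterable[str]) -> Dict[str, float]:
    """Convert a sequence of output labels into relative frequencies."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Half the sum of absolute differences; 0 is identical, 1 is disjoint."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def has_drifted(baseline: Iterable[str], current: Iterable[str], threshold: float) -> bool:
    """Flag drift when the distance from the baseline exceeds the threshold."""
    return total_variation(distribution(baseline), distribution(current)) > threshold


# Hypothetical decision labels from an approval agent.
baseline_labels = ["approve"] * 80 + ["escalate"] * 20
current_labels = ["approve"] * 50 + ["escalate"] * 50
```

Here the agent has moved from escalating 20% of cases to 50%, a distance of 0.3 from the baseline. Each scheduled comparison, including the ones that find nothing, should be stored as a timestamped record so that the check's operation across the observation period can be demonstrated.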
Roval implementation: Policy rules define prohibited content patterns, model allowlists and prompt size limits, evaluated automatically within 30 seconds of capture. The Observer builds behavioral baselines after 30+ tool calls and highlights deviations. Drift detection runs every 15 minutes, catching certification expiry, configuration changes and behavioral anomalies.
Confidentiality (C series)
Confidentiality is now included in 64.4% of SOC 2 reports, nearly double the 34% from 2023. For AI agents, confidentiality concerns center on what data enters the agent’s context window and what leaves it.
What your auditor will ask:
- Can agents access confidential data? Which ones, and with what controls?
- Does PII appear in prompts sent to LLM APIs?
- Can the agent’s outputs leak confidential information?
- How do you classify data sensitivity for agent-accessible sources?
- What are your retention and deletion policies for captured prompts?
The controls you need:
Data classification must extend to agent-accessible sources. Every data source an agent can query needs a sensitivity classification: public, internal, confidential, restricted. The agent’s risk tier should reflect the highest classification of data it can access.
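The "highest classification wins" rule is easy to express in code. In this sketch the four levels come from the text above, while the source names and the decision to treat unclassified sources as restricted are assumptions made for the example.

```python
from enum import IntEnum
from typing import Dict, Iterable


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def agent_sensitivity(
    classifications: Dict[str, Sensitivity],
    accessible_sources: Iterable[str],
) -> Sensitivity:
    """Return the highest classification among the sources an agent can query."""
    return max(
        # Assumption: a source with no classification is treated as RESTRICTED.
        (classifications.get(name, Sensitivity.RESTRICTED) for name in accessible_sources),
        default=Sensitivity.PUBLIC,  # an agent with no data access
    )


# Hypothetical data sources.
classifications = {
    "product_docs": Sensitivity.PUBLIC,
    "support_tickets": Sensitivity.INTERNAL,
    "customer_db": Sensitivity.CONFIDENTIAL,
}
```

Treating unclassified sources as the most sensitive level is a conservative default: it forces someone to classify a new source before an agent's tier can drop.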
PII in prompts requires specific handling. When agents send prompts containing customer names, account numbers or other PII to external LLM APIs, that data is leaving your environment. Controls must address whether PII is scrubbed before transmission, whether the LLM provider’s data processing agreement covers this use case and whether prompt logs are retained with appropriate access controls.
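Scrubbing before transmission can be sketched as a redaction pass over the prompt. The example below is deliberately minimal and rests on assumptions: the two patterns (email addresses and 8 to 16 digit account numbers) and the placeholder tokens are hypothetical, and pattern matching alone will not catch names or free-text identifiers.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ACCOUNT_NUMBER = re.compile(r"\b\d{8,16}\b")   # hypothetical account number format


def scrub(prompt: str) -> str:
    """Replace matched identifiers with placeholders before the prompt leaves."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    return ACCOUNT_NUMBER.sub("[ACCOUNT]", prompt)


original = "Contact jane.doe@example.com about account 1234567890."
scrubbed = scrub(original)
```

For this input the scrubbed prompt reads "Contact [EMAIL] about account [ACCOUNT]." A production control would typically combine patterns like these with a dedicated PII detection service and would log which redactions were applied.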
Output filtering prevents confidential data leakage. If an agent has access to confidential data, its outputs must be monitored for inadvertent disclosure: customer data appearing in summaries, internal pricing in customer-facing communications or proprietary information in external API calls.
Roval implementation: The agent registry’s risk classification includes a data sensitivity dimension (public / internal / confidential / restricted) that drives governance requirements. The LLM proxy captures full prompt text with configurable response capture (opt-in, PII-scrubbed). Retention is 90 days with export before deletion.
Privacy (P series)
Privacy criteria apply when agents process personal information, which is most enterprise agents. The Privacy TSC aligns closely with GDPR requirements, making it particularly relevant for European enterprises.
What your auditor will ask:
- Does the agent process personal data? Have you disclosed this to data subjects?
- How do you handle data subject access requests for data processed by agents?
- What data minimization principles apply to agent prompts?
- How long is personal data retained in agent logs?
The controls you need:
Privacy notices must disclose AI processing. If an agent makes decisions about individuals (eligibility assessments, risk scoring, customer routing), the individuals must be informed that AI is involved in the process.
Data minimization applies to prompts. Agents should receive only the personal data necessary for their task, not the customer’s entire profile when only their account status is needed. Over-provisioning context is both a privacy risk and a cost issue.
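Minimization can be enforced at the point where context is assembled, by allowlisting fields per task. The sketch below assumes a hypothetical task name and profile shape; the point is that the default for an unknown task is to send nothing.

```python
from typing import Any, Dict, Set

# Hypothetical mapping from task to the only fields that task may see.
TASK_FIELDS: Dict[str, Set[str]] = {
    "account_status_check": {"account_id", "status"},
}


def minimize(profile: Dict[str, Any], task: str) -> Dict[str, Any]:
    """Keep only the fields allowlisted for the task; unknown tasks get nothing."""
    allowed = TASK_FIELDS.get(task, set())
    return {key: value for key, value in profile.items() if key in allowed}


profile = {
    "account_id": "A-1001",
    "status": "active",
    "full_name": "Jane Doe",
    "date_of_birth": "1990-01-01",
}
context = minimize(profile, "account_status_check")
```

Only `account_id` and `status` reach the prompt. The allowlist itself becomes reviewable evidence of what personal data each agent task is permitted to receive.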
Retention limits must cover agent logs. If your LLM request logs contain personal data (customer names in prompts, account numbers in context), those logs are subject to your data retention policy. The 90-day retention period common in logging infrastructure may or may not align with your privacy obligations.
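A retention limit is straightforward to apply once log records carry timestamps. This sketch separates records to keep from records due for deletion; the record shape and function name are assumptions, and the 90-day value is used only because it is the figure mentioned above.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_by_retention(
    records: List[Record], now: datetime, retention_days: int
) -> Tuple[List[Record], List[Record]]:
    """Return (keep, purge): records inside and outside the retention window."""
    cutoff = now - timedelta(days=retention_days)
    keep = [r for r in records if r["timestamp"] >= cutoff]
    purge = [r for r in records if r["timestamp"] < cutoff]
    return keep, purge


now = datetime(2025, 4, 1, tzinfo=timezone.utc)
records = [
    {"id": 1, "timestamp": datetime(2025, 3, 1, tzinfo=timezone.utc)},   # 31 days old
    {"id": 2, "timestamp": datetime(2024, 12, 1, tzinfo=timezone.utc)},  # 121 days old
]
keep, purge = split_by_retention(records, now, retention_days=90)
```

Returning the purge list rather than deleting in place leaves room for an export or legal-hold check before deletion, and for logging what was removed and when.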
Roval implementation: The compliance certification workflow supports GDPR as a built-in framework alongside SOC 2, with per-requirement evidence tracking. The agent registry’s regulatory exposure dimension flags agents subject to privacy obligations. Audit trail exports (CSV/JSON) support data subject access requests by filtering for specific individuals.
The 10 questions your auditor will ask
Based on practitioner reporting and the evolving audit environment, here are the ten questions you should prepare for:
- How many AI agents are operating in your environment, and where is the inventory? (CC6.1)
- How do you authenticate and authorize agent access to systems and data? (CC6.1, CC6.3)
- How do you manage API keys for LLM providers, and what’s the rotation schedule? (CC6.1, CC6.6)
- How do you validate processing integrity for a system with non-deterministic outputs? (PI1.1)
- How do you detect and respond to model drift or behavioral changes? (CC8.1, PI1.3)
- Can you produce an audit trail showing what an agent did on a specific date? (CC7.1, CC7.2)
- What’s your incident response plan for an agent-specific failure? (CC7.3, CC7.4)
- How do you handle PII in prompts sent to external LLM providers? (C1.1, P1.1)
- Which LLM providers are sub-processors, and where are their SOC 2 reports? (CC6.6, CC9.2)
- Can you immediately halt an agent that’s behaving abnormally? (A1.3)
If you can answer all ten with documented evidence, you’re ahead of most organizations. If you can’t, the 25-control checklist below provides the roadmap.
The checklist maps all 25 controls (15 pre-deployment, 10 post-deployment) to specific Trust Service Criteria codes so your compliance team can integrate them directly into your existing control matrix.
Continuous compliance vs. point-in-time: why Type II changes everything
SOC 2 Type I tests whether controls are designed effectively at a single point in time. Type II tests whether they operated effectively throughout the observation period, typically 6 to 12 months. The difference matters enormously for AI agents.
SOC 2 Type II requires extended observation, with minimum three-month windows for first-time engagements. Auditors will sample evidence from across the entire period. If your drift detection was configured in month 1 but failed silently in month 4, the auditor will find the gap.
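The "failed silently in month 4" scenario is detectable before the auditor finds it, by checking the timestamps of a control's own evidence for gaps. The sketch below is a simplified illustration; the function name, the daily cadence and the dates are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_evidence_gaps(
    run_times: List[datetime],
    period_start: datetime,
    period_end: datetime,
    max_interval: timedelta,
) -> List[Tuple[datetime, datetime]]:
    """Return every stretch of the period longer than max_interval with no run."""
    in_period = sorted(t for t in run_times if period_start <= t <= period_end)
    # Include the period boundaries so gaps at the start or end are caught too.
    points = [period_start] + in_period + [period_end]
    return [(a, b) for a, b in zip(points, points[1:]) if b - a > max_interval]


# Hypothetical daily check that silently skipped 3 January.
runs = [datetime(2025, 1, d) for d in (1, 2, 4, 5)]
gaps = find_evidence_gaps(
    runs,
    period_start=datetime(2025, 1, 1),
    period_end=datetime(2025, 1, 5),
    max_interval=timedelta(days=1),
)
```

Running a check like this against each continuous control turns a silent failure into an alert during the observation period, when it can still be remediated and documented.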
For AI agents, this means governance must be continuous, not periodic. Certification expiry dates force re-review on a defined cadence. Drift detection runs automatically and generates evidence as a byproduct. Audit logs accumulate continuously and are immutable. Access reviews happen quarterly, not annually.
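Immutability is usually approximated in software by making tampering detectable. One common technique is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below illustrates that technique under stated assumptions: the entry layout and function names are hypothetical, and a real system would also need write-once storage and protected keys or anchors.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict[str, Any]) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, Any]], event: Dict[str, Any]) -> None:
    """Append an event whose hash depends on everything before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute the chain; any edited, removed or reordered entry breaks it."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this does not prevent modification, but it lets an auditor confirm that the exported trail is the same one that was written during the observation period.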
The organizations that pass Type II audits with AI agents in scope are the ones that build continuous evidence generation into their agent governance infrastructure from day one, not the ones that scramble to reconstruct evidence in the weeks before fieldwork begins.
AI agents need to be treated as first-class identities in your security infrastructure. The same rigor you apply to human access (authentication, authorization, monitoring, least privilege) must extend to every autonomous agent operating in your environment.
Roval’s architecture is designed for exactly this. Certifications auto-expire by risk tier (90 days for Critical, 180 for High, 365 for Low). Drift detection runs every 15 minutes and creates timestamped alert records. Every state change (registration, classification, certification, configuration change, ownership transfer) is recorded in an immutable audit log. The complete trail exports as CSV or JSON, filtered by resource, actor, action or date range. When the auditor arrives, evidence isn’t prepared; it’s exported.
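The tier-based expiry above reduces to a small date calculation. This is an illustrative sketch rather than Roval's code: the function names and lowercase tier keys are assumptions, and only the three tiers and durations stated above are included.

```python
from datetime import date, timedelta

# Validity periods as stated above: 90 days Critical, 180 High, 365 Low.
CERT_VALIDITY_DAYS = {"critical": 90, "high": 180, "low": 365}


def expires_on(certified_on: date, tier: str) -> date:
    """Date on which a certification lapses for the given risk tier."""
    return certified_on + timedelta(days=CERT_VALIDITY_DAYS[tier])


def is_current(certified_on: date, tier: str, today: date) -> bool:
    """A certification is current strictly before its expiry date."""
    return today < expires_on(certified_on, tier)


certified = date(2025, 1, 1)
critical_expiry = expires_on(certified, "critical")
```

A Critical agent certified on 1 January 2025 lapses on 1 April 2025, which forces a re-review inside a typical Type II observation window rather than once a year.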
The 25-control checklist
We’ve published a comprehensive SOC 2 Audit Readiness Checklist for AI Agents as a downloadable PDF. It covers 15 pre-deployment controls and 10 post-deployment controls, each mapped to specific Trust Service Criteria codes.
Pre-deployment controls span three categories: agent identity and access (unique IDs, ownership, least-privilege tool access, API key management, role-based access), risk classification and governance (multi-dimensional classification, compliance framework mapping, human-in-the-loop thresholds, production gates) and testing and documentation (adversarial testing, processing integrity thresholds, technical documentation, incident response plans, data lineage, audit trail enablement).
Post-deployment controls cover continuous monitoring (behavioral observability, drift detection, LLM request capture, performance metrics) and compliance lifecycle (certification currency, audit log review cadence, sub-processor SOC 2 review, quarterly access reviews, kill switch testing and decommissioning procedures).
Each control includes the applicable SOC 2 criteria codes so your compliance team can map them directly to your existing control matrix.
Roval’s architecture is designed to generate SOC 2 audit evidence as a byproduct of agent governance. The agent registry satisfies CC6.1 (identity and access); the LLM monitor satisfies CC7.2 (monitoring) and C1.2 (confidentiality); compliance certification with auto-expiry satisfies CC4.1 (assessment); and drift detection every 15 minutes satisfies CC8.1 (change management). See how Roval maps to all five Trust Service Criteria with a solutions overview for compliance teams.