Agent decommissioning: how to securely offboard AI agents
Last quarter, a fintech company retired its customer-onboarding agent. The team disabled the front-end trigger, closed the Jira ticket and moved on. Three months later, a penetration test found the agent’s service account still active, still holding write access to the customer database, still authenticated to four downstream APIs. Nobody had revoked anything because nobody had a checklist that said they should.
This is the norm, not the exception. Only 10% of companies have well-developed lifecycle strategies for machine identities, according to Okta’s 2025 identity report. The other 90% are accumulating ghost agents: dormant identities with live privileges, invisible to monitoring, waiting for an attacker to notice.
Decommissioning an AI agent is not the same as decommissioning a server. It is closer to offboarding an employee who memorized your passwords, built their own tools and never told anyone which systems they connected to.
Why agents are harder to retire than software
When you decommission a traditional application, you remove a binary, revoke a service account and update a CMDB record. The application did not learn, did not accumulate context and did not create its own API connections on the fly.
AI agents are different in four ways that make retirement dangerous:
- Persistent state: agents store conversation memory, vector embeddings, fine-tuned model weights and cached reasoning chains; this state can contain customer data, proprietary logic or PII that persists in storage systems you may not even know about
- Credential sprawl: non-human identities now outnumber human users by 45 to 1 in the average enterprise and some organizations see ratios above 100 to 1, according to Rubrik Zero Labs research; every agent creates service accounts, API keys, OAuth tokens and webhook registrations that rarely have expiration dates
- Downstream dependencies: a single agent might feed data to three other agents, trigger webhook-based workflows and write to a shared database that other systems read from; disable the agent without mapping these connections and you get cascading failures on Monday morning
- Learned behaviors: unlike static software, agents evolve: they adapt their outputs to feedback loops, adjust their tool usage patterns and develop operational behaviors that exist nowhere in their original code; retirement therefore means not just stopping the process but understanding what the process became
Ghost agent risk
Orphaned and inactive non-human identities factor into approximately 20% of insider-related breaches. When an agent is retired without credential revocation, its service accounts become dormant identities that retain valid authentication tokens, making them prime targets for lateral movement.
Source: Saviynt, 2026
The UK’s National Cyber Security Centre puts it directly: decommissioning is “a critical phase in the lifecycle of any asset,” and organizations should ensure “replacement assets are in place and working before irreversible actions are taken.” For AI agents, that warning carries extra weight, because the “irreversible action” might be losing the only system that understands a critical workflow.
The seven-step decommissioning protocol
Every agent retirement should follow a structured sequence. Skip a step and you leave either a security gap or an operational blind spot.
1. Dependency audit
Before you touch anything, map the agent’s complete footprint.
- Every API it calls and every API that calls it
- Every database it reads from or writes to
- Every other agent that depends on its output
- Every webhook, cron job or event trigger it registered
- Every message queue or pub/sub topic it subscribes to
This is the step most teams skip and it is the step that causes the most damage when skipped. If you cannot describe an agent’s footprint, you cannot safely retire it.
A centralized agent registry makes this step possible. Without one, dependency mapping becomes an archaeological expedition through Slack threads and README files.
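As a minimal sketch of what such a registry could record, the example below models one entry per agent and walks the "feeds" relationships to list every agent affected by a retirement. The `AgentRecord` class, its field names and the sample agent IDs are all hypothetical, not taken from any particular registry product.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """Hypothetical registry entry; every field name here is illustrative."""
    agent_id: str
    calls_apis: set[str] = field(default_factory=set)     # APIs the agent calls
    called_by: set[str] = field(default_factory=set)      # systems that call the agent
    datastores: set[str] = field(default_factory=set)     # databases read or written
    triggers: set[str] = field(default_factory=set)       # webhooks, cron jobs, event triggers
    subscriptions: set[str] = field(default_factory=set)  # queues and pub/sub topics
    feeds_agents: set[str] = field(default_factory=set)   # agents consuming this agent's output


def downstream_agents(registry: dict[str, AgentRecord], agent_id: str) -> set[str]:
    """Return every agent that directly or transitively consumes agent_id's output."""
    affected: set[str] = set()
    stack = [agent_id]
    while stack:
        record = registry.get(stack.pop())
        if record is None:  # consumer is not registered, so there is nothing more to follow
            continue
        for consumer in record.feeds_agents:
            if consumer != agent_id and consumer not in affected:
                affected.add(consumer)
                stack.append(consumer)
    return affected


# Illustrative data only.
registry = {
    "onboarding": AgentRecord("onboarding", feeds_agents={"kyc", "reporting"}),
    "kyc": AgentRecord("kyc", feeds_agents={"fraud"}),
    "reporting": AgentRecord("reporting"),
    "fraud": AgentRecord("fraud", feeds_agents={"onboarding"}),  # cycles are tolerated
}
impacted = downstream_agents(registry, "onboarding")
```

Walking the graph transitively matters because the agents two hops away are usually the ones nobody thinks to notify.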
2. Access revocation
Revoke every credential the agent holds. This means:
- API keys: delete, do not just disable
- OAuth tokens and refresh tokens: revoke at the authorization server
- Service accounts: remove from IAM, not just “deactivate”
- Cached tokens in the agent’s runtime environment: purge
- Webhook secrets: rotate for any remaining consumers
- Certificates and mTLS credentials: revoke and remove from trust stores
- Shared credentials: if the agent used any shared secrets, rotate them for all remaining consumers
The order matters. Revoke outbound access first (the agent calling other systems), then inbound access (other systems calling the agent). This prevents the agent from taking unexpected actions during its shutdown window.
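The ordering rule can be sketched as a sort followed by a loop that stops at the first unconfirmed revocation. This is an illustration under assumptions: the `Credential` type, the `direction` labels and the `revoke` callback are hypothetical stand-ins for whatever your IAM and secrets tooling exposes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Credential:
    name: str
    direction: str  # "outbound" (agent calls others) or "inbound" (others call the agent)


def revocation_order(credentials: list[Credential]) -> list[Credential]:
    """Place every outbound credential before every inbound one, keeping input order otherwise."""
    rank = {"outbound": 0, "inbound": 1}
    return sorted(credentials, key=lambda cred: rank[cred.direction])


def revoke_all(credentials: list[Credential],
               revoke: Callable[[Credential], bool]) -> list[str]:
    """Revoke in order and stop as soon as one revocation is not confirmed.

    `revoke` is a hypothetical callback that returns True only once the
    issuing system has confirmed the credential is gone.
    """
    revoked: list[str] = []
    for cred in revocation_order(credentials):
        if not revoke(cred):
            raise RuntimeError(f"revocation not confirmed: {cred.name}")
        revoked.append(cred.name)
    return revoked
```

Raising on the first unconfirmed revocation is a deliberate choice in this sketch: a half-finished cascade should be loud rather than silently reported as complete.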
3. State preservation
Before you delete anything, decide what needs to be archived.
Some agent state has legal or regulatory value: decision logs, audit trails, customer interaction records. Other state is operational: conversation memory, embeddings, cached tool outputs. And some state is sensitive: PII processed during operation, credentials stored in memory, proprietary training data.
Separate these categories and handle each according to your data classification policy. The NCSC recommends choosing between “destruction or archiving based on asset sensitivity” and evaluating “whether you’ll need to explain or defend decisions based on the model’s outputs.”
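One way to make that separation explicit is a lookup from state type to category and from category to handling. The mapping below is an assumption made for illustration only; your own data classification policy decides the real routing, and anything the policy does not recognize should go to manual review rather than be deleted by default.

```python
from enum import Enum


class Handling(Enum):
    ARCHIVE = "archive"
    SANITIZE = "sanitize"
    DESTROY = "destroy"
    REVIEW = "manual review"


# Assumed mapping of state types to the three categories described above.
CATEGORY_OF = {
    "decision_log": "regulatory",
    "audit_trail": "regulatory",
    "customer_interaction_record": "regulatory",
    "conversation_memory": "operational",
    "embeddings": "operational",
    "cached_tool_output": "operational",
    "pii": "sensitive",
    "stored_credential": "sensitive",
    "training_data": "sensitive",
}

# Assumed policy: replace with the handling your classification policy requires.
HANDLING_FOR = {
    "regulatory": Handling.ARCHIVE,
    "operational": Handling.DESTROY,
    "sensitive": Handling.SANITIZE,
}


def route_state(state_types: list[str]) -> dict[str, Handling]:
    """Map each piece of agent state to a handling decision; unknown types go to review."""
    return {
        state_type: HANDLING_FOR.get(CATEGORY_OF.get(state_type), Handling.REVIEW)
        for state_type in state_types
    }
```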
4. Data retention compliance
This is where regulation meets operations.
EU AI Act logging requirements
Article 12 of the EU AI Act requires high-risk AI systems to have logging capabilities that “enable the recording of events relevant for identifying situations that may result in the AI system presenting a risk.” Articles 19 and 26 require providers and deployers, respectively, to retain those logs for at least six months. Article 18 requires providers to keep technical documentation for ten years after the system is placed on the market or put into service.
Source: EU AI Act, Articles 12, 18, 19 and 26
For organizations operating under the EU AI Act, decommissioning records are not optional. You need to preserve:
- The complete operational log trail through the agent’s final day of operation
- Technical documentation describing the agent’s design, capabilities and risk classification
- Records of who authorized the decommissioning and why
- Evidence that credential revocation was completed
- Confirmation of data sanitization for any state that was destroyed
The six-month minimum for operational logs is a floor, not a ceiling. If the agent made decisions that could be challenged, such as credit scoring, hiring recommendations or insurance underwriting, retain records for as long as those decisions could be disputed.
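The "floor, not a ceiling" rule reduces to taking the later of two dates. The sketch below assumes the dispute window is supplied as a number of days by whoever owns that legal judgment; the function names and that parameter are hypothetical.

```python
import calendar
from datetime import date, timedelta
from typing import Optional

LOG_FLOOR_MONTHS = 6  # the six-month minimum for operational logs


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping to the last day of shorter months."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def retain_logs_until(final_day: date,
                      dispute_window_days: Optional[int] = None) -> date:
    """Return the earliest date on which operational logs may be deleted.

    `dispute_window_days` is an assumed input: how long the agent's decisions
    could still be challenged, counted from its final day of operation.
    """
    floor = add_months(final_day, LOG_FLOOR_MONTHS)
    if dispute_window_days is None:
        return floor
    return max(floor, final_day + timedelta(days=dispute_window_days))
```

For example, an agent whose decisions can be disputed for two years keeps its logs for two years, while one with a 30-day dispute window still keeps them for the full six months.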
5. Downstream notification
Every system that consumed the agent’s output needs to know it is going away. This includes:
- Other agents in a multi-agent workflow that depended on this agent’s responses
- Human operators who used the agent’s output for decision-making
- Monitoring systems that tracked the agent’s health metrics
- Compliance teams that included the agent in their risk registers
Give downstream consumers enough lead time to implement alternatives. A hard shutdown with no warning is the fastest path to a production incident.
6. Documentation
Write the death certificate. Record:
- What the agent did (original purpose and how it evolved)
- Why it was decommissioned (obsolescence, replacement, risk, cost)
- What was preserved and where it is stored
- What was destroyed and how
- Who approved each step
- The date and time of final shutdown
This documentation serves two purposes. First, it creates the audit trail that regulators will ask for. Second, it prevents a future team from accidentally rebuilding the same agent because nobody recorded why the first one was retired.
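A structured record makes it harder to close out a retirement with a field left blank. The sketch below mirrors the list above; the class name and field names are hypothetical, and it assumes that every field must be filled in, so "nothing was destroyed" would be recorded explicitly rather than left empty.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DecommissionRecord:
    """Hypothetical record layout mirroring the documentation list above."""
    agent_id: str
    purpose: str = ""       # what the agent did, including how it evolved
    reason: str = ""        # obsolescence, replacement, risk, cost
    preserved: dict[str, str] = field(default_factory=dict)  # artifact -> storage location
    destroyed: dict[str, str] = field(default_factory=dict)  # artifact -> destruction method
    approvals: dict[str, str] = field(default_factory=dict)  # step -> approver
    shutdown_at: str = ""   # ISO 8601 timestamp of final shutdown

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty, in declaration order."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialize the record, refusing to emit an incomplete one."""
        missing = self.missing_fields()
        if missing:
            raise ValueError(f"record incomplete: {', '.join(missing)}")
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```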
7. Post-removal verification
After shutdown, verify that the agent is gone.
- Scan IAM systems for any remaining service accounts or roles
- Check API gateways for orphaned routes or keys
- Search container registries for dormant images
- Verify that no cron jobs or scheduled tasks still reference the agent
- Confirm that monitoring dashboards no longer show the agent as “healthy” (a common false positive that masks incomplete decommissioning)
- Run a targeted penetration test against the agent’s former access paths
This step catches the decommissioning attempts that leave something behind.
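The scans above can be sketched as one pass over inventory exports from each system. This example assumes that artifacts carry the agent's identifier somewhere in their name, which is a naming convention rather than a guarantee, and the inventory format and sample names are hypothetical.

```python
def find_leftovers(agent_id: str,
                   inventories: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per system, the artifact names that still mention agent_id.

    `inventories` maps a system name (IAM, API gateway, container registry,
    scheduler, monitoring) to the artifact names exported from it. Matching
    is by substring, so it only finds artifacts named after the agent.
    """
    leftovers: dict[str, list[str]] = {}
    for system, artifacts in inventories.items():
        hits = [name for name in artifacts if agent_id in name]
        if hits:
            leftovers[system] = hits
    return leftovers


# Illustrative inventory exports only.
inventories = {
    "iam": ["svc-onboarding-agent", "svc-billing"],
    "api_gateway": ["/v1/billing"],
    "container_registry": ["registry.example/onboarding-agent:1.4"],
    "scheduler": ["nightly-report"],
}
remaining = find_leftovers("onboarding-agent", inventories)
```

An empty result is necessary but not sufficient: credentials that were never named after the agent will not show up, which is why the targeted penetration test stays on the list.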
Every AI identity has a birth, life and retirement that must be governed appropriately or enterprise risk multiplies.
Employee offboarding as a decommissioning trigger
When a developer leaves your company, IT revokes their laptop access, disables their email and removes their badge. But nobody checks whether that developer created three AI agents that are still running in production.
This is the agent orphaning problem. The person who understood the agent’s purpose, dependencies and quirks is gone. The agent keeps running. Its credentials stay active. Its outputs continue flowing into downstream systems. And nobody on the remaining team knows enough to maintain it, let alone safely retire it.
Every employee exit should trigger an automated check: does this person own any registered agents? If yes, the offboarding workflow should force a decision:
- Transfer ownership to a named individual (not a team alias, not a shared account)
- Schedule decommissioning within a defined window
- Escalate to the CISO if the agent handles high-risk data or regulated decisions
HR systems and agent lifecycle management tools should be connected. If they are not, you are relying on the departing employee to remember to mention their agents during their exit interview. That is not a governance strategy. That is hope.
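The automated check could look something like the sketch below, which an HR offboarding workflow would call with the departing person's identifier. The `RegisteredAgent` type, the `high_risk` flag and the decision strings are hypothetical; a real integration would read them from your agent registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredAgent:
    agent_id: str
    owner: str               # a named individual, not a team alias
    high_risk: bool = False  # handles high-risk data or regulated decisions


def offboarding_decisions(departing: str,
                          agents: list[RegisteredAgent]) -> dict[str, list[str]]:
    """List the decisions the offboarding workflow must force for each owned agent."""
    decisions: dict[str, list[str]] = {}
    for agent in agents:
        if agent.owner != departing:
            continue
        required = ["transfer ownership or schedule decommissioning"]
        if agent.high_risk:
            required.append("escalate to CISO")
        decisions[agent.agent_id] = required
    return decisions
```

An empty result means the exit can proceed; anything else should block the offboarding ticket from closing until each decision is recorded.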
The scale of the problem
Gartner predicts 40% of enterprise applications will feature task-specific AI agents by 2026, up from less than 5% in 2025. Meanwhile, only 21% of companies have a mature governance model for these agents.
Source: Gartner, August 2025
Automating the decommissioning workflow
Manual decommissioning does not scale. If your organization runs 50 agents today and will run 500 by next year, you need a process that executes without a human remembering every step.
The automation targets are clear:
- Automatic dependency scanning: when an agent is flagged for retirement, the system maps its connections, identifies consumers and generates a dependency report (no Slack threads, no archaeology)
- Credential revocation cascades: a single decommissioning trigger revokes all associated credentials in sequence (outbound first, then inbound), confirming each revocation before proceeding
- State classification and archival: based on pre-configured data classification rules, the system routes agent state to the appropriate destination such as long-term archive, sanitization queue or destruction
- Compliance package generation: the system assembles the required documentation automatically, including operational logs, decision records, technical specifications and the decommissioning approval chain
- Post-removal verification scans: after shutdown, automated scans check IAM, API gateways, container registries and monitoring systems for orphaned artifacts
The infrastructure patterns here are familiar. Your CMDB tracks every server. Your ITAM system manages hardware retirement. Your HR system automates employee offboarding. AI agents deserve the same discipline. The difference is that agents need a purpose-built registry that understands agent-specific artifacts like embeddings, model weights and tool registrations.
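Tied together, the automation targets form a pipeline in which each stage must succeed before the next one starts. The runner below is a bare-bones sketch of that idea; the step names echo the list above, while the callables are hypothetical placeholders for real integrations.

```python
from typing import Callable

# A step is a name plus a hypothetical action that returns True on confirmed success.
Step = tuple[str, Callable[[str], bool]]


def run_decommission(agent_id: str, steps: list[Step]) -> list[tuple[str, str]]:
    """Run steps in order, stop at the first failure and return a status log."""
    results: list[tuple[str, str]] = []
    for name, action in steps:
        ok = action(agent_id)
        results.append((name, "ok" if ok else "failed"))
        if not ok:
            break  # later steps assume this one completed
    return results


# Placeholder actions; real ones would call scanners, IAM, archives and so on.
steps: list[Step] = [
    ("dependency scan", lambda agent_id: True),
    ("credential revocation", lambda agent_id: True),
    ("state archival", lambda agent_id: False),  # simulated failure
    ("compliance package", lambda agent_id: True),
    ("verification scan", lambda agent_id: True),
]
log = run_decommission("onboarding", steps)
```

Stopping at the first failure keeps the status log honest: steps that never ran do not appear in it, so nobody mistakes a partial run for a finished retirement.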
The cost of getting this wrong
The RSA Conference 2026 exposed a gap that should worry every CISO. Major vendors including CrowdStrike, Cisco, Palo Alto Networks and Microsoft shipped agent identity frameworks, but none can confirm a decommissioned agent holds zero credentials. The verification problem remains unsolved at the platform level.
That means the burden falls on your team. And the cost of failure is not theoretical:
- Security exposure. Ghost agents with live credentials are the quietest, most dangerous form of privilege persistence. They do not trigger login anomalies because they were never “logged out.”
- Compliance violations. Under the EU AI Act, failure to maintain proper decommissioning records for high-risk AI systems can result in fines of up to 3% of annual global turnover. Under GDPR, PII retained in agent memory after it should have been deleted is a data protection violation.
- Cascading operational failures. An improperly retired agent in a multi-agent system can cause silent data pipeline breaks that take days to diagnose, because the monitoring system still shows the agent as healthy.
- Financial waste. Zombie agents consume compute, storage and API call budgets. In organizations experiencing agent sprawl, the accumulated cost of forgotten agents can reach six figures annually.
Agent decommissioning checklist
Use this as a starting point. Adapt it to your regulatory environment and risk appetite.
Pre-decommissioning
- Confirm agent ownership and decommissioning authorization
- Complete dependency audit (APIs, databases, downstream agents, webhooks)
- Notify all downstream consumers with timeline
- Verify fallback procedures are in place and tested
- Classify all agent state (archive, sanitize, destroy)
Execution
- Revoke outbound credentials (API keys, OAuth tokens, service accounts)
- Revoke inbound credentials (webhook secrets, certificates)
- Rotate shared credentials used by the agent
- Archive required operational logs and decision records
- Sanitize or destroy non-retained state (memory, embeddings, cached data)
- Remove container images and scheduled tasks
- Update the agent registry to reflect decommissioned status
Post-decommissioning
- Scan IAM for orphaned service accounts
- Check API gateways for orphaned routes
- Verify monitoring systems no longer report the agent as active
- Assemble compliance documentation package
- Store decommissioning records per retention policy
- Conduct post-removal penetration test (for high-risk agents)