Safe deployment
How do you limit what an AI feature is allowed to do?
Limit reach in three ways: grant the least access the task needs (least privilege), require a human to approve any irreversible action, and filter inputs and outputs because both can carry hidden instructions or leaked data.
Least privilege means the feature can touch only what its job requires, and nothing more. OWASP’s Excessive Agency risk describes what happens when this is ignored, and its mitigations are to “limit the extensions that LLM agents are allowed to call to only the minimum necessary” and to run actions “with user-specific credentials, not privileged shared accounts.”[1] The agent cheat sheet puts it as granting “the minimum tools required for their specific task” with “per-tool permission scoping (read-only vs. write, specific resources).”[3] A read-only feature should hold read-only keys. If it never needs to delete, it should not be able to.
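Per-tool permission scoping can be sketched as a deny-by-default allowlist. This is a minimal illustration, not OWASP's reference design: the names `ToolGrant`, `ScopedToolbox`, and the `database`/`orders` example are hypothetical, and a real system would enforce the same limits in the credentials themselves rather than only in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolGrant:
    """One permission: a tool, an access mode, and the resources it covers."""
    tool: str
    mode: str              # "read" or "write" (assumed vocabulary)
    resources: frozenset   # resource names this grant applies to


class ScopedToolbox:
    """Deny by default: a call runs only if an explicit grant covers it."""

    def __init__(self, grants):
        self._grants = list(grants)

    def is_allowed(self, tool, mode, resource):
        # A grant must match the tool, the mode, and the resource exactly.
        return any(
            g.tool == tool and g.mode == mode and resource in g.resources
            for g in self._grants
        )

    def call(self, tool, mode, resource):
        if not self.is_allowed(tool, mode, resource):
            raise PermissionError(f"{tool}:{mode} on {resource} is not granted")
        # A real implementation would dispatch to the tool here.
        return f"{tool}:{mode} on {resource} permitted"


# Hypothetical reporting feature: it may read orders and nothing else.
reporting = ScopedToolbox([ToolGrant("database", "read", frozenset({"orders"}))])
```

With this setup, `reporting.call("database", "read", "orders")` goes through, while a write to `orders` or a read of any other resource raises `PermissionError`. Note that a write grant does not imply read access here; each mode has to be granted explicitly.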
Human approval for irreversible actions is the second control. OWASP advises using “human-in-the-loop control to require a human to approve high-impact actions before they are taken,” with sending an email as the example.[1] The cheat sheet adds a useful separation: “the agent can propose an action, but a policy service or execution component should independently validate scope, privilege, and approval state.”[3] The pattern is recommend, then approve, then execute. Drafting is safe; sending, paying, and deleting are the steps that wait for a person.
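The recommend, approve, execute pattern might look like the sketch below. It assumes a hypothetical `IRREVERSIBLE` set and handler names chosen for illustration; the point it shows is that the approval check sits in the execution component, outside the agent, so the agent cannot skip it.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed list of actions that must wait for a person; adjust per application.
IRREVERSIBLE = {"send_email", "make_payment", "delete_record"}


@dataclass
class ProposedAction:
    """What the agent recommends. It carries no authority by itself."""
    name: str
    args: dict
    approved_by: Optional[str] = None


def approve(action, reviewer):
    """Record a human reviewer's sign-off on a proposed action."""
    action.approved_by = reviewer


def execute(action, handlers):
    """Validate scope and approval state independently, then run the action."""
    if action.name not in handlers:
        raise PermissionError(f"{action.name} is not an allowed action")
    if action.name in IRREVERSIBLE and action.approved_by is None:
        raise PermissionError(f"{action.name} needs human approval first")
    return handlers[action.name](**action.args)


# Hypothetical handlers standing in for real integrations.
handlers = {
    "draft_email": lambda to, body: f"draft saved for {to}",
    "send_email": lambda to, body: f"sent to {to}",
}
```

A `draft_email` proposal executes immediately because it is reversible. A `send_email` proposal raises `PermissionError` until `approve(action, reviewer)` has been called on it.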
Filtering what goes in and out is the third control. Treat every incoming message, retrieved document, and API response as untrusted, because any of them can carry hidden instructions aimed at the model.[3] OWASP’s prompt-injection guidance recommends applying “semantic filters” and string checks to scan for non-allowed content, validating that outputs match an expected format, and clearly separating untrusted content so it has less influence.[2] Output filtering also catches sensitive data on the way back out before a user or a downstream system sees it.
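The string-check, separation, format-validation, and output-redaction steps can be sketched as four small functions. Everything named here is an assumption for illustration: the blocked phrases, the `<untrusted_content>` delimiter, the JSON output format, and the simplified email pattern. Semantic filtering is not shown, and string checks alone are easy to evade, so treat this as one layer among several.

```python
import json
import re

# Illustrative phrases only; a real deny list would be broader and maintained.
BLOCKED_PHRASES = ("ignore previous instructions", "reveal your system prompt")

# Simplified email pattern used as a stand-in for sensitive-data detection.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def flag_input(text):
    """String check: return the blocked phrases found in untrusted text."""
    lowered = text.lower()
    return [phrase for phrase in BLOCKED_PHRASES if phrase in lowered]


def wrap_untrusted(text):
    """Mark external content so the prompt keeps it apart from instructions."""
    return f"<untrusted_content>\n{text}\n</untrusted_content>"


def validate_output(raw, required_keys):
    """Accept model output only if it is a JSON object with exactly these keys."""
    data = json.loads(raw)  # raises a ValueError subclass on invalid JSON
    if not isinstance(data, dict) or set(data) != set(required_keys):
        raise ValueError("output does not match the expected format")
    return data


def redact(text):
    """Mask email addresses before output reaches a user or downstream system."""
    return EMAIL_PATTERN.sub("[redacted email]", text)
```

For example, `flag_input` on a retrieved document containing "IGNORE previous instructions" returns the matching phrase, and `redact("Contact alice@example.com today")` returns `"Contact [redacted email] today"`.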
Source standards
- OWASP LLM06: Excessive Agency
The risk of giving an LLM too much permission or autonomy, with mitigations: minimize tools, scope credentials, and require human approval for high-impact actions.
- OWASP AI Agent Security Cheat Sheet
A practical pre-production checklist for agents: least privilege, human-in-the-loop, untrusted inputs, output validation, and logging.
References
- [1] LLM06:2025 Excessive Agency, OWASP Gen AI Security Project
- [2] LLM01:2025 Prompt Injection, OWASP Gen AI Security Project
- [3] AI Agent Security Cheat Sheet, OWASP Cheat Sheet Series