AI Agents Need Hard Boundaries

In General

AI Agents need HARD boundaries. Prompt instructions are not hard boundaries.

Here is an example:

When I want my agent to read and process my X feed, I can authenticate it with a bearer token, which gives it the same permissions I have. I can then tell the agent in the prompt that it should only read the feed and should never tweet or DM.

This is a soft boundary: the agent could ignore my instructions and perform any of those actions. And once in a while, in a complex workflow, it probably will.

On the other hand, if I authenticate via OAuth, I can configure read-only scopes. Even if the agent ignores the instructions in my prompt, it won't have the authorisation to tweet or DM. This is a hard boundary.
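As a sketch of what this looks like in practice: the X API v2 uses OAuth 2.0 with space-separated scopes, so requesting only tweet.read and users.read (and omitting tweet.write and dm.write) yields a token that structurally cannot post or DM. The client ID, redirect URI, and state below are placeholders, and a real flow would also need the PKCE code_challenge parameters, which are left out here for brevity.

```python
from urllib.parse import urlencode

# Read-only scopes for the X API v2: a token minted from this grant can
# fetch tweets and user data, but has no write or DM scope at all.
READ_ONLY_SCOPES = ["tweet.read", "users.read"]

def build_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build an OAuth 2.0 authorization URL requesting only read scopes."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(READ_ONLY_SCOPES),  # no tweet.write, no dm.write
        "state": state,
    }
    return "https://twitter.com/i/oauth2/authorize?" + urlencode(params)

url = build_authorize_url("my-client-id", "https://example.com/callback", "xyz")
```

Whatever the agent does with the resulting token, the API itself rejects any tweet or DM request, because the scope was never granted.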

The same goes for coding agents: invest in a robust CI pipeline that runs tests, lints the code, and so on. Even if the agent ignores your instructions to make the tests pass and the code lint cleanly, it still won't get past CI.
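A minimal sketch of such a CI gate, assuming pytest for tests and ruff for linting (swap in whatever tools your project actually uses): every check must exit with status zero, or the pipeline stops. The gate runs outside the agent's control, so it holds regardless of what the agent was told or chose to ignore.

```python
import subprocess
import sys

# Hypothetical check commands; each must succeed for the gate to pass.
CHECKS = [
    ["pytest", "--quiet"],   # run the test suite
    ["ruff", "check", "."],  # lint the codebase
]

def run_gate(checks: list[list[str]]) -> int:
    """Run each check in order; return the first non-zero exit code, or 0."""
    for cmd in checks:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"CI gate failed at: {' '.join(cmd)}")
            return result.returncode
    return 0

if __name__ == "__main__":
    sys.exit(run_gate(CHECKS))
```

The design point is that the gate's verdict depends only on the commands' exit codes, not on anything the agent claims about its own work.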