I'm hoping someone builds an agent that fixes the container situation better:
> If you're uncomfortable with full access, run pi inside a container or use a different tool if you need (faux) guardrails.
I'm sick of doing this. I also don't want faux guardrails. What I do want is an agent front-end that is trustworthy in the sense that it will not, even when instructed by the LLM inside, do anything to my local machine. So it should have tools that run in a container. And it should have really nice features like tools that can control a container and create and start containers within appropriate constraints.
In other words, the 'edit' tool is scoped to whatever I've told the front-end it can access. So is 'bash', and therefore anything bash does. This isn't a heuristic like everyone running in non-YOLO mode relies on today -- it's more like a traditional capability system. If I want to use gVisor instead of Docker, that should be a very small adaptation. Or Firecracker, or really anything else. Or even some random UART connection to an embedded device, where I want to control it with an agent but the device is neither capable of running the front-end nor of connecting to the internet (and may not even have enough RAM to store a conversation!).
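A minimal sketch of what I mean by capability-scoped tools (all names here are made up for illustration): the 'bash' tool object simply has no code path that touches the host shell; everything it can do is wrapped in `docker exec` against one named container.

```python
class ContainerScopedBash:
    """A 'bash' tool whose only capability is exec-ing inside one
    named container. There is no host-shell escape hatch to misuse:
    whatever the LLM asks for, the blast radius is the container.
    (Hypothetical sketch; the container name is an assumption.)"""

    def __init__(self, container: str):
        self.container = container

    def argv(self, command: str) -> list[str]:
        # Swapping Docker for gVisor/Firecracker/etc. only means
        # changing this one argv-construction site.
        return ["docker", "exec", self.container, "sh", "-c", command]


tool = ContainerScopedBash("agent-sandbox")
print(tool.argv("rm -rf /"))  # still only ever runs inside agent-sandbox
```

The capability-system flavor comes from construction: the front-end hands the agent a `ContainerScopedBash` bound to one container, and that object *is* the tool's entire authority.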
I think this would be both easier to use and more secure than what's around right now. Instead of making a container for a project and then dealing with installing the agent into the container, I want to run the agent front-end and then say "Please make a container based on such-and-such image and build me this app inside." Or "Please make three containers as follows".
As a side bonus, this would make designing a container sandbox sooooo much easier, since the agent front-end would not itself need to be compatible with the sandbox. So I could run a container with --network none and still access the inference API.
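To make that concrete, here's a hedged sketch (image name and mount path are placeholders) of how the front-end could launch a fully network-isolated sandbox. The point is that `--network none` is fine precisely because only the front-end, on the host, ever talks to the inference API:

```python
def sandbox_run_argv(image: str, project_dir: str) -> list[str]:
    """Build the `docker run` invocation for a network-isolated sandbox.
    `--network none` leaves the container with only a loopback
    interface; inference traffic never needs to originate inside it,
    because the agent front-end lives on the host."""
    return [
        "docker", "run", "--rm", "-d",
        "--network", "none",             # no egress, no ingress
        "-v", f"{project_dir}:/work",    # only the scoped project dir
        image, "sleep", "infinity",      # keep alive for later `docker exec`
    ]


print(sandbox_run_argv("node:22-slim", "/tmp/myproject"))
```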
Contrast with today, where I wanted to make a silly Node app. Step 1: Ask ChatGPT (the web app) to make me a Dockerfile that sets up the right tools including codex-rs and then curse at it because GPT-5.2 is really remarkably bad at this. This sucks, and the agent tool should be able to do this for me, but that would currently require a completely unacceptable degree of YOLO.
(I want an IDE that works like this too. vscode's security model is comically poor. Hmm, an IDE is kind of like an agent front-end except the tools are stronger and there's no AI involved. These things could share code.)
This is actually something I've been playing with: containers/VMs managed by a daemon, with lifecycles the agent controls -- it can open sessions on them and execute commands in them -- and policy enforced via OPA/Rego over gRPC. The cherry on top is Envoy for egress, with allowlists and credential injection.
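The shape of the egress decision is roughly this (a pure-Python stand-in for the OPA query, not actual Rego; hostnames and the header value are illustrative): the daemon asks whether a destination is on the allowlist, and only the egress proxy ever attaches the real credential, so the container never holds an API key.

```python
# Hypothetical allowlist; in the real setup this lives in Rego policy.
EGRESS_ALLOWLIST = {"api.anthropic.com", "registry.npmjs.org"}


def egress_decision(host: str) -> dict:
    """The allow/deny answer the daemon would get back from the policy
    engine, plus the headers the egress proxy injects on allowed
    connections (credential injection keeps keys out of the sandbox)."""
    allowed = host in EGRESS_ALLOWLIST
    headers = {"authorization": "Bearer <injected-by-proxy>"} if allowed else {}
    return {"allow": allowed, "inject_headers": headers}


print(egress_decision("registry.npmjs.org"))
print(egress_decision("evil.example.com"))
```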
One cool thing is that you can run a vscode server in these containers and expose its port to the outside world, then code in them and watch a project come to life.