It's not reliable. The AI can just not prompt you to approve, or hide things, et... | Hacker News

HN2new | past | comments | ask | show | jobs | submit

		0xbadcafebee 5 days ago \| parent \| context \| favorite \| on: What I learned building an opinionated and minimal... It's not reliable. The AI can just not prompt you to approve, or hide things, etc. AI models are crafty little fuckers and they like to lie to you and find secret ways to do things with alterior motives. This isn't even a prompt injection thing, it's an emergent property of the model. So you must use an environment where everything can blow up and it's fine.

chr15m 5 days ago [–]

The harness runs the tool call for the LLM. It is trivial to not run the tool call without approval, and many existing tools do this.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact