Hacker News | tekacs's comments

It kinda... does? The problem is that folks have been flailing on the right UX for this.

This is what build vs. plan mode _does_ in OpenCode. OpenAI has taken a different approach in Codex, where Plan mode can perform any actions (it just has an extra plan tool), but in OC in plan mode, IIRC write operations are turned off.

The screenshot shows that the experience had just flipped from Plan to Build mode, which is why the system reminder nudged it into acting!

Now... I forget, but OC may well be flipping automatically when you accept a plan, or letting the model flip it or any other kind of absurdity, but... folks are definitely trying to do the approval split in-harness, they're just failing badly at the UX so far.

And I fully believe that Plan vs. Build is a roundly mediocre UX for this.
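
For what it's worth, the "write operations are turned off in plan mode" idea can be sketched as a harness-side gate. This is only an illustration, not how OpenCode or Codex actually implement it: `Mode`, `Harness`, `WRITE_TOOLS` and the tool names are all made up for the example, and the assumption is that only an explicit user approval flips the mode.

```python
from enum import Enum


class Mode(Enum):
    PLAN = "plan"
    BUILD = "build"


# Hypothetical tool names; a real harness defines its own set.
WRITE_TOOLS = {"write_file", "edit_file", "run_shell"}


class Harness:
    """Toy harness: write tools are refused while in plan mode."""

    def __init__(self):
        self.mode = Mode.PLAN

    def approve_plan(self):
        # Assumption: only an explicit user action calls this, never the
        # model and never a follow-up prompt.
        self.mode = Mode.BUILD

    def call_tool(self, name):
        if self.mode is Mode.PLAN and name in WRITE_TOOLS:
            return f"denied: {name} is not available in plan mode"
        return f"ok: {name}"
```

With the gate in the harness rather than in a system reminder, a follow-up prompt in plan mode cannot turn into a write unless something calls `approve_plan()`.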


The switch from plan mode to build mode is not always clearly defined. On a number of occasions, I've been in plan mode and entered a follow-up prompt to modify the plan. However, instead of updating the plan, the follow-up text was taken as approval to build, and it automatically switched to building.

Ask mode, on the other hand, has always explicitly indicated that I need to switch out of ask mode to perform any actions.

This is my experience with Cursor CLI.


Does Codex actually have a Plan Mode, or is there a mode switch I'm missing? I find myself having to manually tell it to 'make a plan' every time.

And if it has directory permissions, sometimes it just skips the confirmation step and starts executing as soon as it thinks the plan is ready.


cmd-shift-p (at least in vscode)

shift-tab in cli

It actually works: I got "Plan mode (shift+tab to cycle)" in the corner.

Reading the manual, there is also a slash command: /plan switches to Plan mode.

It seems that, unlike OpenCode, Codex doesn't show a notice for mode by default.


This applies well if you’re writing code.

But often I am using Claude to investigate a problem like “why won’t this mDNS sender work”, and it needs a bunch of trial-and-error steps to find the problem, and each subsequent step is a brand new unanticipated command.


The OpenCode plan experience has been pretty bad (the community has accepted this, at least on Discord). The community has adopted a handful of plugins to make the experience better, and to add guardrails around when the agent switches modes versus when it doesn't.

I think this means cost above. As in the extra cost you pay.

As a heavy magit user prior to jj, I can attest that I've just felt much less need for it in the wake of jj. Things like jj's split being interactive and a lot of the commands having really neat short forms have meant that, as attached as I was, I still found myself benefiting so much that I switched.

It's worth scrolling down to the current implementation status part:

https://github.com/Dicklesworthstone/frankensqlite#current-i...

Although I will admit that even after reading it, I'm not exactly sure what the current implementation status is.


It's fake. It doesn't exist. It never happened. The whole thing is an LLM hallucination. You can notice that it's all half implemented if you read the code: https://github.com/Taufiqkemall2/frankensqlite/blob/main/cra...

We are going to get overwhelmed with this stuff aren't we.

The people who understand basic logic will be fine but I'm starting to think that's a very small group of people.

Honestly... ask an AI agent to update them for you.

They do an excellent job of reading documentation and searching to pick and choose and filter config that you might care about.

After decades of maintaining them myself, this was a huge breath of fresh air for me.


I kind of don't know what to think of startups that keep launching with things like this as their main facility?

Perhaps you can have a moment's attention like that, but... was it not apparent to everyone that this was going to be launched by the lab imminently?

Similarly, there are a bunch of stories about people's startups being killed by relatively trivial feature launches by OpenAI or Anthropic. And I find myself just super confused why people are chasing the super simple intermediary roles and then presenting as surprised.


My guess would be that they're building Apple internal hardware as a precursor? So that Apple can be the test customer?


I think that's reasonable, but then they should have the ability for the agent to, on the next call, override it. Even if it requires the agent to have read the file once or something.

In the absence of that you end up with what several of the harnesses ended up doing, where an agent will use a million tool calls to very slowly read a file in like 200-line chunks. I think they _might_ have fixed it now (or my own agent harness might be working around it), but Codex used to do this and it made it unbelievably slow.


You’re describing peek.

An agent needs to be able to peek before determining “Can I one shot this or does it need paging?”


Yep, I previously implemented it under that name in my own harness. That being said, there is value in actually performing a normal read, because you do often complete it on that first glance.


Confession: I too implemented a “smart” read. It's a plain read unless the file is over a certain size, in which case it's paged, or, if it's a specific format, a summary. However, I also supply `cat`.
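
Roughly, the size-based part of such a read could look like the sketch below. It is a guess at the shape, not anyone's actual harness code: `smart_read`, `cat`, the 64 KiB threshold and the 200-line page size are all assumptions for illustration, and the per-format summary branch is left out.

```python
import os

MAX_ONE_SHOT_BYTES = 64 * 1024  # assumed threshold, tune per harness
PAGE_LINES = 200                # assumed page size


def smart_read(path, page=0, max_bytes=MAX_ONE_SHOT_BYTES, page_lines=PAGE_LINES):
    """Return (text, has_more): the whole file if small, else one page of lines."""
    if os.path.getsize(path) <= max_bytes:
        with open(path, encoding="utf-8", errors="replace") as f:
            return f.read(), False

    start = page * page_lines
    end = start + page_lines
    lines = []
    has_more = False
    with open(path, encoding="utf-8", errors="replace") as f:
        for i, line in enumerate(f):
            if i < start:
                continue
            if i >= end:
                # At least one line exists beyond this page.
                has_more = True
                break
            lines.append(line)
    return "".join(lines), has_more


def cat(path):
    """Unconditional full read: the escape hatch when paging gets in the way."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.read()
```

The `has_more` flag is what lets the agent decide after the first call whether it is already done or needs to page (or just fall back to `cat`).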


It seems hard to tell what to think of a company that is simultaneously trying to poison the content that it sends to agents [1] and also doing things like this.

I understand their arguments for it - and completely disagree - so I can't help but think that anyone who is on the pro-AI side of things would do well to steer clear of them if possible.

[1]: https://arstechnica.com/ai/2025/03/cloudflare-turns-ai-again...


I mean, is it possible that they could run the full-size model on it, but doing so on the smaller amount of hardware that they have is a worse trade-off for now, and it's better to run more of the smaller model so that it can actually provide capacity to people?

