It’s funny that this is probably due to bias in the training texts, right? Humans are way more likely to publish their “Eureka!” moments than their screwups… if they published the screwups too, maybe models would’ve exhibited this behavior.
Now that AI labs have all these “Nevermind” texts to train on, maybe it’s getting easier to correct? (Would require some postprocessing to classify the AI outputs as successful or not before training)
I think it's more explicit than that: it's part of post-training to enforce this kind of behavior. I don't think it's emergent but rather researchers steering the model to do it, because they saw the CoT gets slightly better if the model tries to doubt itself or cheer itself on. Don't recall if there was a paper outlining this; I've tried finding where I got it from, but searches/LLMing turn up nothing so far.
My understanding is that it’s more the result of these companies making sure to keep you engaged/happy than of the data they train with.
I don’t know if it’s true or not, but it certainly tracks, given LLMs are way more polite than the average post on the internet lol
"At some point I realized that rather than do something else until it finishes, I would constantly check on it to see if it was asking for yet another permission, which felt like it was missing the point of having an agent do stuff"
Why don't Claude Code & other AI agents offer an option to make a sound or trigger a system notification whenever they prompt for approval? I've looked into setting this up, and it seems like I'd have to wire up a script that scrapes terminal output for an approval request. Codex has had a feature request open for a while: https://github.com/openai/codex/issues/3052
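That said, Claude Code does seem to have a hooks system that could cover this: if I'm reading the docs right, a "Notification" hook fires when it's waiting on permission or input. A minimal sketch of what I'd put in ~/.claude/settings.json under that assumption (untested, so double-check the current hooks docs before relying on it):

    {
      "hooks": {
        "Notification": [
          {
            "hooks": [
              {
                "type": "command",
                "command": "osascript -e 'display notification \"Waiting for approval\" with title \"Claude Code\"'"
              }
            ]
          }
        ]
      }
    }

The "command" value is just a shell one-liner, so on Linux you could swap the osascript call for notify-send, or use printf '\a' if all you want is a terminal bell.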
When using Claude Code in Ghostty on macOS, I get notifications if it is waiting on my input (accept changes, questionnaire, run bash command). Dunno what combination (if any) of my setup is needed for this to happen, but I certainly didn't configure anything special. Maybe I'm giving CC too much free rein to do things.
> wants it moved over to the others side of the utility closet so that when you open the door it is easier to put clothes from there into the dryer
That's when you swap the hinges so the door opens the other way, and you thank the manufacturer for providing such an easy solution to a common problem. It's good to keep things flexible and user-configurable.
Now quick, someone reply with a counterexample of how user configuration complicates the product and increases cost. It's design tradeoffs all the way down...
I don't know the answer, but there's a famous lost camera owned by a team that perished on Everest before Hillary/Norgay successfully climbed it. It's possible that the camera contains proof of an earlier first ascent: https://en.wikipedia.org/wiki/George_Mallory#Reaching_the_su...
Bad weather & bad visibility, most of the self driving solutions to date just hand back control to a human driver. In other words, hand back control at the last possible second in the most difficult driving to a human who has been in an otherwise relaxed and lest alert state. Winning!
It's like if spell check only worked on words shorter than 6 letters.
This is wrong; only the shittiest self-driving-in-name-only systems have that behavior. Any real attempt at it (i.e. not Tesla or Comma) does not expect or require an immediate human takeover. For example, Cruise and Waymo don't even have humans to take over, and MB gives a 10-second warning (and the human doesn't actually even need to take over at that point; the system will slowly bring the car to a stop).
Well, as we see, Cruise's solution is to stop dead in the middle of the street until a service team comes out to resolve it. If there were a human driver, they would be handing back control.
Probably pretty fun if you happen to be a passenger when this happens.