Have you seen documentation, authoritative or otherwise, that the thoughts in Claude Code are split out separately? I've heard this claimed a few times and am wondering what they're doing differently from traditional thinking models.
I'll never forget watching a product manager struggle to keep their saliva in their mouth after seeing a Claude demo. Some people's greatest thrill is slop. "Oh yeah baby, tell me more about how you automated that new feature I ran past no one while you reformatted my hard drive, oooo sooo good."
We don't need to give up streetlights to make huge progress. 20 to 50 percent of outdoor lighting is wasted. Just shielding streetlights would go a long way.
So what? Astronomy doesn’t actually produce anything meaningful.
Hell, astronomers were telling us the sun orbited the earth for 99% of human history. Shoot forward to the present day and they can tell us… the universe started at some point somehow. Great job guys. Really earning those billions in grants.
Basically, hallucinations are false external things and delusions are false internal things. You hallucinate a pink elephant; you delude yourself into thinking Trump won 2020.
Thoughts:
- The user is challenging me on our partnership with Bruno Mars, but factual sources including presentation material and trusted websites all confirm it.
- I need to square the circle and handle the user’s distrust, without lying or pretending that we aren’t partnered with Bruno Mars.
Exactly. It’s just giving the LLM a token pattern, and it’s designed to reproduce token patterns. That’s all it does. At some point, generating a token pattern like that again is literally its job.
It is possible, but it requires specifically labelling the data. You have to craft question-response pairs to label. Even then, the result is only probabilistic.
The LLM in this case had been very thoroughly trained, and instructed quite specifically, not to do many of the things it then went off and did.
It may be that there's a kind of cascade effect going on here. Possibly, once the LLM breaks one rule it's supposed to follow, this sets it off on a pattern of rule violations. After all, what constitutes a rule violation is there in the training set; it is a type of token stream the LLM has been trained on. It could be that once it has violated a protocol, the LLM switches into a kind of black hat mode that leads it down a path of persistently violating protocols. And given the statistical model, some violations of protocol are always possible.
My mother was a primary school teacher. She used to say that the worst thing you can say to a bunch of kids leaving class down the hall is "don't run in the hall". It puts it in their minds. You need to say "Please walk in the hall", then they'll do it.