Don't anthropomorphize the language model. If you stick your hand in there, it'll chop it off. It doesn't care about your feelings. It can't care about your feelings.
> Do not fall into the trap of anthropomorphizing Larry Ellison. You need to think of Larry Ellison the way you think of a lawnmower. You don’t anthropomorphize your lawnmower, the lawnmower just mows the lawn - you stick your hand in there and it’ll chop it off, the end. You don’t think "oh, the lawnmower hates me" – lawnmower doesn’t give a shit about you, lawnmower can’t hate you. Don’t anthropomorphize the lawnmower. Don’t fall into that trap about Oracle.
A more direct source (possibly the original source?) I know of is a YouTube video entitled "LISA11 - Fork Yeah! The Rise and Development of illumos" which detailed how the Solaris operating system got freed from Oracle after the Sun acquisition.
The whole hour talk is worth a watch, even when passively doing other stuff. It is a neat history of Solaris and its toolchain mixed with the inter-organizational politics.
It's also important to realize that AI agents have no time preference. They could be reincarnated by alien archeologists a billion years from now and it would be the same as if a millisecond had passed. You, on the other hand, have to make payroll next week, and time is of the essence.
Well there were a bunch of articles about resuming a parked session relating to degradation of capabilities and high token usage.
Ironic. Another example of attempting to treat the LLM as an AI.
They don't have time preference because they don't have intent or reasoning. They can't be "reincarnated" because they're not sentient, they're a series of weights for probable next tokens.
No. They don't have time preference like us, because (wall clock) time doesn't exist for them. An LLM only "exists" when it is actively processing a prompt or generating tokens. After it is done, it stops existing as an "entity".
A real world second doesn't mean anything to the LLM from its own perspective. A second is only relevant to them as it pertains to us.
Time for LLMs is measured in tokens. That's what ticks their clock forward.
I suppose you could make time relevant for an LLM by making the LLM run in a loop that constantly polls for information. Or maybe you can keep feeding it input so much that it's constantly running and has to start filtering some of it out to function.
That would still be time as it pertains to us. Even if I put time stamps into the chat all the LLM knows that it's some amount of time later - it can't actually do anything in the time between two prompts.
Can we maybe make it "don't anthropoCENTRIZE the LLMs" .
The inverse of anthropomorphism isn't any more sane, you see. By analogy: just because a drone is not an airplane, doesn't mean it can't fly!
Instead, just look at what the thing is doing.
LLMs absolutely have some form of intent (their current task) and some form of reasoning (what else is step-by-step doing?) . Call it simulated intent and simulated reasoning if you must.
Meanwhile they also have the property where if they have the ability to destroy all your data, they absolutely will find a way. (Or: "the probability of catastrophic action approaches certainty if the capability exists" but people can get tired of talking like that).
> LLMs absolutely have intent (their current task)
That's like saying a 2000cc 4-Cylinder Engine "has the intent to move backward". Even with a very generous definition of "intent", the component is not the system, and we're operating in context where the distinction matters. The LLM's intent is to supply "good" appended text.
If it had that kind of intent, we wouldn't be able to make it jump the rails so easily with prompt injection.
> and reasoning (what else is step-by-step doing?) .
Oh, that's easy: "Reasoning" models are just tweaking the document style so that characters engage in film noir-style internal monologues, latent text that is not usually acted-out towards the real human user.
Each iteration leaves more co-generated clues for the next iteration to pick up, reducing weird jumps and bolstering the illusion that the ephemeral character has a consistent "mind."
> That's like saying a 2000cc 4-Cylinder Engine "has the intent to move backward". Even with a very generous definition of "intent", the component is not the system, and we're operating in context where the distinction matters. The LLM's intent is to supply "good" appended text.
Fair, but typically you use a 2000cc engine in a car. Without the gearbox, drive train, wheels, chassis, etc attached, the engine sits there and makes noise. When used in practice, it does in fact make the car go forward and backward.
Strictly the model itself doesn't have intent, ofc. But in practice you add a context, memory system, some form of prompting requiring "make a plan", and especially <Skills> . In practice there's definitely -well- a very strong directionality to the whole thing.
> and bolstering the illusion that the ephemeral character has a consistent "mind."
And here I thought it allowed a next token predictor to cycle back to the beginning of the process, so that now you can use tokens that were previously "in the future". Compare eg. multi pass assemblers which use the same trick.
> LLMs absolutely have some form of intent (their current task)
They have momentum, not intent. They don’t think, build a plan internally, and then start creating tokens to achieve the plan. Echoing tokens is all there is. It’s like an avalanche or a pachinko machine, not an animal.
> some form of reasoning (what else is step-by-step doing?)
I think they reflect the reasoning that is baked into language, but go no deeper. “I am a <noun>” is much more likely than “I am a <gibberish>”. I think reasoning is more involved than this advanced game of mad libs.
Apologies, I tend to use web chats and agent harnesses a lot more than raw LLMs.
Strictly for raw models, most now do train on chain-of-thought, but the planning step may need to be prompted in the harness or your own prompt. Since the model is autoregressive, once it generates a thing that looks like a plan it will then proceed to follow said plan, since now the best predicted next tokens are tokens that adhere to it.
Or, in plain English, it's fairly easy to have an AI with something that is the practical functional equivalent of intent, and many real world applications now do.
You realize the generation of the "Chain-of-thought" is also autoregressive, right?
It's not a real reasoning step, it's a sequence of steps, carried out in English (not in the same "internal space" as human thought - every time the model outputs a token the entire internal state vector and all the possibilities it represents is reduced down to a concrete token output) that looks like reasoning. But it is still, as you say, autoregressive.
And thus - in plain English - it is determined entirely by the prompt and the random initial seed. I don't know what that is but I know it's not intent.
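To make the "prompt plus seed" point concrete, here is a minimal sketch of autoregressive sampling. The transition table `TOY_MODEL` and the function `generate` are invented for illustration and stand in for a real model; real inference stacks can add other sources of nondeterminism that this toy ignores.

```python
import random

# Hypothetical toy "model": maps the most recent token to weighted candidates.
# A real LLM conditions on the whole context; this table is only an illustration.
TOY_MODEL = {
    "<start>": [("make", 0.6), ("delete", 0.4)],
    "make": [("a", 1.0)],
    "a": [("plan", 0.7), ("mess", 0.3)],
    "delete": [("the", 1.0)],
    "the": [("database", 0.5), ("cache", 0.5)],
}


def generate(prompt, seed, max_tokens=4):
    """Sample tokens one at a time, feeding each output back in as input."""
    rng = random.Random(seed)  # all randomness comes from this seeded generator
    tokens = list(prompt)
    for _ in range(max_tokens):
        candidates = TOY_MODEL.get(tokens[-1])
        if candidates is None:  # no known continuation: stop generating
            break
        words, weights = zip(*candidates)
        tokens.append(rng.choices(words, weights=weights, k=1)[0])
    return tokens


first = generate(["<start>"], seed=42)
second = generate(["<start>"], seed=42)
print(first == second)  # True: same prompt and same seed give the same tokens
```

Changing the seed or the prompt is the only way to change the output here, which is the sense in which the comment above calls the process "determined".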
So I already rewrote and deleted this more times than I can count, and the daystar is coming up. I realize I got caught up in the weeds, and my core argument was left wanting. Sorry about that. Regrouping then ...
Anthropomorphism and Anthropodenial are two different forms of Anthropocentrism.
But the really interesting story to me is when you look at the LLM in its own right, to see what it's actually doing.
I'm not disputing the autoregressive framing. I fully admit I started it myself!
But once we're there, what I really wanted to say (just like Turing and Dijkstra did), is that the really interesting question isn't "is it really thinking?" , but what this kind of process is doing, is it useful, what can I do or play with it, and -relevant to this particular story- what can go (catastrophically) wrong.
I don't know if they have intent. I know it's fairly straightforward to build a harness to cause a sequence of outputs that can often satisfy a user's intent, but that's pretty different. The bones of that were doable with GPT-3.5 over three years ago, even: just ask the model to produce text that includes plans or suggests additional steps, vs just asking for direct answers. And you can train a model to more-directly generate output that effectively "simulates" that harness, but it's likewise hard for me to call that intent.
I think it’s helpful to try to use words that more precisely describe how the LLM works. For instance, “intent” ascribes a will to the process. Instead I’d say an LLM has an “orientation”, in that through prompting you point it in a particular direction in which it’s most likely to continue.
If you claim something might "very well" be the case, you need some better proof. Otherwise we might also "very well" be living in the matrix.
That is a silly point. We very clearly are not "a series of weights for probable next tokens", as we can reason based on prior data points. LLMs cannot.
Unless you're using some mystical conception of "reason", nothing about being able to "reason based on prior data points" translates to "we very clearly are not a series of weights for probable next tokens".
And in fact LLMs can very well "reason based on prior data points". That's what a chat session is. It's just that this is transient for cost reasons.
People always say this kind of thing. Human minds are not Turing machines or able to be simulated by Turing machines. When you go about your day doing your tasks, do you require terajoules of energy? I believe it is pretty clear human thinking is not at all like a computer as we know them.
>People always say this kind of thing. Human minds are not Turing machines or able to be simulated by Turing machines
That's just a claim. Why so? Who said that's the case?
>When you go about your day doing your tasks, do you require terajoules of energy?
That's the definition of irrelevant. ENIAC needed 150 kW to do about 5,000 additions per second. A modern high-end GPU uses about 450 W to do around 80 trillion floating-point operations per second. That’s roughly 16 billion times the operation rate at about 1/333 the power, or around 5 trillion times better energy efficiency per operation.
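For anyone who wants to check that arithmetic, here is a quick sketch using only the figures quoted above. The variable names are mine, and like the comment it treats an ENIAC addition and a GPU floating-point operation as comparable "operations", which is a rough simplification.

```python
# Figures as quoted in the comment above (approximate, for illustration only).
eniac_power_w = 150_000   # 150 kW
eniac_ops_per_s = 5_000   # about 5,000 additions per second
gpu_power_w = 450         # 450 W
gpu_ops_per_s = 80e12     # about 80 trillion floating-point operations per second

rate_ratio = gpu_ops_per_s / eniac_ops_per_s   # how many times faster
power_ratio = eniac_power_w / gpu_power_w      # how many times less power
efficiency_gain = rate_ratio * power_ratio     # operations per joule, relative

print(f"operation rate: {rate_ratio:.2e}x")        # about 1.6e10, i.e. 16 billion
print(f"power: 1/{power_ratio:.0f}")               # about 1/333
print(f"energy efficiency: {efficiency_gain:.2e}x")  # about 5.3e12, i.e. ~5 trillion
```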
Given that such an increase is possible, one can expect a future computer to run calculations at the level of our mental tasks, with similar or better efficiency than us.
Furthermore, "Turing machine" is an abstraction. Modern CPUs/GPUs aren't Turing machines either, in a pragmatic sense; they have a totally different architecture. And our brains have yet another architecture (more efficient at the kind of calculations they need).
What's important is computational expressiveness, and nothing you wrote proves that the brain's architecture can't be modelled algorithmically and run in an equally efficient machine.
Even equally efficient is a red herring. If it's only 1/10000 as efficient, would it matter for whether the brain can be modelled or not? No, it would just speak to the effectiveness of our architecture.
We are much more than weights which output probable next tokens.
You are a fool if you think otherwise. Are we conscious beings? Who knows, but we’re more than a neural network outputting tokens.
Firstly, and most obviously, we aren’t LLMs, for Pete’s sake.
There are parts of our brains which are understood (kinda) and there are parts which aren’t. Some parts are neural networks, yes. Are all? I don’t know, but the training humans get is coupled with the pain and embarrassment of mistakes, the ability to learn while training (since we never stop training, really), and our own desires to reach our own goals for our own reasons.
I’m not spiritual in any way, and I view all living beings as biological machines, so don’t assume that I am coming from some “higher purpose” point of view.
>We are much more than weights which output probable next tokens.
>You are a fool if you think otherwise. Are we conscious beings? Who knows, but we’re more than a neural network outputting tokens.
That's just stating a claim though. Why is that so?
Mine refers to the established "brain as prediction machine" theory, plus all we know about the brain's operation (neurons, connections, firings, etc.).
>There are parts of our brains which are understood (kinda) and there are parts which aren’t. Some parts are neural networks, yes. Are all?
What parts aren't? Can those parts still be algorithmically described and modelled as some information exchange/processing?
>but the training humans get is coupled with the pain and embarrassment of mistakes
Those are versions of negative feedback. We can do similar things to neural networks (including human preference feedback, penalties, and low scores).
>the ability to learn while training (since we never stop training, really)
I already covered that: "The main difference is the training part and that it's always-on."
We do have NNs that are continuously training and updating weights (even in production).
For big LLMs it's impractical because of the cost, otherwise totally doable. In fact, a chat session kind of does that too, but it's transient.
They're not artificial intelligence neural networks.
They're biological neural networks. Brains are made of neurons (which Do The Thing... mysteriously, somehow. Papers are inconclusive!) , Glia Cells (which support the neurons), and also several other tissues for (obvious?) things like blood vessels, which you need to power the whole thing, and other such management hardware.
Bioneurons are a bit more powerful than what artificial intelligence folks call 'neurons' these days. They have built in computation and learning capabilities. For some of them, you need hundreds of AI neurons to simulate their function even partially. And there's still bits people don't quite get about them.
But weights and prediction? That's the next emergence level up, we're not talking about hardware there. That said, the biological mechanisms aren't fully elucidated, so I bet there's still some surprises there.
We very obviously are not just a series of weights for probable next tokens. Like seriously, you can even ask an LLM and it will tell you our brains work differently to it, and that’s not even including the possibility that we have a soul or any other spiritual substrate.
>We very obviously are not just a series of weights for probable next tokens.
How exactly? Except via handwaving? I refer to the "brain as prediction machine theory" which is the dominant one atm.
>you can even ask an LLM and it will tell you our brains work differently to it
It will just tell me platitudes based on weights of the millions of books and articles and such on its training. Kind of like what a human would tell me.
>and that’s not even including the possibility that we have a soul or any other spiritual substrate.
That's good, because I wasn't including it either.
"brain as prediction machine theory" is dominant among whom, exactly? Is it for the same reason that the "watchmaker analogy" was 'dominant' when clockwork was the most advanced technology commonly available?
It's really just a matter of degrees. There are 1 million, 1 billion, 1 trillion parameter LLMs... and you keep scaling those parameters and you eventually get to humans. But it's still probable next tokens (decisions) based on previous tokens (experience).
> It's really just a matter of degrees. There are 1 million, 1 billion, 1 trillion parameter LLMs... and you keep scaling those parameters and you eventually get to humans.
It isn’t, because humans and current LLMs have radically different architectures:
LLMs: training and inference are two separate processes; weights are modifiable during training, static/fixed/read-only at runtime
Humans: training and inference are integrated and run together; weights are dynamic, continuously updated in response to new experiences
You can scale current LLM architectures as far as you want, it will never compete with humans because it architecturally lacks their dynamism
Actually scaling to humans is going to require fundamentally new architectures, which some people are working on, but it isn’t clear if any of them have succeeded yet
> LLMs: training and inference are two separate processes
True, but we have RAG to offset that.
> it architecturally lacks their dynamism
We'll get there eventually. Keep in mind that the brain is now about 300k years into fine-tuning itself as this species classified as homo sapiens. LLMs haven't even been around for 5 years yet.
In practice that doesn’t always work… I’ve seen cases where (a) the answer is in the RAG but the model can’t find it because it didn’t use the right search terms (embeddings and vector search reduce the incidence of that but cannot eliminate it); (b) the model decided not to use the search tool because it thought the answer was so obvious that tool use was unnecessary; (c) the model doubts, rejects, or forgets the tool call results because they contradict the weights; (d) contradictions between data in the weights and data in the RAG produce contradictory or ineloquent output; (e) the data in the RAG is overly diffuse and the tool fails to surface enough of it to produce the kind of synthesis you’d get if the same info was in the weights.
This is especially the case when the facts have changed radically since the model was trained, e.g. “who is the Supreme Leader of Iran?”
> We'll get there eventually. Keep in mind that the brain is now about 300k years into fine-tuning itself as this species classified as homo sapiens. LLMs haven't even been around for 5 years yet.
We probably will eventually, but I doubt we’ll get there purely by scaling existing approaches. More likely, novel ideas nobody has even thought of yet will prove essential, and a human-level AI model will have radical architectural differences from the current generation.
LOL. Oook.. No, I don't think so. The human experience and the mechanisms behind it have a lot of unknowns, and I'm pretty sure that trying to confine the human experience to some number of parameters is short-sighted.
Still many unknowns, but we do know some key fundamentals, such as that the brain is "just" tens of billions of neurons organized in various ways that keep firing (going from high to low electric potential) at different rates. Pretty similar to how the fundamental operation of today's digital computers is the manipulation of 0s and 1s.
They’re both neural networks, but the architectures built using those neural connections, and the way they are trained and operate are completely different. There are many different artificial neural network architectures. They’re not all LLMs.
AlphaZero isn’t a LLM. There are Feed Forward networks, recurrent networks, convolutional networks, transformer networks, generative adversarial networks.
Brains have many different regions each with different architectures. None of them work like LLMs. Not even our language centres are structured or trained anything like LLMs.
I'd argue that regardless of the architecture, the more sophisticated brain is still a (massive) language model. If you really think about it, language is the construct that allows brains to go beyond raw instinct and actually create concepts that're useful for "intelligently" planning for the future. The real difference is that brains are trained with raw sensory data (nerve impulses) while today's LLMs are trained with human-generated data (text, images, etc).
It's not at all a language model in the way that LLMs are. At this point we might as well just say that both process information, that's about the level of similarity they have except for the implementation detail of neurons.
Language came after conceptual modeling of the world around us. We're surrounded by social species with theory of mind and even the ability to recognise themselves and communicate with each other, but none of them have language. Even the communications faculties they have operate in completely different parts of their brains than ours with completely different structure. Actually we still have those parts of the brain too.
Conceptual representation and modeling came first, then language came along to communicate those concepts. LLMs are the other way around, linguistic tokens come first and they just stream out more of them.
This is why Noam Chomsky was adamant that what LLMs are actually doing in terms of architecture and function has nothing to do with language. At first I thought he must be wrong, he mustn't know how these things work, but the more I dug into it the more I realised he was right. He did know, and he was analysing this as a linguist with a deep understanding of the cognitive processes of language.
To say that brains are language models you have to ditch completely what the term language model actually means in AI research.
That's a different statement, yes brains and LLMs are both neural networks.
An LLM is a specific neural architectural structure and training process. Brains are also neural networks, but they are otherwise nothing at all like LLMs and don't function the ways LLMs do architecturally other than being neural networks.
Plus, brain structure and physiology change throughout the interwoven processes of learning, aging, acting, emoting, recalling, what have you. It's not an "architecture" that we can technologically recreate, as so much of it emerges from a vastly higher level of complexity and dynamism.
Our brains work differently, yes. What evidence do you have that our brains are not functionally equivalent to a series of weights being used to predict the next token?
I'm not claiming that to be the case, merely pointing out that you don't appear to have a reasonable claim to the contrary.
> not even including the possibility that we have a soul or any other spiritual substrate.
If we're going to veer off into mysticism then the LLM discussion is also going to get a lot weirder. Perhaps we ought to stick to a materialist scientific approach?
You are setting the bar in a way that makes “functional equivalence” unfalsifiable.
If by “functionally equivalent” you mean “can produce similar linguistic outputs in some domains,” then sure we’re already there in some narrow cases. But that’s a very thin slice of what brains do, and thus not functionally equivalent at all.
There are a few non-mystical, testable differences that matter:
- Online learning vs. frozen inference: brains update continuously from tiny amounts of data, LLMs do not
- Grounding: human cognition is tied to perception, action, and feedback from the world. LLMs operate over symbol sequences divorced from direct experience.
- Memory: humans have persistent, multi-scale memory (episodic, procedural, etc.) that integrates over a lifetime. LLM “memory” is either weights (static) or context (ephemeral).
- Agency: brains are part of systems that generate their own goals and act on the world. LLMs optimize a fixed objective (next-token prediction) and don’t have endogenous drives.
I did not claim the ability of current LLMs to be on par with that of humans (equivalently human brains). I objected that you have not presented evidence refuting the claim that the core functionality of human brains can be accomplished by predicting the next token (or something substantially similar to that). None of the things you listed support a claim on the matter in either direction.
I don't follow. If you provide criteria I can most likely provide evidence, unless your criteria is "vaguely cylindrical and vaguely squishy" in which case I obviously won't be able to.
The person I replied to made a definite claim (that we are "very obviously not ...") for which no evidence has been presented and which I posit humanity is currently unable to definitively answer in one direction or the other.
When two things are obviously radically different (a squishy mass of trillions of interconnected carbon based blobs fed by some sort of continuous oxygen based chemical reaction, and a series of distributed transistors on silicon wafers) then the burden of proof shifts to the other guy to provide the clear and convincing evidence that they should be considered functionally the same thing.
But I made no such claim. I was explicit that my position is "humanity is currently unable to definitively answer in one direction or the other".
Two things being physically different does not exclude their also having functional similarities. The argument presented amounts to A and B have large physical differences, A does X, therefore B does not do X. That doesn't follow.
Right. This line [0] from TFA tells me that the author needs to thoroughly recalibrate their mental model about "Agents" and the statistical nature of the underlying models.
[0] "This is the agent on the record, in writing."
Actually I think the opposite advice is true. Do anthropomorphize the language model, because it can do anything a human -- say an eager intern or a disgruntled employee -- could do. That will help you put the appropriate safeguards in place.
Agreed, but the point is, if your system is resilient against an eager intern who has not had the necessary guidance, or an actively hostile disgruntled employee, that inherently restricts the harm an LLM can do.
I'm not making the case that LLMs learn like people. I'm making the case that if your system is hardened against things people can do (which it should be, beyond a certain scale) it is also similarly hardened against LLMs.
The big difference is that LLMs are probably a LOT more capable than either of those at overcoming barriers. Probably a good reason to harden systems even more.
The difference makes the necessary barriers different.
There's benefit to letting a human make and learn from (minor) mistakes. There is no such benefit accrued from the LLM because it is structurally unable to.
There's the potential of malice, not just mistakes, from the human. If you carefully control the LLM's context there is no such potential for the LLM because it restarts from the same non-malicious state every context window.
There's the potential of information leakage through the human, because they retain their memories when they go home at night, and when they quit and go to another job. You can carefully control the outputs of the LLM so there is simply no mechanism for information to leak.
If a human is convinced to betray the company, you can punish the human, for whatever that's worth (I think quite a lot in some people's opinion, not sure I agree). There is simply no way to punish an LLM - it isn't even clear what punishing it would mean. The weights file? The GPU that ran the weights file?
And on the "controls" front (but unrelated to the above note about memory) LLMs are fundamentally only able to manipulate whatever computers you hook them up to, while people are agents in a physical world and able to go physically do all sorts of things without your assistance. The nature of the necessary controls ends up being fundamentally different.
A lot of 'agentic harnesses' actually do have limited memory functions these days. In the simplest form, the LLM can write to a file like memory.md or claude.md or agent.md , and this gets tacked on to their system prompt going forwards. This does help a bit at least.
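As a rough illustration of that simplest form, here is a sketch of a harness that appends a memory file to the system prompt. The function names `remember` and `build_system_prompt` and the section heading are hypothetical; actual harnesses differ in how and where they inject these notes.

```python
from pathlib import Path


def remember(memory_path: Path, note: str) -> None:
    """Append one note to the memory file, creating the file if needed."""
    with memory_path.open("a", encoding="utf-8") as handle:
        handle.write(f"- {note}\n")


def build_system_prompt(base_prompt: str, memory_path: Path) -> str:
    """Return the base prompt with any saved notes tacked on at the end."""
    if not memory_path.exists():
        return base_prompt
    notes = memory_path.read_text(encoding="utf-8").strip()
    if not notes:
        return base_prompt
    # The heading below is an arbitrary choice for this sketch.
    return f"{base_prompt}\n\n# Notes from earlier sessions\n{notes}"
```

A harness would call `remember(Path("memory.md"), "...")` when the model asks to save something, and `build_system_prompt(...)` at the start of each new session. Nothing here forces the model to act on the notes; they are just more text in the context.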
Rather more sophisticated Retrieval Augmented Generation (RAG) systems exist.
At the moment it's a very mixed bag, with some frameworks and harnesses giving very minimal memory, while others use hybrid vector/full text lookups, diverse data structures and more. It's like the cambrian explosion atm.
Thing is, this is probabilistic, and the influence of these memories weakens as your context length grows. If you don't manage context properly, (and sometimes even when you think you do), the LLM can blow past in-context restraints, since they are not 100% binding. That's why you still need mechanical safeguards (eg. scoped credentials, isolated environments) underneath.
and you'll blow the context over time and send the LLM to the sanatorium. It doesn't fit like the human brain can.
If a junior fucks production that will have extraordinary weight because they appreciate the severity and the social shame, and they will have nightmares about it. If you write some negative prompt to "not destroy production" then you also need to define some sort of non-existing watertight memory weighting system and specify it in great detail. Otherwise the LLM will treat that command only as important as the last negative prompt you typed in or ignore it when it conflicts with a more recent command.
> and you'll blow the context over time and send the LLM to the sanatorium. It doesn't fit like the human brain can.
The LLM did have this capability at training time, but weights are frozen at inference time. This is a big weakness in current transformer architectures.
Yup, and the agent will happily ignore any and all markdown files, and will say "oops, it was in the memory, will not do it again", and will do it again.
Humans actually learn. And if they don't, they are fired.
To me it sounds like a tooling problem. OP seems to be trying to use probabilistic text systems as if they enforce rules, but rule enforcement should really live outside the model. My sense is that there was a failure to verify the agent's intent.
The tooling that invokes the model should really define some kind of guardrails. I feel like there's an analogy to be had here with the difference between an untyped program and a typed program. The typed program has external guardrails that get checked by an external system (the compiler's type checker).
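To sketch what "rule enforcement outside the model" could look like: the harness parses whatever the model proposes into a typed value and checks it against a policy before anything runs. Every name below (`ProposedAction`, the tool names, the environment names, `is_permitted`) is hypothetical, and a real setup would pair this with scoped credentials rather than rely on it alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    """What the model asked for, after the harness parsed its output."""
    tool: str
    environment: str


# Hypothetical policy tables, owned by the harness and never by the model.
READ_ONLY_TOOLS = {"read_file", "run_tests"}
DESTRUCTIVE_TOOLS = {"drop_table", "delete_volume"}
PROTECTED_ENVIRONMENTS = {"production"}


def is_permitted(action: ProposedAction, human_approved: bool = False) -> bool:
    """Decide mechanically whether an action may run, ignoring model prose."""
    if action.tool in READ_ONLY_TOOLS:
        return True
    if action.tool in DESTRUCTIVE_TOOLS:
        if action.environment in PROTECTED_ENVIRONMENTS:
            # Destructive work on protected systems needs a human sign-off.
            return human_approved
        return True
    # Unknown tools are denied by default.
    return False


print(is_permitted(ProposedAction("drop_table", "production")))  # False
```

The design choice is that the check is deterministic code: no amount of persuasive text in the context changes what `is_permitted` returns.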
What tooling? It's a probabilistic text generator that runs in a black box on the provider's server. What tooling will have which guardrails to make sure that these scattered markdown files are properly injected and used in the text generation?
That's the million dollar question. Maybe have systems of agents that all validate each other's work? Maybe something needs to be done at the harness level? I don't suppose that we could realistically expect 100% accuracy, but if we take 100% to be the upper limit, we could build systems that get us closer to that ideal.
No no, that’s not what I’m saying. The fact that the data is stored in files is incidental. It could be in a database, in a knowledge graph, derived from some other data. Regardless of where it is, something should know to include it in the context, but only when it’s relevant.
So for instance you could start by trying to classify the prompt in some way. If you use an LLM for this, you might need to get it to return a machine-parsable data format. Then your harness can pattern match on the classification and use it to enrich the prompt with additional context. The challenge would be in determining how exactly you want to go about this, balancing tradeoffs such as accuracy, cost, time, etc.
For the classification step you might begin with something like "Determine whether the following prompt is a QUESTION or a STATEMENT. Respond using only one of the two words. Prompt: $PROMPT"
You could have multiple back-and-forths like this and at each round you gain more information about the prompt, and you can use that information to determine further classifications and/or context to include.
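A minimal sketch of one such round, assuming the model is reachable through some callable `ask_model(text) -> str`. That callable, the `CONTEXT_BY_LABEL` table, and the `fake_model` stand-in are all hypothetical; only the classification prompt comes from the comment above.

```python
from typing import Callable

CLASSIFY_TEMPLATE = (
    "Determine whether the following prompt is a QUESTION or a STATEMENT. "
    "Respond using only one of the two words. Prompt: {prompt}"
)

# Hypothetical extra context the harness attaches per classification.
CONTEXT_BY_LABEL = {
    "QUESTION": "Answer from the project notes where possible.",
    "STATEMENT": "",
}


def classify(prompt: str, ask_model: Callable[[str], str]) -> str:
    """Ask the model for a label and accept only the labels we know."""
    reply = ask_model(CLASSIFY_TEMPLATE.format(prompt=prompt)).strip().upper()
    return reply if reply in CONTEXT_BY_LABEL else "UNKNOWN"


def enrich(prompt: str, ask_model: Callable[[str], str]) -> str:
    """Prepend label-specific context; leave the prompt alone otherwise."""
    extra = CONTEXT_BY_LABEL.get(classify(prompt, ask_model), "")
    return f"{extra}\n\n{prompt}" if extra else prompt


def fake_model(text: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    user_prompt = text.split("Prompt: ", 1)[1]
    return "question" if user_prompt.rstrip().endswith("?") else "statement"


print(classify("Is it safe to run this migration?", fake_model))  # QUESTION
```

Because the model's reply is free text, `classify` falls back to `UNKNOWN` for anything outside the expected labels, so the harness never pattern matches on an answer it did not anticipate.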
> Regardless of where it is, something should know to include it in the context,
Magic. You're talking about magic. You keep re-iterating the same faith that "There's some magic way to make probabilistic text generator running in the cloud to never miss local files", where "files" is "files, knowledge graphs, databases etc.".
It doesn't matter how data is stored. You can't know when to include something relevant in the context because the whole thing including context is running in the cloud. You are not in the driver's seat. Literally anything you include locally in the prompt can and will be ignored.
I think you are more right than people are giving you credit for. I would love to see the full transcript to understand the emotional load of the conversation. Using instructions like "NEVER FUCKING GUESS!" probably increases the likelihood of the agent making a "mistake" that is destructive but defensible.
"Emotional" response is muted through fine-tuning, but it is still there and continued abuse or "unfair" interaction can unbalance an agent's responses dramatically.
An eager intern cannot be working for hundreds of millions of customers at the same time. An LLM can.
A disgruntled employee will face consequences for their actions. No one at Anthropic, OpenAI, xAI, Google or Meta will be fired because their model deleted a production database from your company.
It is merely a simulacrum of an intern or disgruntled employee or human. It might say things those people would say, and even do things they might do, but it has none of the same motivations. In fact, it does not have any motivation to call its own.
That's fair, largely because an LLM is a lot more capable at overcoming restrictions, by hook or by crook as TFA shows. However, most systems today are not even resilient against what humans can do, so starting there would go a long way towards limiting what harms LLMs can do.
It cannot go to the washroom and cry while pooping. And that's just one of the things that any human can do and AI cannot. So no, it cannot do anything a human can do, the shared example being one of them.
And that's why we don't have AI washrooms: they are not alive, they are not employees, and they have no need to excrete.
So it's only wrong in a technical and pedantic sense. A better phrasing might have been along the lines of "There are many sequences of tokens that will destroy your production database that are within the set of possible outputs".
"Everything that can go wrong, will go wrong" isn't literally true either: some failure modes are mutually exclusive, so at most one of them will go wrong. I think that the punchy phrasing and the mental model are both more useful from the standpoint of someone creating/managing agents, and that it is true in the sense that any other mental model or rule of thumb is true. It's literally true among spherical cows in a frictionless vacuum and directionally correct in the real world with its nuances. And most importantly, adopting the mental model leads to better outcomes.
But it may be a bad mental model in other contexts, like debugging models. As an extreme example, models that collapse during training become strictly deterministic, e.g. a language model that always predicts the most common token and never takes its context into account.
It's an objection to adding a new dependency, not an attempt to remove an existing one. If we can't stop adding new dependencies, we are certain to be stuck with the status quo forever.
It's almost as if comments on a website called "hacker news" are written by individuals with differing and varying opinions, and not by some nebulous hive-mind that purports to be internally consistent.
The problem is that the well you are drinking from has in fact been poisoned. Maybe you think you can tolerate it but some projects are taking a policy decision that any exposure is too dangerous and that is IMO perfectly reasonable.
The fact that the AI agent will just go and attempt to do whatever insane shit I can dream up is both the most fun thing about playing with it, and also terrifying enough to make me review its output carefully before it goes anywhere near production.
(Hot take: If you're not using --dangerously-skip-permissions, you don't have enough confidence in your sandbox and you probably shouldn't be using a coding agent in that environment)
That Terraform blast radius is exactly the problem I'm building Daedalab around: agents need hard approvals, scoped permissions, and an audit trail before prod is even reachable. If you're curious: www.daedalab.app
Hot take indeed. Unfortunately it's too blunt an instrument. I can't control "you may search for XYZ about my codebase but not W because W is IP-sensitive". So, to retain Web Search / Web Fetch for when it's useful, all such tool uses must be reviewed to ensure nothing sensitive goes outside the trust boundary.
Yes, I'm aware this implies differing levels of trust for data passing through Claude versus through public search. It's okay for everyone to have different policies on this depending on specific context, use-case and trust policies.
From the article, it sounds like that engineer did a lot of other reckless things even before handing the tasks over to the AI agent to continue the recklessness with even more abandon.
This is a case study in "if you don't know what you're doing, the answer is not just to hand it over to some AI bot to do it for you."
The answer is to hire a professional. That is if you care about your data, or even just your reputation.
> before handing the tasks over to the AI agent to continue the recklessness with even more abandon
Which is a funny outcome of this because apparently the AI agent (Claude) tried to talk him out of doing some of the crazy stuff he wanted to do! Not only did he make bad decisions before invoking the AI, he even ignored and overruled the agent when it was flagging problems with the approach.
> Unless your strategy is to create a photo-lab-like screen in pure black and red, or wear deep-red-tinted glasses, it’s unlikely that a pure colorshift strategy will cut out that big of a chunk of the spectrum.
The writer is dismissing this out of hand but to me this sounds like a great idea.
> And I'm guessing that the reason macOS doesn't give more details is because macOS is likely not involved in the step that fails
And I guess because of the wide variety of third-party hardware macOS has to support, it's not practical to write a pre-flight check into the update process either.