Nitpick/question: the "LLM" is what you get via raw API call, correct?
If you are using an LLM via a harness like claude.ai, chatgpt.com, Claude Code, Windsurf, Cursor, Excel Claude plug-in, etc... then you are not using an LLM, you are using something more, correct?
An example I keep hearing is "LLMs have no memory/understanding of time so ___" - but, agents have various levels of memory.
I keep trying to explain this in meetings, and in rando comments. If I am not way off-base here, then what should be the term, or terms, be? LLM-based agents?
> Nit pick/question: The LLM is what you get via raw API call, correct?
You always need a harness of some kind to interact with an LLM. Normal web APIs (especially for hosted commercial systems) wrapped around LLMs are non-minimal harnesses, that have built in tools, interpretation of tool calls, application of what is exposed in local toolchains as “prompt templates” to transform the context structure in the API call into a prompt (in some cases even supporting managing some of the conversation state that is used to construct the prompt on the backend.)
> If you are using an LLM via a harness like claude.ai, chatgpt.com, Claude Code, Windsurf, Cursor, Excel Claude plug-in, etc... then you are not using an LLM, you are using something more, correct?
You are essentially always using something more than an LLM (unless “you” are the person writing the whole software stack, and the only thing you are consuming is the model weights, or arguably a truly minimal harness that just takes setting and a prompt that is not transformed in any way before tokenization, and returns the result after no transformations or filtering other than mapping back from tokens to text.)
But, yes, if you are using an elaborate frontend of the type you enumerate (whether web or CLI or something else), you are probably using substantially more stuff on top of the LLM than if you are using the providers web API.
In meetings, I try to explain the roles of system prompts, agentic loops, tool calls, etc in the products I create, to the stakeholders.
However, they just look at the whole thing as "the LLM," which carries specific baggage. If we could all spread the knowledge of what is actually going on to the wider public, it would make my meetings easier, and prevent many very smart folks who are not practitioners from saying inaccurate stuff.
If we could all spread the knowledge of what is actually going on to the wider public, it would make my meetings easier, and prevent very smart folks from outside the field from saying dumb-sounding stuff.
This is an example of why LLMs won't displace engineers as severely as many think. There are very old solved processes and hyper-efficient ways of building things in the real world that still require a level of understanding many simply don't care or want to achieve.
I like to use the term "coding agents" for LLM harnesses that have the ability to directly execute code.
This is an important distinction because if they can execute the code they can test it themselves and iterate on it until it works.
The ChatGPT and Claude chatbot consumer apps do actually have this ability now so they technically class as "coding agents", but Claude Code and Codex CLI are more obvious examples as that's their key defining feature, not a hidden capability that many people haven't spotted yet.
You're not off-base at all. The way I think about it:
- LLM = the model itself (stateless, no tools, just text in/text out)
- LLM + system prompt + conversation history = chatbot (what most people interact with via ChatGPT, Claude, etc.)
- LLM + tools + memory + orchestration = agent (can take actions, persist state, use APIs)
When someone says "LLMs have no memory" they're correct about the raw model, but Claude Code or Cursor are agents - they have context, tool access, and can maintain state across interactions.
The industry seems to be settling on "agentic system" or just "agent" for that last category, and "chatbot" or "assistant" for the middle one. The confusion comes from product names (ChatGPT, Claude) blurring these boundaries - people say "LLM" when they mean the whole stack.
Percentages might not align with actual usage patterns. If anyone can come up with a better framing, please create a new poll. I just want to know response data.
> Theme park simulation game made with GPT‑5.4 from a single lightly specified prompt, using Playwright Interactive for browser playtesting and image generation for the isometric asset set.
Is "Playwright Interactive" a skill that takes screenshots in a tight loop with code changes, or is there more to it?
> "How do we protect ourselves against a competitor doing this?"
I have been thinking about this a lot lately, as someone launching a niche b2b SaaS. The unfortunate conclusion that I have come to is: have more capital than anyone for distribution.
Is there any other answer to this? I hope so, as we are not in the well-capitalized category, but we have friendly user traction. I think the only possible way to succeed is to quietly secure some big contracts.
I had been hoping to bootstrap, but how can we in this new "code is cheap" world? I know it's always been like this, but it is even worse now, isn't it?
Based on my own lived bias and HN addiction... The leading demographic has to be West Coast of the USA, then the multinational PNW as the most active HN users.
Again, I am biased, but as far as HN bias goes, that ain't bad. These are amazing places chock-full of the coolest people.
After decades of the West Coast life I now live in rest of world, and wish that rest of world could be so lucky.
If you are using an LLM via a harness like claude.ai, chatgpt.com, Claude Code, Windsurf, Cursor, Excel Claude plug-in, etc... then you are not using an LLM, you are using something more, correct?
An example I keep hearing is "LLMs have no memory/understanding of time so ___" - but, agents have various levels of memory.
I keep trying to explain this in meetings, and in rando comments. If I am not way off-base here, then what should be the term, or terms, be? LLM-based agents?
reply