They don’t want to officially disclose it because, while some users will understand the realities of protecting a product while innovating, many will simply realize it means they can go looking for Claude 4.5 performance elsewhere.
Regret? Of what? The tech is here. You won't slow it down by not using it. People need to either adapt by moving to more and more niche areas, or become the person to be retained when the efficiency gains materialize. We still don't have the proper methodology figured out, but people are working on it.
That said, I'd agree that people who currently claim 20x speedups will indeed be replaced.
It's simple FOMO on the companies' part. If they don't "invest" in it, they'll be left behind. Which is true. However, how they invest is equally (if not more) important. E.g., MS is a good example of how not to do it.
It's failing when there is no data in the training set, and there are no patterns to replicate in the existing code base.
I can give you many, many examples of where it failed for me:
1. Efficient implementation of Union-Find: complete garbage result
2. Spark pipelines: mostly garbage
3. Fuzzer for testing something: half success; the non-replicable ("creative") part was garbage.
4. Confidential Computing (niche): complete garbage if starting from scratch, good at extracting existing abstractions and replicating existing code.
Where it succeeds:
1. SQL queries
2. Following more precise descriptions of what to do
3. Replicating existing code patterns
The pattern is very clear. Novel problems, problems that require deeper domain knowledge, coming up with the to-be-replicated patterns in the first place, and problems with little training data don't work. Everything else does.
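For context on the Union-Find item: what I mean by an "efficient implementation" is roughly the textbook version with path compression and union by size, which gives near-constant amortized operations. A minimal Python sketch (my own, not what the LLM produced):

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by size."""

    def __init__(self, n: int):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x: int) -> int:
        # Walk up to the root, then compress the path behind us.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a: int, b: int) -> bool:
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union by size: attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True
```

It's maybe twenty lines of well-known code, which is exactly why a garbage result was so surprising.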
I believe the reason for the big split in reception is that senior engineers work on problems without existing solutions, and LLMs are terrible at those. What they are missing is that the software and the methodology must be modified to make the LLM work. There are methodical ways to do this, but this shift in the industry is still in its infancy, and we don't yet have a shared understanding of what that methodology is.
Personally I have very strong opinions on how this should be done. But I'm urging everyone to start thinking about it, perhaps even going as far as quitting if this isn't something people can pursue at their current job. The carnage is coming:/
Nope. Especially with these agents, the thinking trace can get very large. No human will ever read it, and an agent digging through it for information will just fill up its context with garbage.
I understand the drive for stabilizing control and consistency, but this ain't the way.
There's a fun hypothesis I've read about somewhere, goes something like this:
As the universe expands the gap between galaxies widens until they start "disappearing" as no information can travel anymore between them.
Therefore, if we assume that intelligent lifeforms exist out there, it is likely that these will slowly converge to the place in the universe with the highest mass density for survival. IIRC we even know approximately where this is.
This means a sort of "grand meeting of alien advanced cultures" before the heat death. Which in turn also means that previously uncollided UUIDs may start to collide.
Those damned Vogons thrashing all our stats with their gazillion documents. Why do they have a UUID for each xml tag??
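For the curious, the joke is quantifiable via the birthday bound. A quick sketch, assuming random version-4 UUIDs (122 random bits) and using the standard approximation p ≈ 1 − exp(−n(n−1)/2N):

```python
import math

def collision_probability(n: int) -> float:
    """Birthday-bound approximation: probability of at least one
    collision among n uniformly random 122-bit values (the random
    bits of a version-4 UUID)."""
    space = 2 ** 122
    return 1.0 - math.exp(-n * (n - 1) / (2 * space))

# Even a quintillion documents give only ~9% odds of a single collision.
p = collision_probability(10 ** 18)
```

So the Vogons would need a genuinely galactic document count before the stats get thrashed.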
It is counterintuitive, but information can still travel between places so distant that the expansion between them is faster than the speed of light. It's just extremely slow (so I still vote for going to the party at the highest-density place).
We do see light from galaxies that are receding from us faster than c. At first the photons headed our way are being carried away from us, but as the universe expands, at some point they find themselves in a region of space that is no longer receding faster than c, and they start approaching.
That's not exactly it. Light gets redshifted instead of slowing down, because light will be measured to be the same speed in all frames of reference. So even though we can't actually observe it yet, light traveling towards us still moves at c.
It's a different story entirely for matter. Causal and reachable are two different things.
Regardless, such extreme redshifting would make communication virtually impossible - but maybe the folks at Blargon 5 have that figured out.
I think I missed something: how do galaxies getting further away (divergence) imply that intelligent species will converge anywhere? It isn’t like one galaxy getting out of range of another on the other side of the universe is going to affect things in a meaningful way…
A galaxy has enough resources to be self-reliant, there’s no need for a species to escape one that is getting too far away from another one.
Yes that's the idea. The expansion simply means that the window of migration will close. Once it's closed, your galaxy is cut off and will run out of fuel sooner than the high-density area.
Assuming these are advanced enough aliens, they'll also be bringing with them all the mass they can, to accentuate the effect? I'm imagining things like Niven's ringworld star propulsion.
> Maybe when AIs are able to say: "I don't know how this works" or "This doesn't work like that at all." they will be more helpful.
Funny you say that, I encountered this in a seemingly simple task. Opus inserted something along the lines of "// TODO: someone with flatbuffers reflection expertise should write this". This was actually better than I anticipated, even though the task was specifically about fbs reflection: because it admitted defeat, I didn't waste more time and could immediately start rewriting from scratch.
I have the same experience and still use it. It's just that I learned to use it for simplistic work. I sometimes try to give it more complex tasks but it keeps failing. I don't think it's bad to keep trying, especially as people are reporting insane productivity gains.
After all, it's through failure that we learn the limitations of a technology. Apparently some people encounter that limit more often than others.
> I have the same experience and still use it. It's just that I learned to use it for simplistic work.
OP said "I don't see the productivity boost from AI" and that they don't "believe the hype" without any qualification, but then went on to say that they use it every day. This makes no sense to me.
Isn't this like saying "I don't get anything out of reading books" immediately followed by "I read books for 4 hours every night"?
to be fair, every other comment is usually screaming about how if you aren't able to utilize LLMs effectively, you will be without a job soon. most people want to keep their job, or be employable, so if LLMs are a required tool to know, they're trying to become fluent in it by using it.
> to be fair, every other comment is usually screaming about how if you aren't able to utilize LLMs effectively, you will be without a job soon
I think a lot of these "it's all overhyped crap" posts are hypocritical.
If someone wants to be consistent with their "it's crap" argument, they wouldn't be using it for anything. Period.
If someone says they need it for their job, then they are admitting that it's useful for their job. Because it would otherwise be irrational to use a tool that makes them worse at their job.
Perhaps the out-of-a-job prediction is actually reversed. True, LLMs will become an efficiency-increasing tool. But in terms of job security, doesn't that mean that if your whole job can be driven by an LLM, demand for that job decreases?
In other words, people claiming these high productivity increases may be the ones at actual risk. Why employ 3 people when 1 can write the prompts?