Hacker News | mgambati's comments

GitButler is still a better option than any worktree-like variant.

GPT 5.5 and 5.4 are such great models. I just tried Opus 4.8 and it took 30 minutes before I was confronted with a bit of laziness that drives me crazy. 5.5 just doesn’t have this issue.

How do you compare them to 5.3 Codex? I have been using 5.3 Codex for a while, and I subjectively think it does a better job than Opus 4.6/4.7 at a fraction of the cost. I did give 5.5 a try and it seems a bit better, but it is orders of magnitude more expensive.

5.3 is good but talks like a robot; it’s too hard to understand what exactly it’s talking about. When using Droid, I use it as a worker model and it does a great job.

All 5.x models suffer from weirdness in the way they write, but 5.5 and 5.4 are much better and now offer a good balance: direct, but without being like Claude.


Today they demoed 8i running at 1300 to 1600ish tokens per second. I imagine that is caused by having a single rack serving the model just for the demo.


There's a limit to how much you can "scale" this process; it's linear. But if we do napkin math based on vLLM, parallel batched streams only lose around 50% performance compared to single-stream output, so that doesn't explain the ridiculously fast numbers here.

I wish Google just came out and told us how large their Flash model is, because if it's as big as or smaller than gpt-5.4-nano, that's the real headline here.


Word on the street is that xAI will buy Cursor.


Yeah, for 10-60 BILLION, which again makes this even stupider.

For this amount of money you can rebuild Cursor and everything else on the market, and with the remaining 9-59 billion you just hire coding experts and let them write real, high-quality code examples.

And then you just use your existing Grok pipeline and add this functionality.

This xAI stuff has to be run by idiots


Buy "Cursor", not "Cursor's IP". This means brand, users, and a shitton of data.

And if you combine a shitton of data with a lot of compute, large userbase and good engineers, you have a pretty good chance of doing something interesting.


Yeah, do you know how much 10-60 billion is?

You could literally just give your compute away for free for a year to pull people in.

Make an API endpoint free, with the caveat that they are allowed to use the data for training, which everyone else does too.


And you still don’t get the quality of data that Cursor has, which is the best because it was collected before vibe coding.


With giving out tokens for free you would

The models already advertise, because they were trained on massive datasets that refer to big brands.

Ask for suggestions for a new pair of shoes. What brand do you think it will suggest: Nike, Adidas, or some random small one?


I expected the same outcome you're describing here, but in my experience this hasn't been the case. I've been researching new acoustic guitars to purchase, and I've been getting an equal number of suggestions from the major brands and the small brands.

Part of it, though, is that I'm giving lots of context (e.g. guitar player for 10+ years, huge Opeth fan, looking for something with as close to an Ibanez-style neck as possible under $1000).


I think the guitar market is kind of an exception, because it is pretty normal for guitar players to search for "guitar like Fender but cheaper". There are tons of Reddit/forum discussions about this, and those small brands are actually very well known in the community, because the majority of guitar players play cheap instruments. YouTuber Phillip McKnight often talked about how cheap guitars move in ridiculous volumes compared to more expensive ones like Gibson or Fender.


I think if you ask something generic like “shoes”, this could be true.

When I’ve worked with Claude on finding brands for fashion (e.g. here’s a small watchmaker I like, what are similar options?) it does research and picks great options. Some are big, others are small producers.


I kind of feel the same. I’m learning things and doing things in areas that I would otherwise just skip due to lack of time or fear.

But I’m so much more detached from the code; I don’t feel that ‘deep neural connection’ that comes from actually spending days locked in a refactor or debugging a really complex issue.

I don’t know how I feel about it.


I strongly agree on the refactor, but for debugging I have another perspective: I think debugging is changing for the better, so it looks different.

Sure, you don't know the code by heart, but people debugging code translated to assembly already do that.

The big difference is being able to unleash scripts that invalidate an enormous number of hypotheses very fast and that can analyze the data.

I used to do that by hand and it took hours, so it was a last-resort approach. Now that's very cheap, so validating many hypotheses is way cheaper!

I feel like my "debugging ability" in terms of value delivered has gone way up. As for skill, it's changing. I cannot tell, but the value I am delivering in debugging sessions has gone way up.


As someone who switched from mobile to web dev and has been doing it professionally for the last 6 months: if you care about code quality, you'll develop that neural connection after some time.

But if you don't and there's no PR process (side projects), the motivation to form that connection is quite low.


> If you care about code quality, you'll develop that neural connection after some time.

No, because you can get LLMs to produce high quality code that has gone through an infinite number of refinement/polish cycles and is far more exhaustive than the code you would have written yourself.

Once you hit that point, you find yourself in a directional/steering position divorced from the code since no matter what direction you take, you'll get high quality code.


Only if you never find opportunities to simplify the code it's writing and you don't review the code at all.

> no matter what direction you take, you'll get high quality code

This is not the case today. You get medium-quality, sometimes over-engineered code 10x faster.


Funny: if you open their website and go to April 2026, you literally see this: 26b revenue (Anthropic beat 30b) + pro human hacking (mythos?).

I don’t think of them as predictions, but they have made great calls so far.


I agree that they called many things remarkably well! That doesn't change the fact that AI 2027 is not a thing which happened, so it isn't valid to point out "this killed us in AI 2027." There are many reasons to want to preserve CoT monitorability. Instead of AI 2027, I'd point to https://arxiv.org/html/2507.11473.


It’s funny how you train a machine to mimic human behavior, then the marketing team decides to promote it as “Look! It’s human! Look how it’s thinking about existence!” while a huge percentage of human-produced content is exactly about the uncertainty of human existence, and that got used to train the model.


I see us collectively forgetting the training process as time goes on, and I think that explains why people get so surprised by some pretty obvious outcomes of said training. Perhaps also why people keep anthropomorphising these outcomes.


Is Composer 2 any good? Can it be compared to Opus or GPT 5.4?


No it's not very good. But when you run out of Claude tokens it's perfectly fine for small stuff.

Cursor's inline autocomplete is very good though, much better than anything I could reproduce in Zed with various 3rd party "edit" LLMs (although checking google, they announced a new model since I tried it https://zed.dev/blog/zeta2)


I ran parallel prompts with Composer 2 and GPT 5.3 Codex. Composer did slightly better in terms of variable naming and extra tweaks to loosely related files to keep the codebase consistent.


They can try to add as much friction as they want. A simple rename of files and directories like .claude is enough to move out of CC.

It’s not like moving from android to iOS.


You'd be surprised how effective small bits of friction are.

