Joeri's comments

That is not a kind judgment. While I hesitate at the word “dictatorship” it is fair to say that society puts more value on extrovert interaction than introvert contemplation, and it does this because extroverts dominate the social conversation.

I already switched to Claude a while ago. Didn't bring along any context, just switched subscriptions, walked away from ChatGPT and haven't touched it again. Turned out to be a non-event; there really is no moat.

I switched not because I thought Claude was better at doing the things I want. I switched because I have come to believe OpenAI are a bad actor and I do not want to support them in any way. I’m pretty sure they would allow AGI to be used for truly evil purposes, and the events of this week have only convinced me further.


Yesterday was my first time trying it. One thing that felt a bit strange to me was that I asked it something and the response was just one paragraph. Which isn't bad or anything, but it felt... strange? Like I always need to preface a ChatGPT/Gemini/whatever question with "Briefly, what is..." or it gives me enough fluff to fill a 5-page high school essay. But I didn't need to do that and just got an answer that was to the point and without loads of shit that's barely related.

And the weirdest thing that I noticed: instead of skimming the response to try finding what was relevant, I just straight up read it. Kind of felt like I got a slight amount of focus ability back.

Accuracy is something I can't really compare yet (all chatbots feel generally the same for non-pro level queries), but so far, I'm fairly satisfied.


I use Gemini all the time, but I have to say it's got verbal diarrhea and an EXTREMELY annoying trait of wanting to lead the conversation rather than just responding to what YOU want to do. At the end of every response Gemini will always suggest a "next step", in effect trying to second-guess where you want the conversation to go. I'd much rather have an AI that just did what it was asked, and let me decide what to ask next (often nothing - maybe it was just a standalone question!).

Apparently this annoying "next step" behavior is driven by the system prompt, since the other day I was running Gemini 3 Thinking, and it was displaying its thoughts, which included a reminder to itself to check that it was maintaining a consistent persona and to make sure that it had suggested a next step. I'd love to know the thought process of whoever at Google thought this would make for a natural or useful conversation flow! Could you imagine trying to have a conversation with a human who insisted on doing this?!


Yes. That is a salesperson. The next step is there to drive engagement. I know you're not interested in $THiNG now, but can I follow up in 3 months?

Persona-wise, I think Claude is the engineer, Gemini is the salesperson, and GPT is the eager and loud journeyman.


I don’t know, I’ve found the follow ups nice sometimes. You can just ignore them if they’re not useful. The computer won’t get mad…

I find they are extremely rarely useful - they just break my own chain of thought by having to constantly read and ignore this stuff. The same goes for the excessively verbose responses too - a human has a limited "context window" and a shorter response is therefore much more useful/valuable than a long token-maxxed one.

Sure the computer won't get mad, and that is all I do - just ignore what Gemini is suggesting and pretend it never said it - but it certainly makes me mad. The main reason I stick with Gemini is because of the generous free usage limits, but I know this annoying "next step" behavior (which was a relatively recent change) is going to push me back to Claude, even if I need to pay for it.


Yes. Gemini is mimicking its creators: this is exactly how I experienced speaking to most Googlers.

One issue is that Claude’s web search abilities are more limited, for example it can’t search Reddit and Stack Overflow for relevant content.

Why not just write a skill and script that calls crawl4ai or similar and do this using Claude Code?

You can store the page as markdown for future sessions, mash the data with other context, you name it.

The web version of Claude is incredibly limited both in capability and workflow integration. Doesn't matter if you're dealing with bids from arbor contractors or researching solutions for a DB problem.
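
For illustration, a helper script like that could have roughly this shape. This sketch uses only the standard library; a real version would call crawl4ai instead of this bare-bones text extraction, and all names and paths here are made up:

```python
import re
import urllib.request
from html.parser import HTMLParser
from pathlib import Path

class TextExtractor(HTMLParser):
    """Collects visible text from a page, skipping script/style blocks."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def slug_for(url: str) -> str:
    """Turn a URL into a safe filename for the cached page."""
    return re.sub(r"[^a-z0-9]+", "-", url.lower()).strip("-") + ".md"

def save_page(url: str, out_dir: str = "pages") -> Path:
    """Fetch a page and store its text so future sessions can reuse it."""
    html = urllib.request.urlopen(url, timeout=30).read().decode("utf-8", "replace")
    parser = TextExtractor()
    parser.feed(html)
    path = Path(out_dir) / slug_for(url)
    path.parent.mkdir(exist_ok=True)
    path.write_text("\n\n".join(parser.parts))
    return path
```

A skill would then just shell out to something like `save_page("https://docs.example.org/faq")` and point Claude at the saved file in later sessions.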


Want this w/o killing the free open web.

Maybe I run an old PC adjacent to the scraper to manually visit the scraped pages without an adblocker, & buy something I need from an ad periodically (while a cohesive response is being generated in the meantime)

Yeah, sounds dumb - wishing for a middle ground that lets us be effective but also good netizens. Maybe that Cloudflare plan to charge the bots…



That's so frustrating with Claude. If I need to widely search the web or if I need it to read a specific URL I pasted, I always turn to ChatGPT. Claude seems to hit a lot more roadblocks while trying to navigate the web.

The issue is Reddit though. They're the ones blocking. They're very aggressive.

When sites are working in one chatbot and not another, there's a good chance that the latter is respecting the website rules. As an example with Reddit, you're probably blocked when using a VPN like Mullvad.


They're playing too nice. It's time to roll out the residential proxies.

It's not that hard to roll your own web search MCP.

I made one for Crush a while ago.

https://anduil.neocities.org/blog/?page=mcp

I'm not sure about the issues with Reddit, though. Do they block Claude's web fetch tool? I think Codex runs it through some kind of cache proxy.


Rolling your own is not the solution for the common case where you’re asking an LLM a question that may or may not be supported or supplemented by a web search. ChatGPT decides by itself when and how to consult the web, and then links the relevant sources in its result. You don’t get that functionality from Claude chat, you’d have to completely build your own chat harness and apps.

Sites like Reddit are blocking AI providers; the providers have to have some contract with them for access. OpenAI does seem to have that.


That’s a feature not a bug.

Heh, a while ago I wondered why ChatGPT had started to reply tersely, almost laconically. Then I remembered that I had explicitly told it to be brief by default in the custom personality settings… I also noticed that there are now various sliders to control things like how many emojis or bulleted lists ChatGPT should use, which I thought was amusing. Anyway, these tools can be customized to adopt just about any style; there's no need to always prefix questions with "Briefly" or similar.

Here's my prompt to make ChatGPT sound more like Claude.

It works but not as well as I'd like -- the tone and word choice still ends up being really jarring to me (even after years of using ChatGPT). Maybe that's promptable too. Open to suggestions.

---

Respond in a natural conversational style. In terms of language, match my own tone and style.

Keep responses to half a page or so max. (Use context and your judgment - e.g. the initial response can be a page, and then specific follow-up questions can be shorter, if the question is answered clearly.)

Prefer minimal formatting. Don't use headings, lists etc. Bold and italics OK but keep it tasteful.

If you're starting a paragraph like so

Item name: description..

then it makes sense to bold item name for readability purposes.


Hah. I remember some story about a chatbot that had been trained on slack conversations. You would ask it for an essay on whatever, and it would say "will do" or "I'll have it for you tomorrow." :)

Yeah, I've always been a little confused why people use ChatGPT so heavily. It's better than it used to be (maybe thanks to custom configuration), but it still tends to respond like it's writing a Wikipedia article.

Wikipedia articles on demand are great, but not usually what I want.


Yep, the experience is quite something. Another thing I've noticed, and you likely soon will too, is that Claude only attempts a follow-up if one is needed or the prompt is structured for it. Meanwhile ChatGPT always prompts you with a choice of next steps. It can be nice, as sometimes the options contain improvements you never thought of and would like, but in lengthy conversations with a detailed plan it does things really piecemeal, as though trained to maximize engagement instead of getting to a final solution.

I find that Claude almost always ends its response with some sort of follow up question, despite my system prompt telling it not to.

I never really used ChatGPT much though so maybe Claude is just relatively less egregious?


> Which isn't bad or anything but it felt... strange?

On the contrary, it's great. It's fully capable of outputting a wall of text when required, so instead of feeling like I'm talking to something that has a minimum word count requirement, I get an appropriately sized response to the task at hand.


That tracks for me; longtime Claude and Claude Code Pro subscriber (not all of it has been good - but that's neither here nor there).

Over the last few iterations of Sonnet and Opus, Anthropic has definitely trained me to ask it to explain something "in detail" (or even "in great detail") when I want as much nuance as possible.

It used to be the inverse - way too much detail when I didn't want it.


In my limited experience, that's mostly since the 4.6 release. I noticed that with the same prompt, it answers much more briefly. A bit jarring indeed, but I prefer it. Less BS and filler, and less electricity burned for nothing.

This behavior first appeared in 4.5, mostly for specific types of questions and in "natural conversation" workflows. 4.6 might have pushed it further.

It’s probably an offshoot of making Claude more and more suitable for code/cowork.

> there really is no moat.

For ChatGPT and Gemini, yes.

But for Claude, they have a very deep & big one: It's the only model that gets production-ready output on the first detailed prompt. Yesterday I used up my tokens by noon, so I tried some output from Gemini & Co. I presented a working piece of code which is already in production:

1. Without noticing, it changed things like "Touple.First.Date.Created" and "Touple.Second.Date.Created" to "Touple.FirstDate" and "Touple.SecondDate", breaking the code.

2. There was a const list of 12 definitions for a given context; when told to rewrite the function, it just cut 6 of these 12 definitions, so the code no longer compiled. I asked why they were cut: "Sorry, I was just too lazy typing"?? LOL

3. There is a list holding some items, "_allGlobalItems"; it simply changed the name in the function to "_items", and the code didn't compile.

As said, a working version of a similar function was given upfront.

With Claude, I never have such issues.


I have used Claude (incl. Opus 4.6) fairly extensively, and Claude still spits out quality that is far below what I would call production-ready - littered with smaller issues and prone to the occasional larger blunder. Particularly when doing anything non-trivial, and even when guiding it in detail (although that admittedly reduces the number of larger structural issues).

Maybe it is tech-stack dependent (I have mostly used it with C#/.NET), but I have heard people say the same for C#. The only conclusion I have been able to draw from this is that people have very different definitions of production ready, but I would really like to see some concrete evidence where Claude one-shots a larger/complex C# feature or the like (with or without detailed guidance).


> C#/.NET

same here :)

> one-shots a larger/complex C# feature

I can show you a timeseries data renderer which was created with 1 initial very large prompt and then 3 follow-up "change this and that" prompts. The file is around 5000 lines and everything works fine & exactly as specified.


> The file is around 5000 lines

Yep, this is another case of different standards for "production ready."


Caught, good one! :-))

++1


Feel free to share it, would be very curious - ideally alongside the prompts.

Do you have an email address?

You can use this: hnthrowaway.outboard407@passmail.net

Sent

I see this over and over again. I don't dispute your experience. My experience with ESP32 development has been unreasonably positive. My codebase is sitting around 600k LoC and is the product of several hundred Opus 4.x Plan -> Agent -> Debug loops. I review everything that goes through, but I'm reviewing the business logic and domain gotchas, not dumb crap like what you and so many others describe.

What is so strange to me is that surely there is more C# out there than ESP-IDF code? I don't have a good explanation beyond saying that my codebase is extensively tested and used; I would know very quickly if it suddenly started shitting the bed in the way you explain.


600k lines of code for anything on the ESP32 sounds like the absolute polar opposite of “good”

Tell us you've never built anything significant without telling us?

The more code is out there, the worse the average quality in the training dataset. There will be legacy approaches and APIs, poor design choices, popular use cases irrelevant to your context, etc. that increase the chances of output not matching your expectations. In the Java world this is exactly how it works. I need 3-5 iterations with Claude to get things done the way I expect, sometimes jumping straight to manual refactoring and then returning the result to Claude for review and learning. My CLAUDE.md files (multiple of them) are growing big with all the patterns and anti-patterns identified this way. To overcome this problem the model needs specialized training, which I don't think the industry knows how to approach (it has to beat the effort put into the education system for humans).
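
For a sense of what such a file might contain, entries could look something like this - the rules below are invented examples of common Java/Spring guidance, not quotes from the commenter's actual CLAUDE.md:

```markdown
## Patterns
- Use constructor injection; never field injection with @Autowired.
- New endpoints return DTOs, never JPA entities.

## Anti-patterns (seen in past output, do not repeat)
- Do not create a new ObjectMapper per request; use the shared Spring-managed bean.
- Do not swallow InterruptedException; restore the interrupt flag.
```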

I also believe this must be true. Try asking Claude to program in Forth; I find the results to be unreasonably good. That's probably because most of the available Forth to train on is high quality.

> To overcome this problem model needs specialized training, that I don‘t think the industry knows how to approach

We already have coding-tuned models, e.g. Codex. We should just have language/technology-specific models with a focus on recent/modern usage.

The problem with something like Java is that it's too old -- too many variants. Make a cutoff at, say, at least above Java 8 or 17.


> We should just have language / technology specific models with a focus on recent / modern usage.

The “just” part is a big assumption. It is far from easy, given that modern best practices are always underspecified. An effective coding model needs reasoning signals that are much stronger than its memorized coding patterns, and that, I suspect, requires a very different architecture.


> My experience with ESP32 development has been unreasonably positive. My codebase is sitting around 600k LoC and is the product of several hundred Opus 4.x Plan -> Agent -> Debug loops.

I feel like this is an example of people having different standards of what “good” code is and hence the differing opinions of how good these tools are. I’m not an embedded developer but 600K LOC seems like a lot in that context, doesn’t it? Again I could be way off base here but that sounds like there must be a lot of spaghetti and copy-paste all over the codebase for it to end up that large.


I don't think it's that large. Keep in mind embedded projects take few if any dependencies. The standard library in most languages is far bigger than 600k loc.

I work with ESP32 devices and 600k lines of code is insane.

I'm curious: What does this device do?

It's wild to come back to this after a day away and have the takeaway from my attempt to answer the question be punditry about the size of my codebase from people who don't have any idea what my device does.

Answering this question directly puts me in an awkward spot because I realized last fall that there was absolutely no way that I could talk about what I'm working on in a way that can be associated with my product because there's so much anti-AI activism right now. That sucks, because I'd like to be "loud and proud" but I have a family to feed. I strongly suspect that versions of my story are playing out for hundreds of entrepreneurs right now.

Here's what I can describe: it's an ESP32-P4 based consumer device with about 45 ESP-IDF components that all communicate over an event bus. There's a substantially modified LVGL front-end with a 3D rendering engine and SVG-like 2D animation in front of a driver for a customized variation of the ST7789. There is substantial custom code for both USB host and client functions across various modes of operation. There's custom drivers for several sensors and haptic feedback. There's a very elaborate menu UI system which is also backed by a BBS style terminal configuration system for power users. There's an assignable action system with about 40 actions that all have their own state machines and a lot of mutex locking. There's a very involved and feature-dense trigger scheduling system. There's a very flexible data stream routing matrix. There's a full suite of command line scripts for most functions. There's a self-hosted web app for configuration that also implements a screen share functionality via an HTML canvas object so that I can record videos of what's happening on the device with OBS without having to point a DSLR at it from a gantry.

Honestly, I could go on and on, but all of the people who think that 600kloc is a lot [sight unseen] are following YouTube tutorials and can eat me.

I responded to you because you asked politely. I hope it was an interesting reply.


Interesting - what kind of structural issues have you encountered?

Are these more related to the existing source code, or are these bad patterns that you would never use regardless of the existing code?


I don't get it though. Why do you expect perfect responses? Humans continually make mistakes, and AI is trained on human data. Yet there seems to be this higher bar of expectation for the latter. Somehow people expect this thing that's been around for a few weeks/months, and cannot learn anything more beyond its training cutoff date, to always do a better job than a human who's been around for 20+ years and is able to learn on their own until death.

I don't expect that - I am merely responding to the parent comment's claim that Claude consistently one-shots production-ready code (which does not at all match my observations).

> Its the only model that gets production ready output on the first detailled prompt. Yesterday I used my tokens til noon, so I tried some output from Gemini & Co. I presented a working piece of code which is already in production:

One does often hear that LLMs shine at greenfield code generation but all start to struggle working with pre-existing code. It could be that this wasn't a like-for-like comparison.

That said I do personally feel Claude to produce far better results than competitors.


> One does often hear that where LLMs shine is with greenfield code generation but they all start to struggle working with pre-existing code. It could be that this wasn't a like for like comparison.

In my experience working in a large codebase with a good set of standards, that's not the case; I can supply examples already existing in the codebase for Claude to use as guidance, and it generates quite decent code.

I think it's because there's already a lot of decent code for it to slurp and derive from, good quality tests at the functional level (so regressions are caught quickly).

I do understand, though, that on codebases with a hodge-podge of styles, varying quality of tests, etc., it probably doesn't work as well as in my experience. But I'm quite impressed by how I can do the thinking, add relevant sections of the code to the context (including protocols, APIs, etc.), describe what I need done, and get back a plan that most times is correct or very close to it, which I can then iterate over to fix gaps/mistakes it made, and get it implemented.

Of course, there are still tasks it fails and I don't like doing multiple iterations to correct course, for those I do them manually with the odd usage here and there to refactor bits and pieces.

Overall I believe if your codebase was already healthy you can have LLMs work quite well with pre-existing code.


> One does often hear that where LLMs shine is with greenfield code generation but they all start to struggle working with pre-existing code.

Don't we all?


Whether we do or not is beside the point. The comparison was between Claude, which produced competent greenfield code, and Gemini, which struggled with brownfield. The comparison is stacked in Claude's favour.

I'm better at pre-existing code, if only because empty text files give me writers block.

Nope.

Greenfield implementation is not flawless either.

The only sources of these “it works flawlessly” I know of are:

- literal Claude ads I see online

- my underperforming coworkers whose code I’ve had to cleanup and know first hand that no, it wasn’t flawless

This kind of sentiment is gaslighting CTOs everywhere though. Very annoying.


"better at" != "flawless"

That's been my experience too. I'm using the recent free trial of OpenAI Plus to vibe code, and from this I would say that if Claude Code is a junior with 1-3 years of experience, OpenAI's Codex is like a student coder.

Does it depend on what type of programming you do? Doing Swift/SwiftUI work, I have exactly the opposite experience. I've been using both recently, and I want to use Claude alone (especially after last week's events), but Codex is just so much faster and better.

Swift/SwiftUI are two of the three experimental projects I'm using Codex on, the other is a physics simulation in python.

It keeps trying to re-invent the wheel, does a bad job of it.

The physics sim was supposed to be a thin wrapper around existing libraries, but instead of that it tried to write all the simulation code itself as a "fallback" (but it was broken), and never actually installed the real simulators that already did this stuff despite being told to use them in the first place. The last few dozen(!) prompts from me have been pairs of ~["Find all cases where you've re-invented the wheel, add them to the planning document", "now do them"]. And it's still not finished removing the original nonsense, so far as I can tell.

One of the two Swift experiments is just a dice roller, it took about 10 rounds of non-compiling metal shaders (I don't know metal, which is why I didn't give up and do that by hand after 4) before I managed to get that to work, and when it did work it immediately broke it again on the next four rounds. It wrote its own chart instead of using Swift Charts, and did it badly. It tried to put all the hamburger menu options into a UIAlertController. Something blocks the UI for several seconds when you change the dice font. I didn't count how many attempts it took to correctly label the D4.

The other Swift experiment was a musical instrument app, that got me to the prototype stage, eventually, but in a way that still felt like a student's project rather than a junior's project.


> Find all cases where you've re-invented the wheel

Did you put in the original prompt the "wheels" you wanted it to use? It's a toss-up when you aren't very specific about what you want.


For the Swift apps, at least half of the errors are of a type where I wouldn't expect to need to tell someone not to do it like that - only a student could reasonably be expected not to know better.

For the python physics sim, step 1 was to generate the plan, the prompt included "I want actual plasma physics, including high-density, high-field regimes, externally applied fields, etc., so consider which FOSS libraries would suit this.", and then it proceeded itself to choose some existing libraries, and I made sure those specific named FOSS libraries actually ended up in the plan.

My first clue this wasn't going to work was that even from step 1 it was pushing for writing all the simulation code and not actually using e.g. WarpX despite that it itself had suggested WarpX. In fact, even when WarpX was in the plan, it was "integrate" rather than "just use this from the get-go".

I may well throw the whole thing out and try again with Claude when this trial expires. Most of the runs have been comically non-physical, to the extent you don't even need a physics degree to notice, or even a physics GCSE.


Definitely give Claude a go. I've been nothing but impressed by the performance thus far. My only gripe is the usage limit that I keep hitting.

Already have done, hence earlier comment: https://hackernews.hn/item?id=47204959

Claude made far fewer mistakes in general, never gave me non-compiling code.


(Just outside edit window, I now realise I was ambiguous in this comment, it was more like "Find all cases where you've re-invented the wheel, add their removal to the planning document")

I find it very much matters. I find Gemini better for pretty frontends, Claude Opus for planning, and Gemini and Opus for code reviews. Codex is great when I want the LLM to follow instructions more strictly - good if you already have a detailed design.

Definitely depends on your use.


Do you use GPT-5.3-Codex Extra High or another model?

> Its the only model that gets production ready output on the first detailled prompt.

That's, just, like, your opinion, man.


...and that of a lot of colleagues in and out of my sector :)

> But for Claude, they have a very deep & big one: It's the only model that gets production-ready output on the first detailed prompt

That's not a moat though. Claude itself wasn't there 6 months ago and there's no reason to think Chinese open models won't be at this level in a year at most.

To keep its current position, Claude has to keep improving at the same pace as the competitors.


I wrote off ChatGPT/OpenAI because of Sam Altman and those eyeball-scan things - so, sort of, even before all this became the rage and took centre stage. Sometimes it's just a gut feeling, and while it may not always be accurate, if something doesn't "feel" right, maybe it is not right. No one else is all good either, but what I mean to say is that there are some entities/people who repeatedly don't feel right, have things attached to them that never felt right, etc., and you get a combined "gut feeling". At least that's how it was for me.

I love no moat!

One day I'd like to create a server in my basement that just runs a few really, really nice models, and then get some friends and co-workers to pay me $10 a month for unlimited access.

All with the understanding that if you hog the entire server I'm going to kick you off, and if you generate content that makes the feds knock on my door I'm turning over the server logs and your information. Don't be an idiot, and this can be a good thing between us friends.

It would be like running a private Minecraft server. Trust means people can usually just do what they want in an unlimited way, but "unlimited" doesn't necessarily mean you can start building an x86 processor out of redstone and lagging the whole server. And you can't make weird naked statues everywhere either.

Usually these things aren't issues among a small group. Usually the private server just means more privacy and less restriction.
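
The "kick you off if you hog the server" policy could literally just be a per-user token bucket sitting in front of the model server. A sketch, with entirely made-up limits:

```python
import time

class TokenBucket:
    """Per-user budget: refills at `rate` tokens/sec up to `capacity`."""
    def __init__(self, capacity: float, rate: float, now=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self._now = now          # injectable clock, handy for testing
        self._last = now()

    def allow(self, cost: float) -> bool:
        """Spend `cost` tokens if available; False means throttle the user."""
        t = self._now()
        self.tokens = min(self.capacity, self.tokens + (t - self._last) * self.rate)
        self._last = t
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

buckets: dict[str, TokenBucket] = {}

def check(user: str, prompt_tokens: int) -> bool:
    # e.g. 60k-token burst, refilling 100 tokens/sec per user (invented numbers)
    bucket = buckets.setdefault(user, TokenBucket(60_000, 100))
    return bucket.allow(prompt_tokens)
```

A friend who bursts occasionally never notices it; someone hammering the server drains their bucket and gets throttled until it refills, without anyone getting permanently banned.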


Amazing how analogous this is to the early Internet when people started running web servers out of their basement and then eventually graduated up to being their own dial-in ISP…

> I’m pretty sure they would allow AGI to be used for truly evil purposes

It's perfectly possible that 'truly evil purposes' were the goal all along. Slogans and ethics departments are mere speed bumps on the way to generational wealth.


I know this is necessarily a very unpopular opinion, however.

I think HN in particular, as a crowd, is very vulnerable to the halo effect and groupthink when it comes to Anthropic.

Even being generous, they are only very minimally a "better actor" than OpenAI.

However, we are so enthralled by their product that we tend to let that view bleed over into their ethics.

Saying "we want our tools used in line with the US constitution, within the US, on one particular point" is hardly a high moral bar; it's self-preservation.

All Anthropic have said is:

1. No mass domestic surveillance of Americans.

2. No fully autonomous lethal weapons yet.

My goodness, that's what passes for a high moral standard? Really, anything that doesn't hit those very carefully worded points is not "evil"?


Let's generalise a bit more here - every company at any time could completely heel-turn and do awful things. Even my favourite private companies (e.g. Valve) have done things that I would consider evil.

However, I would think I'm not alone in that I'm generally wanting to do good while also wanting convenience, I know that really every bit of consumption I do is probably negative in some ways, and there is no real "apolitical" action anyone can take.

But can't I at least get annoyed and take my money somewhere else for the short amount of time another company is doing it better?

Yes, if OpenAI suddenly leaps forward with Codex and pounds Anthropic into the dust, I'll likely switch back despite my moral grievances. But in a situation where I can get mildly motivated to jump over for something that - to me - seems like better morality without much punishment, I'll do it.


There are no universal morals. Anything - everything - you think is evil, some culture (possibly in history) thinks is good. I can't even think of something good that I'm confident everyone would agree is good.

There are some people (companies are run by people) who are so bad I boycott them. Most bad actors I treat as something society cannot work without accepting anyway.


There’s no possibility or need for morality to be universal, and societies have improved their ethics many times throughout history. Your take is nihilistic and presupposes that moral progress isn’t possible, even though we’ve seen objective moral progress many times.

Morals/ethics change, of course. However, that is not objective progress, only subjective. You think it is objective because you agree with the new system. Slave owners of the past would call it a regression that they can't live their lifestyle. Of course I agree with the new standards (at least here) and so am glad they can't.

edit: yes, nihilistic - but sometimes you have to go there


Well, they did stand up to the US administration and lost a lot of money in the process. That takes courage. They clearly were being bullied into compliance, and they stood their ground.

You can see the significance of this if you look at German Nazi history. If more companies had stood up to the administration, the Nazi state would have been significantly harder to build.

In my opinion, what Anthropic did is not a small thing at all.


The comment I replied to said that they believed OpenAI would allow "AGI to be used for truly evil purposes".

By contrast, Anthropic wouldn't? Yet Anthropic's stance is only two narrow restrictions. As I said, are those two things the only evil things possible?

If not, why is it that people on HN think Anthropic would not allow evil usage?

My hypothesis is a halo effect. We are so enthralled by Claude's performance that some struggle to rationally assess what Anthropic has actually done.

Yes, it's no small thing to say no to the Trump administration, but that does not mean they haven't said yes to, or otherwise facilitated, other evils.

In fact to me the statements from Anthropic seem to make clear they are okay with many evils.


> Yet Anthropics stance is only two narrow restrictions.

Really, I think Anthropic should have a single restriction: do not assist with illegal or unconstitutional activities. If automated killings etc. are illegal, then they would be covered by that one rule.

I don't think Anthropic should be in the business of deciding what is "evil".


If each of us individually, or as corporations, should not be in the business of deciding what is "evil", who should be in that business?

Everyone SHOULD continuously consider, decide, and live by moral judgements and codes they internalize, and use to make choices in life.

This aspect of life should NEVER be outsourced — of course, learn from and use codes others have developed and lived by — but ALWAYS consider deeply how it works in your situation and life.

(And no, I do NOT mean use situational ethics, I mean each considering, choosing, and internalizing the codes by which they live).

So, yes, Anthropic and anyone else building products absolutely should be deciding for themselves what they will build, for what purposes it is fit to use, and telling others about those purposes. For products like AI, this absolutely includes deciding what is "evil" and preventing such uses.

If the customer finds such restrictions are not what they want, they ARE FREE to not use the product.


> If each of us individually or as corporations should not be in the business of deciding what it "evil", who should be in that business?

This is easy imo. Two methods:

1. The law. It should not be legal for the US Govt to murder people at will. If it is legal, then of course they'll use tools to make it easier. Maybe AI, maybe Clippy. If they can't use AI then they'll fall back to using some other way of doing it like they've already been doing for several years.

2. Voting. For representatives that actually represent us and have our interest in mind rather than their own corrupt interests. And voting with our wallet against companies that do legal but morally bankrupt things.

Of course we're failing both of these hard right now. But imo the answer is not to give up and let corporations make the rules.

In other words, if it were legal for a normal citizen to murder anyone they wanted, of course they'll use Google Maps to help them do that. We don't put restrictions on how people can use Google Maps. Instead we've made murder illegal. We should be doing the same thing here.


It's illegal to drive drunk or read your cell phone and hit strangers head on.

Nevertheless, it wasn't lawmakers, it was car makers who innovated to build in airbags and seatbelts and lane assist and and and ... under the theory that though it's illegal, bad things are done anyway, and guardrails still matter.

Colloquialism: "belt and suspenders".

Many, like Volvo, go above and beyond the requirements to make their vehicles safer, and then having demonstrated these guardrails, some become law as well (even as other makers in the industry kick and scream about being forced to, and riders rebel against buckling up).

As we haven't solved this standoff for a century, we are unlikely to resolve it within the pace needed by the expansion of AI. In this scenario, Anthropic is Volvo.


Exactly zero of those account for an individual's or company's ability to live by their own moral code.

And this AI software is not a mere static object like a hammer that can be handed off to a customer and what it is used for is their business, to build a house or bash a living skull.

This is a system that must be constantly maintained by its builders.

Moreover, even if we use your standard, the law, it has already been decided in Anthropic's favor.

What you require is that Anthropic actively participate in activities that they consider abhorrent and/or unwise. SCOTUS has already ruled that a business cannot even be required to sell a cake to someone if it does not like the intended purpose (in that case, at a celebration at a gay wedding).


> even if we use your standard, the law, it has already been decided in Anthropic's favor.

I support Anthropic here. They had a deal with the Govt and the Govt bullied them. That should not be allowed, and Anthropic is suing which makes sense to me. Anthropic should be allowed to set any terms of use for the product that they want, and gain or lose business based on those terms. That's fine.

I'm saying that the failure is actually upstream. It should not be possible for Anthropic's AI to be used to mass-surveil or murder people, because those things should be illegal by law and the govt should not be allowed to do them and should not be doing them. Somehow it isn't this way though.

So now that we find ourselves in this failed state, we have to rely on Anthropic to be "the law": to identify what's "evil" and disallow it. I'm saying that's out of scope for a tech company and they shouldn't be expected to do that. They should only be in the business of making good tech and then be free to let it be used by anyone for any purpose that the law allows.

This also means that if it's illegal to share information on how to build a bomb without AI, then it should be illegal for Claude to share that information with AI. So Anthropic does need to make sure they're not breaking the law themselves as well.


Ah, good, we generally agree

For sure, Anthropic should NOT have been forced to decide the ethics of deploying their tech

Nevertheless, they should always be considering the ethics of their own creations and actions, and it seems they are — as soon as they got bullied by a failing regime, they had the right answer: 'no, that is not ethical and we won't allow it with our products'.

The problem is that the law only very roughly captures what is right and just, so there are many things that are legal but unethical, and at the same time many things that are ethical but illegal. So we can't entirely outsource our personal or corporate ethics to the law.


It's not high. But it is higher.

We'll take anything we can right now. I agree.

Although we shouldn't let that mean we misjudge what we are actually getting.


As a rule, when there are large companies and/or billionaires involved, you are in for trouble.

Let's not forget they also lobby to forbid models from China and pretend that distillation is stealing. But somehow, just because they said no on two points, the majority of HN folks see them as virtuous.

I never understood the point of this kind of comment. It doesn't add any value to the discussion. It's basically two paragraphs with a presupposition (OpenAI bad) and how the author is virtuous for canceling his subscription. No explanation, argument, or nuance. It's just virtue signaling. Actually... I guess I do know the point of this kind of comment. I just don't know why these kinds of comments get upvoted, even if you do agree OpenAI bad.

I tried Claude recently (after they dropped the nonsensical requirement to give them your phone number) and I was surprised to see how significantly less sycophantic it was. ChatGPT, unless you are talking hard science, tends to be overly agreeable. Claude questions you a lot (you ask for x and it asks you things like: why are you interested in x; or, based on our previous convo, x might not be suitable for you; or, I see your point, but based on our previous convo, y is better than x; etc.). ChatGPT rarely does that.

Of course, there's also OpenAI being run by openly questionable people, while Dario so far seems nowhere near as bad, even if none of them are angels.


Claude still doesn't have image generation?

Image generation isn't what most devs spend most of their time on?

It is semi-competent at making SVGs. Which are the only kind of images I really need in dev work.

For marketing or personal stuff I do sometimes want images, but I don't really mind going somewhere else for that


Interesting. Have been using Gemini, Gpt and Claude extensively in parallel and never noticed that.

I'm switching over to Claude from OpenAI, and I don't care. OpenAI's image generation is terrible anyway. Just try to get it to generate something to scale, like a cabinet for a specific kitchen or bathroom space. Give it all the explicit constraints, initial sketches, etc. it wants.

The results are laughably bad.

Sure, it does get some of the tones and features, but any kind of actual real-world constraint is so far off, and the dimension indicators it includes would be hilarious if they weren't so bad.


I did the same thing and cancelled my OpenAI plan today. Besides boycotting it for their latest grifting I also found it to not really produce much value in my use cases.

Moving back to doing this archaic thing called using my own brain to do my work. Shocking.


Yes, they have a great marketing team and a powerful astroturfing presence though, especially with the recent "Claude beat up OpenClaw! OpenAI is supporting the community by buying it!" and that nonsense.

Though tbh I hardly feel Claude is innocent either. When their safety engineer/leader left, I didn't see any statement from the Anthropic team addressing his legitimate points for why he left. Instead we got an eager over-push in the media cycle of "Anthropic standing up to DOD! Here's why you can trust us!"

It all sounds too similar to propaganda and astroturfing to me.


[flagged]


OpenAI actually does have two excellent OSS models. Not Anthropic. Not that OpenAI is 'open' per se, but more so than Anthropic. Also see the Codex vs Claude Code extensibility.

They are far from excellent, and they were open sourced due to the mounting pressure over calling themselves "Open" AI while not doing anything open. At the time, they also had Chinese competitors wiping out the market value of many stocks (NVidia, etc.) after releasing true OSS models that performed as well as SOTA models, and they had to retaliate. I don't know of anyone who uses those OSS models in production instead of the Qwen series or DeepSeek.

I use them heavily in production, because OSS 120B on Groq/Cerebras is one of the fastest models available with enough smarts for basic tasks.

What is your definition of NPC behavior?

> Random guy on the internet posts links to cancel ChatGPT subscription

> Cancels subscription

> Random guy on the internet tells you to be outraged

> Gets outraged

I'm not even a fan of OpenAI generally speaking, but cancelling them for no reason like this is just silly. If not them, some other lab would have done it. Or worse, DoW would've forced them to.


> I swear HN is just a bunch of fanboys full of NPC behavior.

Why are you assuming these are real people and not NPCs?

The amount of money flowing around AI is staggering. To believe that the AI companies aren't flooding all the social media zones with propaganda is disingenuous.


> To believe that the AI companies aren't flooding all the social media zones with propaganda is disingenuous.

Touché


> To believe that the AI companies aren't flooding all the social media zones with propaganda is disingenuous.

You don't use "believe" with "disingenuous": it literally makes zero sense.

If people honestly believe that, they may be naive. Or they can be "disingenuous" if they're not being sincere. But if you just say what you believe, you're sincere (and maybe naive), and hence cannot possibly be disingenuous.


Hmmm. Good point. I probably knee jerk apply "disingenuous" to everybody AI-adjacent, but I should rethink that given the connotation of "false candor".

"Naive" is certainly a better choice.


That is the same categorical argument as what the story is about: scanned brains are not perceived as people so can be “tasked” without affording moral consideration. You are saying because we have LLMs, categorically not people, we would never enter the moral quandaries of using uploaded humans in that way since we can just use LLMs instead.

But… why are LLMs not worthy of any moral consideration? That question is a bit of a rabbit hole with a lot of motivated reasoning on either side of the argument, but the outcome is definitely not settled.

For me this story became even more relevant since the LLM revolution, because we could be making the exact mistake humanity made in the story.


And beyond the ethical points it makes (which I agree may or may not be relevant for LLMs - nobody can know for sure at this point), I find some of the details about how brain images are used in the story to have been very prescient of LLMs' uses and limitations.

E.g. it is mentioned that MMAcevedo performs better when told certain lies, predicting the "please help me write this, I have no fingers and can't do it myself" kinda system prompts people sometimes used in the GPT-4 days to squeeze a bit more performance out of the LLM.

The point about MMAcevedo's performance degrading the longer it has been booted up (due to exhaustion), mirroring LLMs getting "stupider" and making more mistakes the closer one gets to their context window limit.

And of course MMAcevedo's "base" model becoming less and less useful as the years go by and the world around it changes while it remains static, exactly analogous to LLMs being much worse at writing code that involves libraries which didn't yet exist when they were trained.


Other book recommendations:

- Valuable Humans in Transit by qntm

- The Old Axolotl by Jacek Dukaj (skip the TV show, it is very different from the book)

- the Agent Cormac series by Neal Asher

- The Night's Dawn trilogy by Peter F. Hamilton (his books are borderline fantasy, but he writes such deliciously monstrous villains)


On mastodon, with the non-algorithmic feed, following mostly accounts that aren’t particularly political, those things are still at the top of the feed. If you’re not seeing those topics at the top of your feed you’re probably being misled by your algorithm.

Another reason why feed ranking algorithms should be published. If we can see the algorithm we can stop playing these yes/no games. The real enemies are social media companies, not the other side of politics.


Realistically, to maintain a modern web framework to even a minimal standard you need a few people working full-time on it, and more than a few if you want to take it places. There needs to be some kind of long-term sustainable vehicle for funding those developers, either a corporate sponsor or a foundation.

So all those frameworks have to end up somewhere, and I'd rather it be somewhere other than Vercel, as they already own way too much of the web frontend space.


That very much depends on your definition of "modern web framework".


You have to use third-party software to configure them properly; then they work fine. I used Logitech's drivers for a while, but they've become the biggest pile of garbage I have ever seen call itself a driver. I now use BetterMouse instead.


By the way, you can also do this pure CSS using the scripting media feature: https://developer.mozilla.org/en-US/docs/Web/CSS/Reference/A...
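In case it helps, a minimal sketch of what that looks like (the class names here are made up for illustration):

```css
/* Hypothetical classes: .js-widget is shown by default,
   .no-js-fallback stays hidden. */
.no-js-fallback { display: none; }

@media (scripting: none) {
  /* Scripting is disabled or unsupported:
     swap the interactive widget for a static fallback. */
  .js-widget { display: none; }
  .no-js-fallback { display: block; }
}
```

The `scripting` feature also accepts `enabled` and `initial-only`, so you can target the JS-enabled case directly instead of styling it as the default.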


I didn't know this existed. Thanks for sharing! Very useful api


I would suggest not using this as it's not compatible with NoScript


I have to disagree on trackpads sucking less. This year I walked into a big box electronics store and tried the screen, keyboard and trackpad on every laptop they had on display.

Trackpads were universally abysmal, with the sole exception of the MacBooks. They all had the frustrating diveboard design, every single one at every price point from every manufacturer. I'm sure you can buy laptops with decent trackpads online, but they had none in the store, MacBooks excepted.

Keyboards were all over the place, but I notice that even some premium models are now carrying generic low end keyboard parts with weak travel, lack of key separation, num lock mashed into the backspace, and awkward arrow key layout. If anything I think keyboards are getting worse.

Screens are the one place where I'll say things have improved noticeably, especially colors and black levels, although getting over 200 ppi and 500 nits is still a rare treat, and that is my bar for a compromise-free display.


> walked into a big box electronics store

You're comparing Apple to unnamed computers brands you touched at a random place, I'm not sure what to make of it.

For instance how does the Macbook Air compare to the current 13" Surface Laptop ? Is that what you call diveboard design and awkward arrow key layout ?


I didn’t say good, just less bad.

Apple obviously produces the only product incorporating a touchpad that applies any significant, deliberate thought about it.


500 nits is not really good enough for a laptop that you might use outside.

Luckily they are still improving and we now have Tandem OLED with about double that.


Should a laptop be optimized for indoor or outdoor use?


Given the primary selling point of laptops is their portability (often at the cost of other things), they should be optimized to be highly usable wherever they might end up getting used.


I don’t feel confident there is a way to use npm safely. The basic problem is that of curation, and there not being any except for what you do yourself. Every day brings new surprises, and osv.dev’s npm feed is a continuous horror show.

I would love to see the equivalent of a linux distro, a curated set of packages and package versions that are known to be compatible and safe. If someone offered this as a paid product businesses would pay for it.


They'll use AI to do it and it will be no better than what we have today. You will pay for it anyway.

