Well, regardless of technology, the space of things you can accomplish without risking your own troops' lives is very small. (Unless you're willing to go nuclear, which has the pesky downside of ending the world.)
To put it in perspective - in Vietnam, opposition forces lost over a million troops and continued to fight viciously. The US lost around 50,000 and gave up and left.
Democratic countries simply lack the stomach for this kind of thing (which is a good thing, really).
tbh I don't think this use case is going to be as big as people seem to think
there are a lot of reasons, but in brief - I think AI desktop use is a product that the average person isn't going to get much value out of. to make an analogy - the creators of the Segway thought people would buy them in large numbers, but it turned out most people don't mind walking manually (or at least, don't mind it enough to spend money on a scooter). I think makers of AI Desktop Use products are going to find out the same thing as it relates to everyday tasks like checking email and shopping.
I was thinking more of remotely managing the computer in a warehouse, or replacing the mouse for an architect or some other physical-object engineer. That your grandma can finally find Discord by speaking to such a bot is just a nice side effect.
well yeah I wasn't even talking about professional use, since I think in professional use cases it will turn out to make a lot more sense to set up APIs that AIs use than to set up screen scraping and mouse+keyboard use.
in fact, even in the rare cases where it's not possible to get an API or CLI to interface with some piece of software, I think people will find that their best bet is to first create a deterministic screen-scraping program for that specific software, then have that program serve an API for the AI to use. it would be so much cheaper to run (inference-wise) and so much more reliable than having the AI itself perform the image interpretation and clicking.
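to make the pattern concrete, here's a minimal sketch (all names are hypothetical, and the actual scrape step is stubbed out - in practice it would call an OCR tool or a UI-automation library). The point is that the brittle screen-reading happens once, deterministically, and the AI only ever sees a clean structured result:

```python
# Sketch of the pattern: a deterministic wrapper does the brittle
# screen-scraping, and the AI only ever calls a structured API.
# All names here are hypothetical; read_screen_text() is a stub
# standing in for OCR / UI automation against the legacy program.

import json
import re

def read_screen_text(window_title: str) -> str:
    """Stub for the deterministic scrape step (OCR / UI automation)."""
    # Hard-coded sample output standing in for a real screen capture.
    return "Invoice #1042\nTotal: $317.50\nStatus: PAID"

def get_invoice(window_title: str = "LegacyApp") -> dict:
    """The API surface the AI actually calls: cheap and deterministic."""
    raw = read_screen_text(window_title)
    number = re.search(r"Invoice #(\d+)", raw).group(1)
    total = re.search(r"Total: \$([\d.]+)", raw).group(1)
    status = re.search(r"Status: (\w+)", raw).group(1)
    return {"invoice": int(number), "total": float(total), "status": status}

print(json.dumps(get_invoice()))
```

the AI then makes one cheap structured call instead of interpreting screenshots and issuing clicks on every step.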
I see AI desktop use as mainly a consumer product for that reason, since that's the situation where you have to react "on the fly" to whatever the user asks you to do and whatever program happens to be on their computer (versus professional cases which are more large-scale and repetitive, and where you can have a software developer on hand).
people used to say this about search engines and web browsers, as well
regardless, eventually Google became the universal default for both. When it comes to software, the average person doesn't shop around for the technologically optimal choice, they just use what everyone else is using.
AI (that is, plain chat) is always going to be free to use as well. Google and Microsoft are going to keep it that way, and make the money back via ads.
That's why ChatGPT still has a free option. If they didn't, they would lose a billion users overnight to Gemini.
my point is that today there is no clear winner. Opus, GPT 5.4, and Gemini have different strengths. Google Search was running circles around the competition in basically all use cases.
Even if there were, it wouldn't be very useful. In a "common law" system like the US, legal questions are rarely answered purely by the plain text of a law. You also need case law - that is, how the law is typically applied in practice by the courts.
CourtListener (from the non-profit Free Law Project) provides bulk access to its case law collection.[1][2] I expect they are ingesting the GPO uscourts collection, so it should have near-complete (99%) coverage.[3][4]
OpenAI, Anthropic, and Google are way ahead of you. Ask away to your heart's content. I am unsure about open-source models; it'd be interesting to know whether they're ingesting the law.
The reality is that the official United States Code gives plenty of history for statutes, while the Code of Federal Regulations gives less but still basic history. Both are also provided in XML in bulk, though the former has a modern USLM format and the latter has an archaic schema. Case law is less amenable to git histories.
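as a quick illustration of what working with the USLM bulk data looks like, here's a sketch that pulls a section heading out of USLM-style XML with just the standard library. The snippet is hand-written for illustration, not actual bulk-download content; the namespace URI is the one the USLM schema documents use (treat it as an assumption to verify against the real files):

```python
# Illustrative sketch: extracting a section heading from USLM-flavored
# XML using only the standard library. The sample XML below is made up
# for illustration; the namespace URI follows the published USLM schema.

import xml.etree.ElementTree as ET

USLM_NS = {"uslm": "http://xml.house.gov/schemas/uslm/1.0"}

sample = """<section xmlns="http://xml.house.gov/schemas/uslm/1.0">
  <num value="102">&#167; 102.</num>
  <heading>Definitions</heading>
</section>"""

root = ET.fromstring(sample)
heading = root.find("uslm:heading", USLM_NS).text
print(heading)
```

the same approach works for the CFR bulk XML, just with its older schema's element names instead.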
I don't see this as a "modern" vs "back in the day" thing.
The real reason many software orgs nowadays don't have QA is simple: it's slow. Everything in the consumer tech space is about rapid growth and moving as fast as possible. Nobody cares very much whether the software has bugs; what matters is whether it has users.
But outside of consumer tech, QA is a lot more common, since it matters a lot more that the software's logic is correct. (Speaking personally - I used to work for a genetics lab, and we had QA.) There are just different economic incentives involved.
I mentioned in another comment that I make sure the cost/time is within 1.25x of the next best single-model run. So it's not perfect, but I think that aspect will only get better with time.
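a guard like that could be as simple as the following sketch (all names and numbers are hypothetical, and I'm assuming "next best single-model run" means the cheapest single-model alternative):

```python
# Hypothetical sketch of the cost guard described above: only accept the
# multi-model ensemble run when its estimated cost stays within 1.25x of
# the cheapest single-model run. Names and numbers are illustrative.

BUDGET_FACTOR = 1.25

def within_budget(ensemble_cost: float, single_model_costs: list[float]) -> bool:
    """True if the ensemble costs at most 1.25x the best single-model run."""
    best_single = min(single_model_costs)
    return ensemble_cost <= BUDGET_FACTOR * best_single

print(within_budget(0.30, [0.25, 0.40]))  # True:  0.30 <= 1.25 * 0.25
print(within_budget(0.40, [0.25, 0.40]))  # False: 0.40 >  0.3125
```

the same check applies to wall-clock time by swapping costs for latencies.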
Of course I'm biased, but using Sup has been great for me personally. Even disregarding the HLE score, having many different perspectives in the answers, and most importantly the combined answer, has been very helpful as feedback on the architectural decisions I make for Sup, and on many other questions I would normally ask ChatGPT/Gemini/Claude/Grok individually.
There's plenty of results in math and physics that are true, in the sense that the math checks out, but are useless, in the sense that the authors claim they've made a breakthrough, but actually they've just tweaked a few parameters of an existing unverified theory and constructed a new unverified theory. (If you've ever read a news headline like "physicists now believe reality may actually have 400 dimensions!", they were probably citing one of these papers.)
There are also plenty of physics papers where the math actually just doesn't check out at all. But those, at least, rarely make it into headlines or reputable journals.
I still don't see what the big deal is. If LLMs are (or become) all they're cracked up to be, it shouldn't matter whether someone "learns to use the tools" today or tomorrow or five years from now. In fact they should become much easier to use as they become more intelligent, you shouldn't need all these fancy prompting strategies anymore.
(Reminds me of search engines. People who really knew how to search for things honed that skill over a period of time, only for those skills to become irrelevant now that search engines are much smarter.)
I guess my point is - why must the technology insist upon itself? Evangelizing for people to use it when they don't want to, just sounds cultish. If it's useful, people will eventually use it - like any other new technology. If someone doesn't find it useful yet, maybe they just don't work in a field that AI is good at yet.
I agree with what you’ve said. If you want to see what is cultish, look at the subreddit the post is talking about. It’s denialism and not healthy behavior.