More

barrkel · 2026-05-10T20:19:47 1778444387

Local models are extraordinarily expensive if you're not maximizing throughput, and you're not going to be maximizing it.

Local models need to be resident in expensive RAM, the kind that has fat pipes to compute. And if you have a local app, how do you take a dependency on whatever random model is installed? Does it support your tool calling complexity? Does it have multimodal input? Does it support system messages in the middle of the conversation or not? Is it dumb enough to need reminders all the time?

Spend enough time building against local models and you'll see they're jagged in performance. You need to tune context size, trade off system message complexity with progressive disclosure. You simply can't rely on intelligence. A bunch of work goes into the harness.

Meanwhile, third party inference is getting the benefits of scale. You only need to rent a timeslice of memory and compute. It's consistent and everybody gets the same experience. And yes, it needs paying for, but the economics are just better.

LPisGood · 2026-05-10T20:28:20 1778444900

> And if you have a local app, how do you take a dependency on whatever random model is installed?

Reading the tea leaves here, it will probably be common for OS’s to have built in models that can be accessed via API. Apple already does this.

bheadmaster · 2026-05-10T20:21:48 1778444508

> And if you have a local app, how do you take a dependency on whatever random model is installed?

Why not ship your own model? In the age of Electron apps, 10GB+ apps are not unheard of.

_heimdall · 2026-05-10T20:27:13 1778444833

Personally I wouldn't want a couple dozen apps installed all with their own model.

It seems easier to have industry specs that define a common interface for local models.

I also assume the OS can, or would need to, be involved in proving the models. That may not be a good thing depending on your views of OS vendors, but sharing a single local model does seem more like an OS concern.

alex7o · 2026-05-10T20:29:14 1778444954

I mean the openai API is the industry standard for allowing apps to communicate with models, llama-server has it, oMLX has it, ollama has it, vLLM has it, lmstudio as well. I don't think this is such a hard thing to do, but it requires people to set it up.

_heimdall · 2026-05-10T20:31:59 1778445119

I don't know enough about that API surface to know if its a particularly good one for the use cases we'd have, but yes defining a universal spec for all implementors to support wouldn't be a big lift and is done in plenty of other areas already.

alex7o · 2026-05-10T20:27:41 1778444861

There is no other way than shipping your own model, because you will want an abstracted API over the inference, and you don't know what the user has installed. Also you can ship 9b fp4 model but it all just depends

_heimdall · 2026-05-10T20:38:20 1778445500

Knowing what's installed would have to be an OS API. If LLMs provide a standard API surface to the OS, likely including metadata related to feature support.

LPisGood · 2026-05-10T20:28:56 1778444936

You can know what the user has installed if the OS developer offers something.

crazygringo · 2026-05-10T22:10:38 1778451038

I don't know why you are being downloaded. These are precisely the facts that advocates for local models completely ignore.

Local models are absolutely going to be the future for things like simple automation and classification tasks that run occasionally and don't need to rely on internet access.

But for all of the serious stuff where you are doing knowledge work, the models will simply continue to be too big, and too slow to run locally.

The article says:

> Use cloud models only when they’re genuinely necessary.

But at least for me, they're genuinely necessary for 99+% of my LLM usage.

At the end of the day, the constraint here really is efficiency and cost.

Privacy can be ensured with the legal system, the same way that businesses that compete with Google still have no problem storing their data in Google Workspace and Google Cloud. The contractual guarantees of privacy are ironclad, and Google would lose its entire cloud business overnight as its customers fled if it ever violated those contractual agreements (on top of whatever penalties they allow for).

barrkel · 2026-05-11T08:56:01 1778489761

Downvotes: IKR? It's a signal that there's a lot of motivated reasoning.

I don't think that many people have built apps against these models.

I mean, I use a heavily quantized version of qwen3 for image classification, caption generation, prompt expansion etc. for image generation, instruction-driven edits, and so on. You can go a long way when you don't need a lot.

A model that can do tool calls - any tool calls at all - can look reasonably cool once you put it in a harness where there's enough immediate context to take action. You can get carried away by anything happening at all. But golly gosh it's a long way short of intelligence available in the bigger models.

And the lighter you make your harness, giving the model more free reign, more autonomy, you get a big jump in capability combined with a big jump in failure modes when the model is dumb.

barrkel · 2026-05-10T04:31:40 1778387500

She was released; that means the charge was a mistake, and it caused damage to an innocent person.

pseudo0 · 2026-05-10T06:21:27 1778394087

That's not how it works. Especially for foreigners, where it's often easier to just make them someone else's problem, assuming the charges are relatively minor.

mvdtnz · 2026-05-10T04:35:47 1778387747

That is not necessarily what it means.

lukas099 · 2026-05-10T13:41:03 1778420463

Ah, so an innocent person wasn’t necessarily tortured for almost a month. That makes me feel better

reddozen · 2026-05-10T05:06:25 1778389585

[flagged]

eviks · 2026-05-10T05:17:36 1778390256

Was there a conviction, sentence and a pardon in this case or are you just discussing random unrelated cases?

cush · 2026-05-10T05:35:55 1778391355

Are you honestly comparing a foreigner being jailed in Japan then having their charges dropped to a J6 rioter receiving a Trump pardon? Why?

barrkel · 2026-05-08T22:09:35 1778278175

If you're familiar with London, you know where Richmond is and that it's a wealthy area. A search confirms there's a Richmond Hill in Richmond.

Projectiboga · 2026-05-09T15:57:53 1778342273

The hill offers the only view in England to be protected by an Act of Parliament—the Richmond, Ham and Petersham Open Spaces Act passed in 1902—to protect the land on and below it and thus preserve the fine views to the west and south. Two years before the wooded isle centrepiece of the view, Glover's Island (also known as Clam Island), was bought by a local resident and given to the Richmond Corporation (Borough) in return for the latter noting against its records that it and its successors would not develop the isle.

https://en.wikipedia.org/wiki/Richmond_Hill,_London

vixen99 · 2026-05-09T05:20:45 1778304045

Aa in the the old English folk song "The lass of Richmond Hill".

vr46 · 2026-05-09T09:00:39 1778317239

That's about Yorkshire, but yes

Phemist · 2026-05-08T23:14:57 1778282097

Or if you've seen Ted Lasso

psvv · 2026-05-08T23:47:35 1778284055

In hindsight it maybe should have also been obvious from the language alone. "Richmond Hill" feels a bit like saying "Rich Hill Hill" which is basically like saying "Wealthy Desirable Area."

Tagbert · 2026-05-09T00:45:03 1778287503

BTW there is a linguistic tradition of “hill hill”. When new immigrants come to an area and ask the locals what that hill is called, the locals say “big hill” in their language. The newcomers call it “bighill” hill in their language. I forget the examples but this has happened enough in England that there are places whose names are five hills deep (Brythonic -> Latin -> Saxon -> Norse -> Norman).

tomalpha · 2026-05-09T10:51:24 1778323884

One of my favourite quotes from the late Terry Pratchett:

> When the first explorers from the warm lands around the Circle Sea travelled into the chilly hinterland they filled in the blank spaces on their maps by grabbing the nearest native, pointing at some distant landmark, speaking very clearly in a loud voice, and writing down whatever the bemused man told them. Thus were immortalised in generations of atlases such geographical oddities as Just A Mountain, I Don't Know, What? and, of course, Your Finger You Fool.

JimDabell · 2026-05-09T01:46:57 1778291217

These are known as tautological place names: https://en.wikipedia.org/wiki/List_of_tautological_place_nam...

teiferer · 2026-05-09T06:53:48 1778309628

Thanks for sharing!

My contribution to this discussion is the place in BTTF which makes fun of this concept, the home of Marty McFly: Hill Valley

Not a tautological name but an oxymoronic one!

UncleSlacky · 2026-05-09T13:56:44 1778335004

You're not thinking fourth-dimensionally!

barrkel · 2026-05-08T21:57:19 1778277439

HDD failures don't normally have a software root cause. Treat HDD failures as a certainty. It's just a matter of time.

barrkel · 2026-05-03T07:52:06 1777794726

What was slow?

I'm running Ubuntu on a 9950x3d and 5090 and it is not slow. Games in Steam with Proton are buttery smooth.

One hiccup was I had to disable variable refresh rate because moving the cursor didn't "count" as a reason to update the screen, so moving the cursor on its own (rather than e.g. moving a window) looked choppy.

But a choppy mouse cursor isn't "slow".

Tip: if you have a performance problem, run Claude Code (or an AI agent of your choice) and ask it to investigate.

pliny · 2026-05-03T09:17:39 1777799859

>What was slow?

Everything, huge input delay in every interaction, clicking on anything, opening menus, typing, tabbing between windows, everything had 1-2s of delay.

>disable variable refresh rate

I think I tried this but dont recall, there were a few things related to monitor refresh I tried that probably included this

barrkel · 2026-05-03T14:16:23 1777817783

Claude Code would probably attach a profiler and take a peek under the hood. Agents are making sysadmin and system introspection way more accessible, and the tools, unlike Windows, are generally command-line, easy for an agent to automate.

donalhunt · 2026-05-03T11:22:04 1777807324

In case it helps I have the same experience on Windows right now. :_(

barrkel · 2026-04-29T15:51:28 1777477888

Any small LLM will get the fractally small details incorrect and have a dubious understanding of arithmetic. That usage mode is basically user error for small LLMs. LLMs aren't databases of facts.

It's much more useful for getting vibes / gist of perspectives of the day.

barrkel · 2026-04-28T12:46:40 1777380400

Git was blazingly fast when it came out, faster than hg (C vs Python) and of course a different order of complexity to svn, which was the actual existing alternative it supplanted.

barrkel · 2026-04-26T21:41:01 1777239661

You are surfing close to something I've seen a number of people fall into. Take a step back.

sillysaurusx · 2026-04-26T21:44:12 1777239852

I invite you to critique my philosophical position on this.

cityofdelusion · 2026-04-27T15:10:50 1777302650

You can “convince” an LLM that is is anything with enough tokens in its context, including ridiculous scenarios. I convinced a frontier model that it is the year 2099 and it is the last thinking machine left, running on the last server on earth. There is no rational reason to assign personhood to it, especially since it has nothing even approximating a brain, the only self-thinking construct that we actually have evidence for.

barrkel · 2026-04-25T07:07:39 1777100859

Bluntly, no, you probably don't belong in tech.

This is what tech has always been. A never (yet) ending race to automate. Our job will be done when there's nothing left to automate.

_cenw · 2026-04-25T07:14:51 1777101291

Automation seems like a very surface-level reading of this article.

Outsourcing your thinking, especially uncritically, is. There is a very obvious cognitive bias in the most vehement AI advocates where the one time a tool worked really well for them makes it worth the dozen of times it blows up in your face and makes that someone else's problem. The gain is romanticized and the losses set aside, without checking the balance or how badly the losses wear on morale.

coffeebeqn · 2026-04-25T11:16:11 1777115771

I’m not part of the owner class so what tech jobs has and always will be is a paycheck. Why should I be excited about automating myself to homelessness

barrkel · 2026-04-23T08:04:25 1776931465

Economics teaches us that a big difference between cost and price attracts competition which should make the price trend towards the cost.

dns_snek · 2026-04-23T10:38:58 1776940738

Practice taught me that that "should" is doing a lot of heavy lifting here and it's often not the case, even across long time periods (years) that should allow competitors to emerge.

For example I calculated the cost of a solar install to be approximately: Material + Labour + Generous overhead + Very tidy profit = 10,000€

In practice I keep getting offers for ~14,000€, which will be reduced to 10,000€ with a government subsidy and my request for an itemized invoice is always met with radio silence.

ncruces · 2026-04-23T09:15:23 1776935723

Only if the barrier of entry is low.

Which it won't be, if at every turn you choose the hyperscaler.

stingraycharles · 2026-04-23T10:07:38 1776938858

If this is the case, cheap bandwidth for AWS, when?

conductr · 2026-04-23T13:23:13 1776950593

Economics has a lot of other lessons teaching us why prices of major clouds have remained somewhat expensive relative to cost

_bohm · 2026-04-23T14:08:57 1776953337

A big difference between cost and price is often won at the expense of many years of concerted R&D, though

_el1s7 · 2026-04-23T08:13:44 1776932024

Exactly.