Hacker News | past | comments | ask | show | jobs | submit | throwaw12's comments

> How about we turn down the heat, everyone?

How about Anthropic turns down the heat and refunds money to everyone for every bug its LLM created?


can you share your use cases for 2b and 4b models?

curious how people are leveraging these models


For me, I use them for quick auto complete or small questions. I am not a vibe/agentic coder. I know I am a relic and a Luddite because of this.

Instead of hitting Stack Overflow and Google, I will ask questions like "can you give me an example of how to do x in library y?" Or "this error is appearing, what might be happening if I checked a, b, and c?" Or "please write unit tests for this function". Or code auto-complete.

I am not looking for the world's best answer from a 3b model. I am looking for a super fast answer that reminds me of things I already know, or maybe just maybe gives me a fast idea to stub something out while I focus on something more important; I am going to refactor anyways. Think of it as a low-quality rubber duck.

I mostly use 7-9b models for this now but llama 3.2 3b is pretty decent for not hogging resources while say I have other compute heavy operations happening on a weak computer.

Probably half the questions people ask ChatGPT could get roughly the same quality of answer from a small model, in my opinion. You can't fully trust an LLM anyways, so the difference between 60% and 70% accuracy isn't as big as marketing makes it sound. That said, the quality of a good 7-9b model is worth it compared to a 3b if your machine can run it. Furthermore, the quality of Qwen3.6 is crazy and makes me wonder if I will ever need an AI provider again if the trend continues.


Over the weekend I used the small models for experimental training runs while figuring out how to build LoRAs. It takes a lot less time to do smoke tests of the process on E2B than on the 31B version. And E4B was a reasonable stop along the line, just to make sure the LoRA combined with the base model produced coherent output.

Also, they're good enough for a lot of simple categorization and data extraction tasks, e.g. something like "flag abusive posts/comments", or "visit website, find the contact info, open hours, address". And they run fast on the kind of hardware you're likely to have at home, while the bigger dense versions decidedly do not.
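For the categorization case, here's a minimal sketch of what I mean. Only the prompt-building and reply-parsing helpers are shown; the model name and request shape are assumptions in the style of a local Ollama/llama.cpp HTTP API, not any specific product's exact schema:

```python
import json

# Hypothetical sketch: using a small local model for the
# "flag abusive posts/comments" style of task described above.
# The model name and payload shape are assumptions, not a real API spec.

def build_request(comment: str) -> dict:
    """Build a request body asking the model for a strict JSON verdict."""
    prompt = (
        "Classify the following comment. Reply with JSON only, "
        'shaped like {"abusive": true/false, "reason": "..."}.\n\n'
        f"Comment: {comment}"
    )
    return {"model": "gemma-small", "prompt": prompt, "stream": False}

def parse_verdict(model_reply: str) -> bool:
    """Parse the model's JSON reply; fail closed (flag it) if malformed."""
    try:
        return bool(json.loads(model_reply)["abusive"])
    except (json.JSONDecodeError, KeyError, TypeError):
        return True  # unparseable output gets a human look

print(parse_verdict('{"abusive": false, "reason": "just blunt"}'))  # → False
```

Failing closed on unparseable output matters more with small models, since they drift out of the requested JSON format more often than the big ones.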

I used Gemma 4 itself to review and prune the data (my social media posts over the last ~5 years, about 5 million words) being ingested into the training process for a LoRA for Gemma 4. I found the bigger model (31B) was more nuanced and useful than the smaller ones, and I wasn't in a big hurry by that stage of the process, so I used the big one overnight. Gemma 4 31B was also a better judge of my writing than Gemini Flash 2.5, by my reckoning.

It was, again, more nuanced, and was able to recognize a generally helpful comment that opened kinda jokey/rude, while the smaller model and Gemini 2.5 Flash tended to gravitate toward extremes (1 or 5) rather than the 1-5 scale they were prompted to rate on. I assume Gemini 3.1 Flash is probably competitive or better, but I didn't try it, since I liked the results the self-hosted Gemma 4 was giving for free.

The little ones also run great on very modest hardware. Both run at comfortable interactive speed on mid-range tablets. E4B is blazing fast on an iPad M4 or Pixel 10 Pro and entirely usable on a midrange Android with sufficient RAM.


I feel like you didn't understand the goal of this study

> The DTN-UK stated earlier this year that generic LLMs must never be used as autonomous advisory calculators for insulin delivery. This data is the quantitative evidence base for that statement.

This study is to prove that you should not rely on LLMs


The thing is, it doesn't really prove LLMs can't do this; it proves no existing frontier LLM can do this.

The part where they talk about sampling multiple runs is interesting - it suggests to me that in the next few years, as the reasoning process improves, the models may be able to do that autonomously.
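A rough sketch of how that multi-run sampling could be aggregated: sample the same carb estimate several times and treat disagreement between runs as a signal to fall back to a human or a dedicated tool. The numbers and the 80% agreement threshold here are made-up illustration, not anything from the paper:

```python
from statistics import median

def aggregate_estimates(estimates_g: list[float], tolerance_g: float = 10.0):
    """Aggregate repeated carb estimates (in grams) from multiple runs.

    Returns (median, agreed), where agreed means at least 80% of runs
    fall within tolerance of the median. Disagreement means: don't trust
    the number, escalate to a human or specialized tool.
    """
    mid = median(estimates_g)
    within = sum(1 for e in estimates_g if abs(e - mid) <= tolerance_g)
    return mid, within >= len(estimates_g) * 0.8

# Runs scattered across 45-110g: the median alone hides the disagreement,
# but the agreement check surfaces it.
print(aggregate_estimates([45, 50, 90, 110, 47]))  # → (50, False)
```

The interesting part is less the voting itself and more that the spread between runs gives you a cheap, model-agnostic confidence signal.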

My mind really goes to using a dedicated object-detection model fine-tuned with nutrition information, but I don't think there's a fundamental reason LLMs can't eventually manage this use case, except perhaps the size of the needed weights being prohibitively large.


Per some people, LLMs of the future can do literally anything that's possible to do. They could create quantum computers powered by fusion power.

That has nothing to do with the question being asked: can you rely on an LLM today to help you track carbs as a diabetic?

This is very explicitly what the article is all about. Potential future LLMs are entirely irrelevant.


This isn't something so fanciful as fusion power, this is reasonably something that might be within the capabilities of object detection transformers. Whether a different prompt/finetuning with a good dataset could make this work is very relevant here.

that is good to know. presented this way i find LLM behavior to be a feature, not a bug. then again i think everything is value add over pen and paper / notepad / spreadsheet and maybe a friend or doctor (or specialized equipment if you need more than calorie in, calorie out). just go exercise and don't be a lard lad

> just go exercise and don't be a lard lad

You can out-exercise almost any diet, but it takes 3-4 hours a day of a hard workout.

If calories in, calories out was useful advice rather than a banal statement of physics, nobody would be fat.


> If calories in, calories out was useful advice rather than a banal statement of physics

it's also wrong, or at least imprecise. fat and bone and muscle all weigh different amounts at the same volume.

the only way i've been able to explain the science to laypersons is thus:

if you and a friend both weigh 200lbs, but you once weighed 250lbs and your friend has never weighed more than 200lbs, all else equal: you must ingest fewer calories than your friend to maintain a 200lb body weight.

your body will try to "outlast the famine" if you had to lose a lot of weight (or lost a lot of weight for any reason).
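To put rough numbers on that scenario: the Mifflin-St Jeor equation is a real, commonly used BMR estimate, but the 12% "adaptive thermogenesis" penalty applied to the formerly heavier person is a made-up illustrative figure, not a measured value:

```python
def bmr_mifflin(weight_kg: float, height_cm: float, age: int) -> float:
    """Mifflin-St Jeor resting energy estimate (male form), kcal/day."""
    return 10 * weight_kg + 6.25 * height_cm - 5 * age + 5

# Two hypothetical 200lb (~90.7kg), 5'11" (~180cm), 35-year-old men.
never_dieted = bmr_mifflin(90.7, 180, 35)
formerly_250 = never_dieted * (1 - 0.12)  # hypothetical 12% adaptation

print(round(never_dieted))  # → 1862
print(round(formerly_250))  # → 1639
```

Same scale weight, meaningfully different maintenance intake: that gap is the part "calories in, calories out" glosses over.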

That absolutely does not comport with "calories in, calories out". It's also why the people who were never fat have no problem "just eating a donut."

no, i won't cite, this has been published many times in the last 10-13 years.


Eh, it totally does comport with calories in, calories out. We just don't hook people up to metabolic carts in day-to-day life, but that's really the only way to measure the calories out part.

The physics of CICO are undeniable. Humans don't photosynthesize. The biology of CICO is useless.

- former fat guy. I'm deeply, personally aware of how useless CICO is as dietary advice.


go run 30 minutes a day and get back to me

i prefaced for those who have legit medical issues

more people adopt your "banal" viewpoint and are just lazy

c-r vibe coding


Nope, been there, done that, the only time I out-exercised a diet was 3-4 hours of American football a day, 5 days a week.

> just go exercise and don't be a lard lad

This is about people suffering from diabetes tracking their insulin needs. You can outrun any diet, but not insulin shots.


The paper itself is a lot clearer about the purpose. The blog post reads very clickbaity and doesn't really explain the context well.

I disagree, it clearly explains that AI carb counting apps are a problem and shouldn’t be used.

They’re writing in a neutral way that reaches their audience without lecturing or being condescending. They lead the reader to the conclusion rather than shoving it at them. I think that’s why it’s triggering so many angry comments on HN, but it’s effective for the audience they’re writing for (non technical people who may need convincing but don’t like being preached at)


But it's stupid. If I smack myself in the head with a hammer, is that proof hammers shouldn't be relied on?

If you smack yourself in the head with a hammer and it injures you that's evidence that smacking people in the head with hammers is bad and shouldn't be done, right?

Are there start-ups led by idiots suggesting that smacking yourself in the head with a hammer will help treat your diabetes?

If not, then perhaps there's a problem in your analogy.


Here we’re at the origin of the tool and get to watch how many people hit themselves in the head before we learn this collective wisdom.

There’s a gap between what the tool will allow you to tell it to do, and what it’s good at. The feedback mechanism to tell the difference is deficient compared to a hammer.


No, but it would be proof you didn't get the point of the paper.

Not sure if this is competitive, look at the numbers for Qwen3.6

Has anyone tried these models?

I like their honesty in benchmarks, looks like Qwen3.6 35B is outperforming their Laguna M.1 225B model


"Principles" of "Open" AI

* This will change anytime we want, whether you agree or not

* For employees who follow the current "principles": when we change them, if you are strict about principles, then please leave; we will hire new people


You are missing the timeline factor here.

2016 - let's use EC2, it's just a VM, we can move off

2018 - I see you are hosting your own PostgreSQL on EC2, you can use our managed solution

2020 - you are already using 18 of our services (note, at this point you might still be using non-vendor-specific products, like VMs, a managed DB, and so on), why not use our IAM instead of rolling your own auth

2024 - you are now deeply locked in, let's add more lock-in: why don't you use this tool to optimize your costs (welcome, DynamoDB)

At this point, no one would ever question the next tool from the salesman. Because engineers see that the company doesn't have a strategy to move to another cloud, why should they reject this new tool?

also consider the people who are involved: a lot of the time, after 2 years you have totally new people on your team. They won't have the context and constraints you had in the past when deciding to buy "just a VM"; they see it as "we already use AWS".


That. Also add the part where engineers are terrified to their bones of moving elsewhere, because they don't know how to use anything else, and will act as an extension of the salesman to make sure they never need to learn anything new.

2025 - You start using Amazon S3 with pre-signed URLs and serve those directly to your customers. Great, now your customers are also locked into AWS.

This shows the Western government system is broken.

In ideal world (where we don't live):

* Corporation - optimizes for mid-to-short term profits (remove slack, run everything thin)

* Government - optimizes for long term profits (introduces regulations to keep the slack time, keeps and attracts talent so the state gets better)

* Individual - optimizes for their life time (career, family and tries to leverage market conditions to learn skills and get more opportunities from existing pool)

In the West, the government is optimizing for "loads and loads of moooney", because of lobby groups and the MBAs controlling the corporations that push these ideas through those lobbies


> In the west, government is optimizing for "loads and loads of moooney"

More appropriately, government is optimizing for 4 year electoral terms. No one cares about longer timescales necessary to tackle hard problems.

This is where autocracies like China, or monarchies for example, win over democracies.


Counter-examples are France and Japan. Democracies, electoral terms. High-speed rail that the world looks up to, investment in infrastructure everywhere. In France you have Grand Paris, a programme to transform the suburbs into denser housing and commercial space, a calculation and planning that INCLUDES public transport.

And the green initiatives in France. These, transit, Grand Paris, and much more are initiatives that take many years to realize.

Now let's move over to New Jersey and New York City. The most densely populated state (NJ) has some of the worst transit despite being in the NYC greater metropolitan area. An old tunnel between the two needs to be replaced, but politicians with four year mental horizons canned it until recently (ARC project). Infrastructure is a fight between Federal, two states and a city politically and partially from a funding perspective.

We could go on, but I just wanted to point out that the United States is a poor example of good governance. And that we don't need to live in a totalitarian nightmare just because we acknowledge the US fails to produce innovation and investment for the public good.

And let's not talk about debt, as if it is a unique problem to France or anything new.


> Counter-examples are France and Japan. Democracies, electoral terms.

A democracy doesn't prevent long-term planning, as long as the electorate appreciates long-term projects. Democracies can build things across parliaments if the differences between parties aren't so overwhelming that a majority of them can't agree on developing something, even as the relative power of elected parties varies across terms.

In a lot of countries the major parties agree on many core views and social codes because they share a common nation/society, and the political differences happen merely at the edges of the value spectrum. A government of highly polarised parties elected by a highly heterogeneous pool of voters is not ideal for long-term efforts.


Strong words, but NJ and NY have had Democrats in power continuously for a while. While Christie, a Republican (and a corrupt person) torpedoed the ARC project I mentioned, the other issues seem more related to a lack of centralization.

Something like public transit that powers a region producing a sizable fraction of US GDP should not be a state affair in the first place. What Europeans have that Americans lack are perma-technocrats with a long-term vision who are entrenched in power.


>This is where autocracies like China, or monarchies for example, win over democracies.

Autocracies like China are able to plan longer term. But because they don't regularly change their leadership like a democracy does, the leaders become old, tired, sclerotic and surrounded by 'yes men'. Hence "Democracy is the worst form of government, except for all the others."


Western democracy is very interesting.

Corporations promote people to Principal or distinguished engineer only when they prove their worth by running long running large scale projects.

But when it comes to governing the whole country: lobbying, marketing and boom, you are president for the next 4 years, which is anyway not enough to deliver anything big and see the impact. (Except destruction; destruction is easy to cause.)


I think that has something to do with the prerequisites of democracy.

I believe one important factor for a democracy to work properly is to have a large number of citizens who 1) can stand up and push back when they feel something is wrong, and 2) are sufficiently knowledgeable. We don't have that anymore. Of course I'm also partly to blame for that.


Democracy requires informed thoughtful voters to function.

Public education was supposed to deliver that. This is a dream that has failed in the US.

Possibly the most lacking tools are Critical Thinking (not directly taught as a subject AFAIK) and some class with a focus on how government(s) work. The latter was an elective I took in high school (not a core requirement, it should be).

At least when I was in college it helped to have critical thinking skills, but it was not a basics (100-level) course. Political studies might be a different degree, but again, not a core course. I find that ironic since everyone has to interact with government regulations and vote.


I think of the four year cycle as one year to whine about the previous (if different) government you took over from, two years of governing, and the last as "get ready for election". So in the most optimal scenario you get three "peaceful" years. Very few things can be done well in three years at "ruling a country" scale.

I wonder what longer cycles with easier recall methods might yield.

I dunno if cycle length is the key here; the Soviets and the Chinese went with five-year plans, and done properly, it seems like that's a long enough amount of time to accomplish very important things.

WW2 took slightly less than 6 years, when we count it from the invasion of Poland to the fall of Nazi Germany.

The moon landings took a little less than 7 years, so I don't think we are terribly off on the timeframe.

Considering the world's been getting faster (just think about how different the US was before Trump took power a bit more than a year ago), I think 4 years is fine.


It's also where autocracies fail spectacularly and lead to decades of misery for their citizens.

> This is where autocracies like China, or monarchies for example, win over democracies.

This is the wrong characterization, and in fact it's where monarchies lost out to democracies. Without an organized system of replacement in response to poor performance, autocracies with a poor leader are stuck with that poor leader for life. Ask North Korea how that's going. The upside is that if you have a brilliant leader, then you also get the benefit of that brilliant leader for life. The variance in an autocracy is absolutely huge, and that's their weakness in the long term. Democracies take the edge off, and are intentionally designed to have both less upside and less downside, trading performance for stability. Xi Jinping looks good comparatively because we have gormless losers like Trump and Biden to compare him to, but he makes plenty of his own mistakes as well (the whole Taiwan situation is an unforced error driven by his own ego, similar to Putin with Ukraine), and we've seen historically what China looks like when it's stuck with a shit leader for decades (Great Leap Forward, anyone?).


I always think that's a failure of citizens, not just the officials. Eventually history is going to blame us for not taking action, not pushing back, and pretty much sleeping soundly while things fell apart around us.

> The problem is a management pattern .... Short-term cost cutting

Absolutely agree with this. Most MBAs are taught to optimize and reduce the slack.

It works fine with machinery and materials, but not with humans.

When machinery is optimized and run thin and one of them breaks, you can get the exact same one in a couple of days (you usually prepare for it earlier). But with humans, they train their brains, and the next person is different from the first person.

Humans also break in different ways:

* They stop caring - you wouldn't notice it immediately, they will close tickets, but give bare minimum thought

* The communal brain will not be trained when there is not enough room for experiments and learning - which eventually reduces innovation

This is exactly the reason it is difficult for US companies to compete with Chinese companies in manufacturing: their communal brain has already been trained and has produced very good talent.

Next is knowledge: the more you outsource, the more you lose it.


Perhaps US companies should invest more in their employees then? Advancement, promotions beyond 1-3% COLAs, career paths, etc. would go a long way toward keeping employees interested in seeing their employers succeed instead of jumping ship every couple of years. That would require some effort from the C-suite, however, and since they jump ship every few years as well, I don't see that changing anytime soon.

Unfortunately, the Wall Street accountants who run our companies don't mind if you jump ship after your 2% 'reward' raise. When someone new comes in and costs 10% more plus recruiting costs, that latter person has 'proven' their worth in the market, similar to a house going up in value due to scarcity.

If you were to explain the costs of knowledge lost, of training, of taking a risk on a new unknown person, of relationships, there's no answer, because none of it shows up in any operating-expense worksheet.

What you're supposed to do is find another job, and explain that you love this job so much, but the other offer is really good, can they come up close to it and you'll stay. Repeat this every few years or find a new job and move to it.


"Invest in employees" is a very broad statement.

Before investing in employees, I think we should revisit management practices and strategies, which start in MBA programs and universities.

Instead of teaching how to increase shareholder value in the short term, they should also teach how to increase value to society in the long term (and focus on it heavily) - not just the generic "if you win, society wins" kind of fluff.

Without changing management strategies, everything becomes short-term after a while.


> teaching how to increase shareholder value in the short term

The problem is that this is quite often treated as a requirement for a public company: to maximize company value/profits above all else.

Also execs who are more right-wing are typically not interested in helping the larger society in general.


This is BS. A shareholder lawsuit against the CEO/board/executives for investing in the employees in the hope of long term profits would never succeed. The idea of a fiduciary duty doesn't mean that. It means the CEO can't take actions that intentionally hurt the company.

And there are very few large public companies with active enough investors to oust a CEO over this, and even fewer that have both active and activist investors that would be interested in such a thing.


There can be other valid perspectives than your own, especially when one makes sweeping black-and-white generalizations without any evidence.

> the CEO can't take actions that intentionally hurt the company

And who gets to define what "hurt" means?


The government.

You'd have to prove that the CEO knew it would hurt the company and did it anyway, or that it was so ridiculously negligent that any sane person should have known.


> Most MBAs are taught to optimize and reduce the slack.

With a myopic definition of "optimize". But as long as they are being rewarded for it, the incentives are broken.


Sure, then hopefully it makes them stronger
