Hacker News: hashmap's comments

Yeah, softmax may have useful applications, but anytime you find yourself using the same hammer for everything that looks like a nail it's a bit of a red flag.

If you take the instances of softmax you find in training and inference (and there turn out to be a few) and swap in other things like entmax or sparsemax, you see improvements across the board. And top-1 often is just the best answer too; there's a reason temp=0 is the way to go when you're doing tool calls. Do you really want creative unicode tokens when writing bash commands? From what I can tell, most of the time softmax is the worst answer that works.
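For anyone who hasn't compared these side by side, here's a minimal sketch in plain Python of softmax, sparsemax, and top-1 on the same logits. The logit values are made up for illustration, and treating temp=0 as a plain argmax is an assumption about how samplers usually special-case it, not a claim about any particular library.

```python
import math


def softmax(logits, temperature=1.0):
    """Dense distribution: every token keeps some probability mass.

    temperature must be > 0; temp=0 is handled separately by top1() below.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(logits):
    """Euclidean projection of the logits onto the probability simplex.

    Low-scoring entries get exactly zero probability.
    """
    z_sorted = sorted(logits, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumsum += z
        if 1 + k * z > cumsum:
            # k is still inside the support; update the threshold
            tau = (cumsum - 1) / k
    return [max(z - tau, 0.0) for z in logits]


def top1(logits):
    """Greedy choice: index of the largest logit (assumed temp=0 behavior)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Illustrative logits for four candidate tokens (made-up values).
logits = [1.5, 1.0, 0.1, -1.0]

dense = softmax(logits)      # all four entries are strictly positive
sparse = sparsemax(logits)   # [0.75, 0.25, 0.0, 0.0]
greedy = top1(logits)        # 0
```

The point of the comparison: softmax leaves nonzero mass on the junk tokens at the tail, sparsemax zeroes them out entirely, and top-1 skips the distribution altogether.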


Much easier.

> easier than identifying text distilled from the words of almost everyone alive.

Well, there's more than that going on. AI-generated text encodes a high-dimensional navigational trajectory that guides the model through its geometry smoothly, like a trail of breadcrumbs. Human speech doesn't do that. It's jagged and jumps around the manifold, and probably doesn't even land on the manifold a lot of the time, and models can recognize the difference pretty quickly.


If it’s so easy then why don’t we have a high quality classifier?

working to directly advance a product used substantially to oppress people via surveillance or war crimes, when you have many other choices, is immoral. easy.

It does, and it should. With each iteration, getting closer to the goalposts exposes the flaws in the goalposts, and then you try to make better goalposts. The problem people seem to have with the goalposts moving is that they assume the goalpost makers either made good goalposts or thought they made good goalposts, but the actual process is "do the best we can at the moment and update when we get better information".

You never see it how? Like in terms of raw resources or political will?

I mean the numbers. 12k per year is peanuts. You cannot live off that, and to do it we'd be nearly doubling the budget (that's old data, so it's probably not that portion of the budget anymore).

That 12k doesn't include healthcare, it doesn't include a lot of things. It's basically ensuring that people live well below poverty level, and for what? I just don't get how the numbers work, even if it was politically feasible.

I'd much rather have free healthcare and other amenities other countries have. Here in the US if you lose your job there is virtually nothing between you and the streets besides family and friends.

I'm facing this right now. I cannot get a job in tech, which means restarting my career. Getting a job right now is not easy in any field, especially not one that pays anything like a living wage. If I did not have my parents I would be on the streets right now. Thankfully I don't have a mortgage or anything like that. I'm not sure how much $12k per year would really help; it certainly wouldn't pay for housing.

It's rough out there.


Oh, yeah, $12k would not do it. For a UBI to work we would have to shift a significant portion of the concentrated wealth. I too was laid off long enough ago that by now I would be in a bad spot without help, also with no mortgage or anything, and I don't travel or go out much. UBI of any degree would do something, but it would have to be much higher than $12k to tread water just due to rent alone. Aside from UBI we would likely need to decouple housing from profit, since it has the same problem as healthcare: demand for it is inelastic to a certain degree, because everyone needs somewhere to live.

if the original problem happened because it ignored something you told it, telling it to not ignore something is a category error. the determinism isn't added by the message you're sending it, it's in the enforcement mechanism. this should be set to keep firing until the condition is met. so, ralph pretty much.

to that end i would also word this entirely differently. i would have it be informative rather than taking that posture. "The test suite has not yet been run, and the turn cannot proceed until a test run has completed following source changes. This message will repeat as long as this condition remains unmet." something like that. and even that would still frame-lock it poorly. you want it to be navigating from the lens that it's on a team trying to make something good, and the only way for that to happen is to have receipts for tests after changes so we don't miss anything, so please try again.
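a rough sketch of that enforcement loop, in case it helps. everything here is hypothetical: `TurnState`, `run_agent_step`, and the step counters are stand-ins rather than any real framework's API, and the `max_rounds` cap is an added safety assumption (the idea above is to keep firing until the condition is met).

```python
from dataclasses import dataclass

REMINDER = (
    "The test suite has not yet been run, and the turn cannot proceed until "
    "a test run has completed following source changes. This message will "
    "repeat as long as this condition remains unmet."
)


@dataclass
class TurnState:
    """Hypothetical record of what the agent has done in the current turn."""
    last_source_change: int = 0  # step counter of the latest source edit
    last_test_run: int = -1      # step counter of the latest completed test run

    def tests_are_current(self) -> bool:
        # True only if a test run completed at or after the latest source edit
        return self.last_test_run >= self.last_source_change


def enforce(state, run_agent_step, max_rounds=10):
    """Re-send REMINDER until the condition holds.

    The determinism comes from this check, not from the wording of the
    message. Returns the number of reminders that were sent.
    """
    reminders = 0
    while not state.tests_are_current():
        if reminders == max_rounds:
            raise RuntimeError("tests still stale after max_rounds reminders")
        run_agent_step(state, REMINDER)  # hypothetical: one agent step
        reminders += 1
    return reminders
```

the message stays informative and the gate does the enforcing: the agent can ignore the text as often as it likes, but the turn does not advance until `tests_are_current()` is true.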


> The language they do best with is the one with the largest corpus in the training set.

Not the case, strangely. They do best with Elixir. https://arxiv.org/pdf/2508.09101


if you think datacenters are a waste (they are), wait til you hear about department of war spending

The native way to skip all that is to train a small thingy to map hidden state -> token/thingy you care about once per model family, or just do it once and procrustes over the state from the model you're using to whatever you made the map for.


In Greek mythology, Procrustes (/proʊˈkrʌstiːz/; Greek: Προκρούστης Prokroustes, "the stretcher [who hammers out the metal]"), also known as Prokoptas, Damastes (Δαμαστής, "subduer") or Polypemon, was a rogue smith and bandit from Attica who attacked people by stretching them or cutting off their legs, so as to force them to fit the size of an iron bed

I can't figure out if you meant that or not, it kinda fits. (No pun intended)


well yes and no, i meant https://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem which, yes, is named for that stretchster
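for the curious, a minimal sketch of that alignment step using NumPy. it assumes both models have the same hidden size and that you have paired hidden states from the same inputs; the random arrays are synthetic stand-ins for real hidden states, and the names `hidden_a` and `hidden_b` are just for illustration.

```python
import numpy as np


def orthogonal_procrustes(source, target):
    """Return the orthogonal matrix R minimizing ||source @ R - target||_F."""
    # Standard closed-form solution: SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)

# Synthetic stand-in for hidden states from the model you are using.
hidden_a = rng.normal(size=(200, 16))

# Synthetic ground truth: a random orthogonal matrix, so the "reference"
# model's states are an exact rotation/reflection of hidden_a.
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
hidden_b = hidden_a @ q

# Recover the map from paired states alone.
r = orthogonal_procrustes(hidden_a, hidden_b)
aligned = hidden_a @ r  # now lives in the space the probe was trained on
```

with real models the fit won't be exact like it is here, so you'd want to check the residual `np.linalg.norm(aligned - hidden_b)` before trusting a probe trained on the other space.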


at certain scales, reality has to win out over whatever ideal you have in your head about how things should be. facebook is massive, a lot of society is on it, and it's a problem to make recourse invisible to the people most affected by the thing stealing their attention.

