Hacker News | pjs_'s comments

Be careful about how you interpret that paper. It looks really impressive -- real neurons in a petri dish seem to successfully (if amateurishly) murk a few imps.

https://www.youtube.com/watch?v=yRV8fSw6HaE

But there's more to the setup than you might assume from a casual reading. Here's the code used for that demo:

https://github.com/SeanCole02/doom-neuron

So there is an entire pytorch stack wrapped around the mysterious little blob of neurons -- they aren't just wired straight into WASD. There is a conventional convnet-based encoder, running on a GPU, in the critical path. The README tries to argue that the "neurons are doing the learning" but to my dilettante, critical eye it really looks as though there is a hell of a lot of learning happening in the convnet also.

Are the neurons learning to play Doom, or are they learning to inject ever so slightly more effective noise into the critical path? Would this work just as well if we replaced the neurons with some other non-Markovian sludge? The authors do ablation experiments to try to get to the bottom of this, but I can't really tell how compelling the results are (due to my own ignorance/stupidity, of course).
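The ablation being asked about could be sketched roughly as follows. Everything here — `run_episode`, the threshold "decoder", the Poisson spike statistics — is a hypothetical harness for illustration, not the repo's actual API. The point: if the two scores are statistically indistinguishable, the learning lives in the decoder, not the dish.

```python
import numpy as np

def neuron_response(stimulus):
    """Stand-in for the real MEA readout: spike counts per electrode."""
    rng = np.random.default_rng(stimulus)
    return rng.poisson(lam=2.0, size=64)       # 8x8 electrode grid

def random_sludge(stimulus):
    """Control condition: structureless noise with the same shape/stats."""
    return np.random.poisson(lam=2.0, size=64)

def run_episode(readout_fn, n_steps=100):
    """Toy loop: the 'decoder' collapses 64 channels into one of 4 actions."""
    score = 0
    for t in range(n_steps):
        spikes = readout_fn(stimulus=t)
        action = int(spikes.sum()) % 4
        score += (action == 2)                 # pretend action 2 is 'correct'
    return score

print(run_episode(neuron_response), run_episode(random_sludge))
```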


All opinions are my own:

The whole point of the CNNs is to act as an autoencoder for the input and a decoder for the output. The only reason this is done in the first place is that the number of electrodes in the dish is pitiful and has no chance of describing something as complex as Doom. They are there to create a latent space that can be fed through 60-odd electrodes, and to decode the neuron latent space into button presses.
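The scale of that bottleneck is easy to picture. The real system uses a learned convnet, but even naive block-averaging shows the dimensionality crunch (a 320x200 frame and an 8x8 electrode grid are the figures cited elsewhere in the thread; this sketch is an illustration, not the paper's encoder):

```python
import numpy as np

def encode_frame(frame, grid=(8, 8)):
    """Crudest possible 'encoder': block-average a grayscale Doom frame
    down to one stimulation value per electrode."""
    h, w = frame.shape
    gh, gw = grid
    # trim so the frame tiles evenly, then average each block
    frame = frame[: h - h % gh, : w - w % gw]
    blocks = frame.reshape(gh, frame.shape[0] // gh, gw, frame.shape[1] // gw)
    return blocks.mean(axis=(1, 3))

frame = np.random.rand(200, 320)        # 320x200 framebuffer
latent = encode_frame(frame)
print(frame.size, "->", latent.size)    # 64000 -> 64: a 1000x compression
```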

The pong version of the game was the proof of concept that neurons can learn without a latent space intermediate in either direction. Both the world state and neuronal control were raw signals: https://pubmed.ncbi.nlm.nih.gov/36228614/

What I wanted to do after dish brain pong, but never had the budget for, was using live animals as the computational substrate. Use the visual cortex of one as the input, send the neural spikes to a second animal's frontal lobe for computation, and finally send those signals to a third animal's motor cortex to physically press buttons. It's a shame we never raised enough, because it wouldn't have cost more than $15m to build the hardware and do the biological proof of concept.


> using live animals as the computational substrate. Use the visual cortex of one as the input, send the neural spikes to a second animals frontal lobe for computation and finally send those signals to a third animals motor cortex to physically press buttons.

That sounds terrifying.


It does, but most of what we do to animals is terrifying. I can see why getting funding for this idea might not have been that easy, though; "I want to mind-control three animals to play Doom" is certainly a pitch.

That is the fallacy of relative privation. The fact that most of what we do to animals is terrifying should be the motivating factor to NOT do more of it, such as the atrocity described above.

Seems like the relative privation fallacy is more and more a key component of an accelerating race to the bottom, across wide swathes of society.

Sure would be cool if more people could recognise it.


> The only reason why this is done in the first place is because the number of electrodes in the dish is pitiful and has no chance of describing something as complex as Doom.

This sounds a bit suspicious though. If we're confident that the neurons aren't complex enough to understand Doom, how can they be said to be complex enough to play it? "Playing a game" is a loose term, but it seems difficult to say something is playing a game it can't comprehend or interact with. By analogy, if there was a CNN between me and a game of Doom, people would say "roenxi is cheating with an AI aim-bot", not "roenxi is playing Doom".

The whole thing is still pretty cool though. Hopefully the neurons are having fun, I'm sure we all wish them what happiness they can muster.


There aren't enough input electrodes to encode a Doom frame into the multi-electrode array without compression.

That's all the artificial neural networks are doing.

If we could have gotten an MEA with 320x200 electrodes we wouldn't have used any encoding and just let the neurons figure it out. Instead it is an 8x8 grid.


We've got LLMs that seem to be smarter than anyone I'm talking to day-to-day, and one useful model of them is just "compression". Compression is turning out to be a pretty key operation in intelligence and understanding (in fact, it seems to be intelligence and understanding in key ways). If we compress Doom into "shoot" and "press the buttons in the way most favourable to the player", then good compression could let a fair coin play Doom well if someone flips it fast enough.
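A toy version of that point: if the decoder already encodes "press buttons in the most favourable way", the upstream signal barely matters. The action names and the setup below are made up for illustration.

```python
import random

GOOD_ACTIONS = ["shoot", "strafe"]   # the decoder's built-in prior

def degenerate_decoder(coin_flip):
    """Maps any 1-bit 'latent' onto an action the decoder deems good."""
    return GOOD_ACTIONS[coin_flip]

random.seed(0)
actions = [degenerate_decoder(random.randint(0, 1)) for _ in range(1000)]
print(set(actions))   # the fair coin never produces a bad move
```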

I mean maybe ANN just means sampling the screen in which case I'm not sure why we're talking about it as a "network". But the type of compression seems critical.

Have I watched any of the videos or read the code? No I have not.


Yes...quite a shame that we never made an amalgamation cyborg horror out of parts and pieces of several different animals. That's definitely not the plot of every sci-fi horror movie.

This sounds nightmarish. Maybe we build a human centipede next, if we can get the VC funding?

Or a Torment Nexus!!

man it's shocking how every day we keep spiraling toward that

Great idea, intense pain does provide a stronger response in the neuronal substrate. The prisoners, or uhh, “research subjects”, won’t mind. It’s for science. /s

I would have been quite happy to use my own brain as the computational substrate and I had more than a few other people keen to be the input and output parts of the system.

It's rather unfortunate that in the West it is impossible to get elective brain surgery. The countries that will do it have at best a spotty record. I talked to someone who had it done in Brazil and their electrodes became dislodged after a few months.

There is nothing new or horrifying about self experimentation. Newton for one did it in conditions that were far more dangerous: https://psmag.com/social-justice/newtons-needle-scientific-s...


I'm totally fine with consensual human experimentation that somehow threads the needle around exploitation of the poor - just not sure how we do the latter part short of requiring experimentees to pass a minimum net-worth threshold?

I think the closest would be: if anyone involved ever complains to authorities at all, everyone involved gets in trouble. If no one ever complains, no trouble. Everyone involved is forever at the mercy of everyone else involved.

I see perverse incentives to ablate complaint origination to expression pathways, or complaint system dependencies.

Or not so perverse, as this makes running these ventures much safer. Safety first!


Complaining to the authorities needs to come with a cost, otherwise people who don’t believe in the research or are looking for a payday will join just to complain

I consider that a feature in this idea, you have to believe in whatever you want to human experimentation for, enough to select who you include very carefully, and ultimately still take that risk.

Competitor will pay a former customer to complain.

Or...at the mercy of a scary man with a big wrench. Every single post you've put in this thread is a volatile mix of idealistic, naive, and sociopathic. So, obviously you'll be a tech CEO in 10 years.

??? This is the only post I made in this thread?

Make sure subjects actually pay, instead of getting paid for being a subject?

>What I wanted to do after dish brain pong, but never had the budget for, was using live animals as the computational substrate.

What does the ethical due diligence process look like, for something like this?


Haha, you made me laugh quite a bit, as if ethical due diligence was even a blip in the mental model of someone who talks like that about sentient life forms.

I figure I'd get some mice fed to pythons.

You know, the ones swallowed head first and alive that then drown in stomach acid, the ones you can buy in any pet store?


Yeah, "nature is brutal, therefore what does it matter if I raise the bar on the suffering we bring to this world" -- great logic right there, mate, especially when we all know such experiments are, without the slightest shred of doubt, aimed at ending up using human neurons, because those are the most powerful.

Also worth mentioning: in the District of Columbia and a few other places it is illegal to sell live animals, including mice, so there is some effort to do better in our behavior towards other living beings, unlike you.


The fact that they get eaten doesn't justify torturing mice by hooking their brains up to an LLM to make a slightly better climate catastrophe, or making them play doom.

Real talk, torturing animals is a hallmark of sociopaths. You should really seek professional help, and I say this 100% seriously.


Reminds me of the head transplant experiments. The stuff of nightmares but also fascinating.

> What I wanted to do after dish brain pong, but never had the budget for, was using live animals as the computational substrate.

This is horrific. What’s your end goal? Prisoners as data centers?

I hope you rethink this.


>What I wanted to do after dish brain pong, but never had the budget for, was using live animals as the computational substrate. Use the visual cortex of one as the input, send the neural spikes to a second animals frontal lobe for computation and finally send those signals to a third animals motor cortex to physically press buttons. It's a shame we never raised enough

It's amazing that someone would feel comfortable sharing this.


> What I wanted to do after dish brain pong, but never had the budget for, was using live animals

Torture, so casually mentioned. For what, I wonder.


When we can turn off distress and pain in farm animals, we will have done more to improve well-being in the world than anyone alive today. Factory farms stop being an efficient evil and become the only moral way to produce meat.

And as a side effect we also get super intelligence on a substrate that is 10 orders of magnitude more energy efficient than silicon.


This is literal supervillain wire-heading type stuff. Poorly thought-out ideas of a madman with no regard for the consequences; just vague claims about the definitely super-good idea of direct brain-state manipulation.

Do you mean we'll feed their brains an alternate reality, or just remove their pain and distress signals?

Regardless of the answer, lab meat still seems more ethical; it was never a sentient being in the first place.


That person has no idea of, nor any care for, what is or isn't ethical, beyond other people maybe being upset about it and trying to stop them.

Gosh, it's been years, but I think they did the dual-animal experiment with rats about a decade ago. I'm likely misremembering, but they tickled a rat in Japan, fed the impulses over the internet, and had another rat in maybe Brazil move its tail in response. From what I recall it did potentiate over time, implying learning at the reflex level. Sorry I can't find the link though!

Hahaha I love how you made something that wouldn’t be harmful sound like a nightmare horror show.

Edit: sweet Jesus, never mind, I misread it.


Someone should try to replace the neurons with urand and see if the chip can still play Doom, in the spirit of the qday prize winner.

Reminds me of the Ship of Theseus philosophical thought experiment, where they replace neurons with logic gates one by one and ask when exactly consciousness stops existing.

I don't think it's clear that logic gates can ever replace neurons in the first place

This reminds me of https://hackernews.hn/item?id=47897647, where a quantum computing demo worked equally well if you replaced the QC with an entropy source.

> but to my dilettante, critical eye it really looks as though there is a hell of a lot of learning happening in the convnet also.

Yeah it feels like they constructed the conclusion and worked backwards from there. I'm not seeing how their claim has much merit.


So far the wonders of claude/codex have been mostly constrained to applications that are built within the boundary conditions of existing libraries -- the models make direct use of the good work that humans have done to date to build Python, `requests`, `ffmpeg`, you name it.

But I'm excited for the (I think inevitable) stage where the shoggoth starts to reach outside those constraints -- rewriting, patching, renaming, rebuilding libraries, DLLs, binaries -- and we move into a regime where the libraries dissolve, the application floats on top of the shifting sands of an ever more efficient, secure, unified and totally inhuman technology stack.

Obviously this is a horrifying idea in some ways (interpretability, security etc), but it's also not obvious to me that it can't work, especially if there are dedicated, centralized efforts to do this. It's also not clear that interpretability is necessarily mutually exclusive with full slopification/machine rewrite of decades of foundational, incremental development.


Journalists are so funny, man. "Last week we told you that AI is fake and fraud. But we just learned something fascinating. A few months after everyone else was talking about it, we uncovered an amazing scoop: the companies which raised a lot of capital are also doing tens of billions of dollars of revenue. So now we're starting to think it might not all be a scam! Tune in next week for when we say it's all 100% fraudulent again."


Continue to believe that Cerebras is one of the most underrated companies of our time. It's a dinner-plate sized chip. It actually works. It's actually much faster than anything else for real workloads. Amazing


Nvidia seems cooked.

Google is crushing them on inference. By TPUv9, they could be 4x more energy efficient and cheaper overall (even if Nvidia cuts their margins from 75% to 40%).

Cerebras will be substantially better for agentic workflows in terms of speed.

And if you don't care as much about speed and only cost and energy, Google will still crush Nvidia.

And Nvidia won't be cheaper for training new models either. The vast majority of chips will be used for inference by 2028 instead of training anyway.

Nvidia has no manufacturing moat. Anyone can buy TSMC's output.

Power is the bottleneck in the US (and everywhere besides China). By TPUv9 - Google is projected to be 4x more energy efficient. It's a no-brainer who you're going with starting with TPUv8 when Google lets you run on-prem.

These are GW scale data centers. You can't just build 4 large-scale nuclear power plants in a year in the US (or anywhere, even China). You can't just build 4 GW solar farms in a year in the US to power your less efficient data center. Maybe you could in China (if the economics were on your side, but they aren't). You sure as hell can't do it anywhere else (maybe India).

What am I missing? I don't understand how Nvidia could've been so far ahead and just let every part of the market slip away.


> let every part of the market slip away.

Which part of the market has slipped away, exactly? Everything you wrote is supposition and extrapolation. Nvidia has a chokehold on the entire market. All other players still exist in the small pockets that Nvidia doesn't have enough production capacity to serve. And their dev ecosystem is still so far ahead of anyone else's. Which provider gets chosen to equip a 100k-chip data center goes far beyond raw chip power.


If code is getting cheaper, making CUDA alternatives and tooling should not be far off. I can't see Nvidia holding the position for much longer.


> Nvidia has a chokehold on the entire market.

You're obviously not looking at expected forward orders for 2026 and 2027.


I think most estimates have Nvidia at more or less stable share of CoWoS capacity (around 60%), which is ~doubling in '26.


Man I hope someone drinks Nvidia's milk shake. They need to get humbled back to the point where they're desperate to sell gpus to consumers again.

Only major roadblock is CUDA...


The nice thing about modern LLMs is that they're a relatively large, static use case. The compute is large and expensive enough that you can afford to just write custom kernels, to a degree. It's not like the classic CUDA situation, where you're running on 1, 2, or 8 GPUs, you need libraries that already do it all for you, and researchers are building lots of different models.

There aren't all that many different small components between all of the different transformer based LLMs out there.


Yeah, given that frontier model training has shrunk down to a handful of labs it seems like a very solvable problem to just build the stack directly without CUDA. LLMs are mechanically simple and these labs have access to as much engineering muscle as they need. Pretty small price to pay to access cheaper hardware given that model runs cost on the order of $100M and every lab is paying Nvidia many multiples over that to fill up their new datacenters.


> What am I missing?

Largest production capacity maybe?

Also, market demand will be so high that every player's chips will be sold out.


> Largest production capacity maybe?

Anyone can buy TSMC's output...


Which I'm sure is 100% reserved through at least 2030.


Aren't they building new fabs, though? Or even those are already booked?


Can anyone buy TSMC though?


No. TSMC will not take the risk on allocating capacity to just anyone given the opportunity cost.


Not without an army


What puzzles me is that AMD can't secure any meaningful size of AI market. They missed this train badly.


> What am I missing?

VRAM capacity given the Cerebras/Groq architecture compared to Nvidia.

In parallel, RAM contracts that Nvidia has negotiated well into the future that other manufacturers have been unable to secure.


Well they `acquired` groq for a reason.


I believe they licensed something from Groq


It's "dinner-plate sized" because it's just a full silicon wafer. It's nice to see that wafer-scale integration is now being used for real work but it's been researched for decades.


I'm fascinated by how the economy is catching up to demand for inference. The vast majority of today's capacity comes from silicon that merely happens to be good at inference, and it's clear that there's a lot of room for innovation when you design silicon for inference from the ground up.

With CapEx going crazy, I wonder where costs will stabilize and what OpEx will look like once these initial investments are paid back (or go bust). The common consensus seems to be that there will be a rug pull and frontier model inference costs will spike, but I'm not entirely convinced.

I suspect it largely comes down to how much more efficient custom silicon is compared to GPUs, as well as how accurately the supply chain is able to predict future demand relative to future efficiency gains. To me, it is not at all obvious what will happen. I don't see any reason why a rug pull is any more or less likely than today's supply chain over-estimating tomorrow's capacity needs, and creating a hardware (and maybe energy) surplus in 5-10 years.


If history has taught us anything, “engineered systems” (like mainframes & hyper converged infrastructure) emerge at the start of a new computing paradigm … but long-term, commodity compute wins the game.


Chips and RAM grew in capacity but latency is mostly flat and interconnect power consumption grew a lot. So I think the paradigm changed. Even with newer ones like NVlink.

For 28 years, Intel Xeon chips have come with massive L2/L3 caches. Nvidia is making bigger chips, the latest being 2 big dies interconnected. Cerebras saw the pattern and took it to the next level.

And the technology is moving 3D towards stacking layers on the wafer so there is room to grow that way, too.


I think that was true when you could rely on good old Moore’s law to make the heavy iron quickly obsolete but I also think those days are coming to an end


Not for what they are using it for. It is $1m+/chip and they can fit 1 of them in a rack. Rack space in DCs is a premium asset. The density isn't there. AI models need tons of memory (this product announcement is a case in point) and they don't have it, nor do they have a way to get it, since they are last in line at the fabs.

Their only chance is an acquihire, but Nvidia just spent $20b on Groq instead. Dead man walking.


Oh don't worry. Ever since the power issue started developing rack space is no longer at a premium. Or at least, it's no longer the limiting factor. Power is.


The dirty secret is that there is plenty of power. But, it isn't all in one place and it is often stranded in DC's that can't do the density needed for AI compute.

Training models needs everything in one DC, inference doesn't.


Which DCs are these?


The real question is what’s their perf/dollar vs nvidia?


I guess it depends what you mean by "perf". If you optimize everything for the absolutely lowest latency given your power budget, your throughput is going to suck - and vice versa. Throughput is ultimately what matters when everything about AI is so clearly power-constrained, latency is a distraction. So TPU-like custom chips are likely the better choice.


> Throughput is ultimately what matters

I disagree. Yes it does matter, but because the popular interface is via chat, streaming the results of inference feels better to the squishy messy gross human operating the chat, even if it ends up taking longer. You can give all the benchmark results you want, humans aren't robots. They aren't data driven, they have feelings, and they're going to go with what feels better. That isn't true for all uses, but time to first byte is ridiculously important for human-computer interaction.


You just have to change the "popular interface" to something else. Chat is OK for trivia or genuinely time-sensitive questions, everything else goes through via email or some sort of webmail-like interface where requests are submitted and replies come back asynchronously. (This is already how batch APIs work, but they only offer a 50% discount compared to interactive, which is not enough to really make a good case for them - especially not for agentic workloads.)


By perf I mean: how much does it cost to serve a 1T model to 1M users at 50 tokens/sec?
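The shape of that calculation is simple, even though the inputs are the contested part. Every per-chip number below is a placeholder assumption, not a vendor spec:

```python
# Back-of-envelope serving cost for the scenario above.
users = 1_000_000
tokens_per_sec_per_user = 50
aggregate = users * tokens_per_sec_per_user   # 50M tokens/sec total

# Hypothetical accelerator figures -- the whole argument is about
# what these really are per vendor:
tok_per_sec_per_chip = 5_000
chip_cost_usd = 40_000

chips_needed = aggregate / tok_per_sec_per_chip
capex = chips_needed * chip_cost_usd
print(f"{chips_needed:,.0f} chips, ${capex / 1e9:.1f}B capex")
```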


Not all 1T models are equal. E.g., how many active parameters? What's the native quantization? How long is the max context? Also, it's quite likely that some smaller models in common use are even sub-1T. If your model is light enough, the lower throughput doesn't necessarily hurt you all that much and you can enjoy the lightning-fast speed.


Just pick some reasonable values. Also, keep in mind that this hardware must still be useful 3 years from now. What’s going to happen to cerebras in 3 years? What about nvidia? Which one is a safer bet?

On the other hand, competition is good - nvidia can’t have the whole pie forever.


> Just pick some reasonable values.

And that's the point - what's "reasonable" depends on the hardware and is far from fixed. Some users here are saying that this model is "blazing fast" but a bit weaker than expected, and one might've guessed as much.

> On the other hand, competition is good - nvidia can’t have the whole pie forever.

Sure, but arguably the closest thing to competition for nVidia is TPUs and future custom ASICs that will likely save a lot on energy used per model inference, while not focusing all that much on being super fast.


AMD


That's conflating two different use cases.

Many coding use cases care about tokens/second, not tokens/dollar.


Exactly. They won't ever tell you. It is never published.

Let's not forget that the CEO is an SEC felon who got caught trying to pull a fast one.


Or Google TPUs.


TPUs don't have enough memory either, but they have really great interconnects, so they can build a nice high density cluster.

Compare the photos of a Cerebras deployment to a TPU deployment.

https://www.nextplatform.com/wp-content/uploads/2023/07/cere...

https://assets.bwbx.io/images/users/iqjWHBFdfxIU/iOLs2FEQxQv...

The difference is striking.


Oh wow the cabling in the first link is really sloppy!


Power/cooling is the premium.

Can always build a bigger hall


Exactly my point. Their architecture requires someone to invest the capex / opex to also build another hall.


How do you know the price of a unit?


I remembered $1m from when I was in their booth at SC24, but when I just looked, I was wrong. It is worse...

https://www.datacenterdynamics.com/en/news/cerebras-unveils-...


You have no idea of the value this brings. ASML machines cost dozens of times more. So?


Ok, tell me the value.


Just wish they weren't so insanely expensive...


The bigger the chip, the worse the yield.


Cerebras has effectively 100% yield on these chips. They have an internal structure made by just repeating the same small modular units over and over again. This means they can just fuse off the broken bits without affecting overall function. It's not like it is with a CPU.


I think what you’re saying is that every wafer is usable, but won’t have the same performance characteristics, depending on how many bits are broken.

Doesn’t that just bucket wafers based on performance? Which effectively gives a yield to each bucket.


I suggest reading their website; they explain pretty well how they manage good yield. Though I'm not an expert in this field, it does make sense, and I would be surprised if they were caught lying.


This comment doesn't make sense.


One wafer will turn into multiple chips.

Defects are best measured on a per-wafer basis, not per-chip. So if your chips are huge and you can only put 4 chips on a wafer, 1 defect can cut your yield by 25%. If they're smaller and you fit 100 chips on a wafer, then 1 defect on the wafer only cuts yield by 1%. Of course, there's more to this when you start reading about "binning", fusing off cores, etc.

There's plenty of information out there about how CPU manufacturing works, why defects happen, and how they're handled. Suffice to say, the comment makes perfect sense.
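The standard back-of-envelope for this is the Poisson yield model: the probability a die has zero defects falls exponentially with its area. The defect density below is an illustrative assumption, not a fab's real number:

```python
import math

def poisson_yield(die_area_cm2, defects_per_cm2=0.1):
    """P(zero defects on a die) under the classic Poisson yield model."""
    return math.exp(-defects_per_cm2 * die_area_cm2)

for area in (1, 8, 100, 460):   # small die ... roughly wafer-scale
    print(f"{area:4d} cm^2 die: {poisson_yield(area):8.2%} expected yield")
```

At anything near wafer scale the zero-defect probability is effectively nil, which is why a wafer-scale part has to route around bad cores rather than hope for perfect wafers.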


That's why you typically fuse off defective sub-units and just have a slightly slower chip. GPU and CPU manufacturers have done this for at least 15 years now, that I'm aware of.


Sure it does. If it’s many small dies on a wafer, then imperfections don’t ruin the entire batch; you just bin those components. If the entire wafer is a single die, you have much less tolerance for errors.


Although, IIUC, Cerebras expects some amount of imperfection and can adjust the hardware (or maybe the software) to avoid those components after they're detected. https://www.cerebras.ai/blog/100x-defect-tolerance-how-cereb...


You can just do dynamic binning.


Bigger chip = more surface area = higher chance for somewhere in the chip to have a manufacturing defect

Yields on silicon are great, but not perfect


Does that mean smaller chips are made from smaller wafers?


They can be made from large wafers. A defect typically breaks whatever chip it's on, so one defect on a large wafer filled with many small chips will still just break one chip of the many on the wafer. If your chips are bigger, one defect still takes out a chip, but now you've lost more of the wafer area because the chip is bigger. So you get a super-linear scaling of loss from defects as the chips get bigger.

With careful design, you can tolerate some defects. A multi-core CPU might have the ability to disable a core that's affected by a defect, and then it can be sold as a different SKU with a lower core count. Cerebras uses an extreme version of this, where the wafer is divided up into about a million cores, and a routing system that can bypass defective cores.

They have a nice article about it here: https://www.cerebras.ai/blog/100x-defect-tolerance-how-cereb...


Nope. They use the same size wafers and then just put more chips on a wafer.


So, does a wafer with a huge chip have more defects per area than a wafer with 100s of small chips?


There’s an expected amount of defects per wafer. If a chip has a defect, then it is lost (simplification). A wafer with 100 chips may lose 10 to defects, giving a yield of 90%. The same wafer but with 1000 smaller chips would still have lost only 10 of them, giving 99% yield.


As another comment in this thread states, Cerebras seems to have solved this by building their big chip out of many much smaller cores that can be disabled if they have defects.


Indeed, the original comment you replied to actually made no sense in this case. But there seemed to be some confusion in the thread, so I tried to clear that up. I hope I’ll get to talk with one of the cerebras engineers one day, that chip is really one of a kind.


Yes, amazing tech. You should join their Discord, it's pretty active these days!


You say this with such confidence and then ask if smaller chips require smaller wafers.


Technically, Cerebras's solution is really cool. However, I am skeptical that it will be economically useful for larger models, as the number of racks required scales with the size of the model in order to fit the weights in SRAM.
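A rough sketch of that scaling concern. The 44 GB on-wafer SRAM figure is Cerebras's published WSE-3 spec; the model size and precision are assumptions for illustration:

```python
sram_per_wafer_gb = 44         # WSE-3 on-wafer SRAM (published figure)
params = 1e12                  # a hypothetical 1T-parameter model
bytes_per_param = 2            # fp16/bf16 weights

weight_gb = params * bytes_per_param / 1e9     # 2000 GB of weights
wafers_needed = weight_gb / sram_per_wafer_gb
print(f"~{wafers_needed:.0f} wafers just to hold the weights in SRAM")
```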


Yet investors keep backing NVIDIA.


At this point Tech investment and analysis is so divorced from any kind of reality that it's more akin to lemmings on the cliff than careful analysis of fundamentals


yep


Cerebras is a bit of a stunt like "datacenters in spaaaaace".

Terrible yield: one defect can ruin a whole wafer instead of just a chip region. Poor perf./cost (see above). Difficult to program. Little space for RAM.


They claim the opposite, though, saying the chip is designed to tolerate many defects and work around them.


Literally the only good piece of software left on Windows. A masterpiece.


And File Pilot :) https://filepilot.tech/ (t. FP Pro license holder)


I prefer XYPlorer.


I do like XYplorer as well and have a license for it too, but its startup time is just so slooooow that I can't reach for it like I reach for File Pilot.

Now written in twinBASIC for 64-bit support! https://hackernews.hn/item?id=42637089


I have it in my startup. No time wasted to open its window.


GOAT lubricant


Tried implementing this crap once. Never again


I love this… I have been thinking about exactly this technology for years, but combined with a phased-array directional loudspeaker and a shotgun mic. Deploy it during a major political speech, instantly shut down the speaker's brain; it would appear to be an internal malfunction.


Anyone else find that the iPhone camera app crashes about half the time these days? It's killing me...


Mine works fine.


Same


Google Maps is the mind killer. We all worry about social media controlling the way we think, feel, vote etc. but Google Maps literally manipulates where people physically go in real life, what they do on holiday, where they hang out, what they eat etc. I got so sick of feeling like a four point five star Google Maps automaton I had to mostly stop with it. In addition to OSM, personal recommendations etc. the best substitute for me for a 4.5 star review is my nose, eyes and ears

