> - 10 AIs running on 10 machines, each with 10 million GPUs
>
> OR
>
> - 10 million AIs running on 10 million machines, each with 10 GPUs
If we dramatically reduced the number of GPUs per AI instance, that would be great. But I think the difference in real life is not as extreme as you're making it. In your telling, the GPUs-per-AI is reduced by a factor of one million. I'm not sure that (or anything even close to it) is within the realm of possibility for Anthropic. The only reason anyone cares about them at all is because they have a frontier AI system. If they stopped, the AI frontier would be a bit farther back, maybe delayed by a few years, but Google and OpenAI would certainly not slow down 1000x, 100x, or probably even 10x.
I would be surprised if Gemini could not run Python in its web interface. Claude and ChatGPT can, and it makes them much more capable (e.g. you can ask Claude to make manim animations for you and it will).
It's still true. Your metabolic system is probably not simpler after taking tirzepatide. But just because it's not simpler doesn't mean it can't be better. I'm very glad for the C++ abstraction layer over assembly, even if the stack is more complicated than if it were just assembly.
Trontinemab is in trials right now, with 92% of patients achieving low amyloid levels. And more people should be able to take it, as it causes less brain swelling (ARIA-E). I'm unaffiliated, I just follow medical research in my free time. But I'm quite hopeful about this medication.
I also know someone who's significantly better now than they were a few years ago thanks to Alzheimer's medication. And trontinemab, which I believe is currently in phase III trials, seems even better than what is publicly available, as it crosses the blood-brain barrier more readily. We're entering a brighter future for Alzheimer's patients.
The compile errors are great. I can change one function signature and have my output fill up with compile errors (that would all be runtime errors in Python). Then I just let Claude cook on fixing them. Any time you have to run your program and tell Claude what's wrong with it, you're wasting time. But because Claude can run the compiler itself and iterate, it's much more able to complete a task without intervention.
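A minimal sketch of the point, with a hypothetical `area` function: if its signature changes (say, from `u32` to `f64` parameters), every stale call site becomes a compile error the agent can see and fix immediately, rather than a runtime failure discovered later.

```rust
// Hypothetical function whose signature was changed from u32 to f64 params.
// With the old signature, a call like `area(3, 4)` compiled; after the
// change it is a type error reported at compile time. In Python, the same
// mismatch would only surface at runtime, and only on that code path.
fn area(width: f64, height: f64) -> f64 {
    width * height
}

fn main() {
    // Call sites updated to match the new signature compile cleanly.
    let a = area(3.0, 4.0);
    assert_eq!(a, 12.0);
    println!("{a}");
}
```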
I think most people perceive program semantics to be defined not only for the program as a whole, but also for individual subexpressions. If the "LLM compiler" changes the output completely when a single word is added, then it is not deterministic for the subexpressions that don't contain that addition.
Granted, that is not the rigorous definition of determinism. And some existing compilers are non-deterministic under such a definition, namely when "undefined behaviour" would be triggered. That's precisely the problem with UB.
The disconnect might be that there is a separation between "generating the final answer for the user" and "researching/thinking to get information needed for that answer". Saying "deeply" prompts it to read more of the file (as in, actually use the `read` tool to grab more parts of the file into context), and generate more "thinking" tokens (as in, tokens that are not shown to the user but that the model writes to refine its thoughts and improve the quality of its answer).
Running Rust in wasm works really well. I feel like I'm the world's biggest cheerleader for it, but I was just amazed at how well it works. The one annoying thing is using web APIs through Rust. You can do it with web-sys and js-sys, but it's rarely as ergonomic as it is in JavaScript. I usually end up writing wrapper libraries that make it easy, sometimes even easier than JavaScript (e.g. in Rust I can use Web Locks with RAII).
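The RAII idea can be illustrated with a plain-Rust sketch (the names here are hypothetical; a real Web Locks wrapper would go through web-sys and be async/browser-only, but the release-on-`Drop` pattern is the same):

```rust
use std::sync::{Arc, Mutex};

// Sketch of an RAII lock guard: acquiring returns a guard, and dropping the
// guard releases the lock automatically. A web-sys Web Locks wrapper could
// release its lock in `Drop` the same way.
struct LockGuard {
    held: Arc<Mutex<bool>>,
}

impl LockGuard {
    fn acquire(flag: Arc<Mutex<bool>>) -> Self {
        *flag.lock().unwrap() = true; // take the lock
        LockGuard { held: flag }
    }
}

impl Drop for LockGuard {
    fn drop(&mut self) {
        *self.held.lock().unwrap() = false; // released when guard goes out of scope
    }
}

fn main() {
    let flag = Arc::new(Mutex::new(false));
    {
        let _guard = LockGuard::acquire(flag.clone());
        assert!(*flag.lock().unwrap()); // lock held inside the scope
    } // _guard dropped here
    assert!(!*flag.lock().unwrap()); // lock released automatically
    println!("ok");
}
```

The payoff is that you can't forget to release: every early return and panic path releases the lock, which plain JavaScript callbacks don't give you for free.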
It does work well logically but performance is pretty bad. I had a nontrivial Rust project running on Cloudflare Workers, and CPU time very often clocked 10-60ms per request. This is >50x what the equivalent JS worker probably would've clocked. And in that environment you pay for CPU time...
The Rust-JS layer can be slow. But the actual Rust code is much faster than the equivalent JS in my experience. My project would not be technically possible with JavaScript levels of performance.
I did. I don't remember the specifics too well, but a lot of it was cold starts, so just crunching the massive wasm binary was a big part of it. Otherwise it was the matchit library and JS interop marshalling taking the rest of the time.
edit: and it cold started quite often. Even with sustained traffic from the same source it would cold start every few requests.