
It depends on what you're trying to do, but I've got an M1, and doing inference with llama2-uncensored using Ollama, I get results within seconds.


Depends what you're doing. An M1 Max takes around a minute for one SDXL image, and the machine feels like it's choking while it does it, whereas a 3090 will do it in 9 seconds and doesn't feel like it's breaking a sweat.

Llama is definitely a bit of a different story, though.


I'm thinking more about the training side, because it could be compelling to buy a beefily specced M3 Max if it can replace what a dedicated GPU rig could do and also be a daily driver.



