Hacker News | iMil's comments

Good call, I really hesitated between the X570 and the X99, are you using P2P?

$ nvidia-smi topo -p2p r

        GPU0    GPU1
GPU0    X       CNS
GPU1    CNS     X

i guess not, i use llama.cpp with:

--spec-draft-n-max 3 --spec-type draft-mtp --split-mode tensor --tensor-split 1,1

and my generation speed is between 60 and 80 tk/s

will test this uncensored model and ngram added as well this weekend

btw, i also set my power limit to 220 W per card (with nvidia-smi); that will cost you around 1 tk/s but save you a LOT of power and heat :)
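For anyone wanting to try the same power cap, here is a minimal sketch. It assumes two cards at indices 0 and 1; `set_power_limit` and the `DRY_RUN` variable are hypothetical names used only for illustration. The only real command involved is `nvidia-smi -i <index> -pl <watts>`, which needs root and only accepts values inside the range the card allows.

```shell
# Hypothetical helper: apply the same power cap (in watts) to several GPUs.
# Usage: set_power_limit WATTS GPU_INDEX...
# With DRY_RUN=1 the commands are printed instead of executed.
set_power_limit() {
  watts=$1
  shift
  for gpu in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "nvidia-smi -i $gpu -pl $watts"
    else
      # -i selects the GPU, -pl sets its power limit in watts (needs root).
      nvidia-smi -i "$gpu" -pl "$watts"
    fi
  done
}
```

Running `DRY_RUN=1 set_power_limit 220 0 1` prints the two commands without touching the cards. The limit is normally reset on reboot, so it has to be reapplied at boot if you want it to stick.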


CNS means "chipset not supported", and I doubt that is the case. Are you sure you are using the patched nvidia module? Run modinfo nvidia to check which one is loaded.
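A small sketch of that check, under the assumption that the `filename`, `version` and `license` fields are enough to tell the builds apart; `nvidia_module_summary` is a hypothetical helper name. As far as I know the open kernel modules report a `Dual MIT/GPL` license while the proprietary ones report `NVIDIA`, and the `filename` path usually hints at whether the module came from a distribution package or a local build. Nothing here can confirm on its own that the P2P patch is applied.

```shell
# Hypothetical helper: keep only the modinfo fields that identify the build.
# Reads modinfo output on stdin.
nvidia_module_summary() {
  grep -E '^(filename|version|license):'
}
```

Usage: `modinfo nvidia | nvidia_module_summary`.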

I'm using bazzite on my ai-rig just because it has the gpu-optimized things set up (also nvidia-open). Looking at it, P2P seems to be available only for the 90 versions of the nvidia rtx gpu line, not the 80, and some versions of 50xx? (apparently the 5080?). Anyway, i downloaded that uncensored model and tweaked those kv settings etc. Still getting 60-80 tk/s, but i'm able to get my context to 180224 now; it used to be 131072, which gave me some trouble, so this is already a win :)

Ok, i went through the P2P rabbit hole, and in case anyone reads this: just use Fedora Rawhide and run the install.sh from the repo mentioned in the blog post :) Now i have:

        GPU0    GPU1
GPU0    X       OK
GPU1    OK      X
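To check a matrix like this without reading it by eye, here is a rough sketch that scans the output of `nvidia-smi topo -p2p r` and lists every GPU pair whose status is neither `X` (self) nor `OK`. The function name `p2p_report` is hypothetical, and the parsing assumes the layout shown in this thread: one header row of GPU names, then one row per GPU.

```shell
# Hypothetical helper: read a P2P matrix on stdin and list pairs that are not OK.
# Exits with status 1 if any such pair is found, 0 otherwise.
p2p_report() {
  awk '
    # Header row: every field is a GPU name (e.g. "GPU0 GPU1").
    $1 ~ /^GPU[0-9]+$/ && (NF == 1 || $2 ~ /^GPU[0-9]+$/) {
      for (i = 1; i <= NF; i++) col[i] = $i
      next
    }
    # Matrix row: a GPU name followed by one status per column.
    $1 ~ /^GPU[0-9]+$/ {
      for (i = 2; i <= NF; i++)
        if ($i != "X" && $i != "OK") {
          print $1 " -> " col[i - 1] ": " $i
          bad = 1
        }
    }
    END { exit bad + 0 }
  '
}
```

Usage: `nvidia-smi topo -p2p r | p2p_report && echo "P2P OK on all pairs"`. Lines that do not start with a GPU name, such as a legend, are ignored.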

I am actually surprised by the power draw: the box itself idles at 20W, which already amazes me for a Ryzen; when computing, I barely pass the 600W bar, and as I am not really using it to vibecode an entire system, I don't even notice the spikes on the power monitor (Shelly + Home Assistant).

2xRTX5080 would be awesome. You'd only be able to run a q6, which is already pretty good, but moreover you'd be able to use P2P and run Blackwell at full speed, which I can't.

With 2 Blackwells, it would make sense to run NVFP4 quants


sailor's goal is definitely not to bring security; we all know perfectly well that chroot is not a way of isolating a host filesystem from attackers. Instead, sailor provides a convenient way of testing environments without compromising your workstation / dev station filesystem.


Which was, as it happens, the reason chroot was invented (to test the installer of newer versions of UNIX without having a brand-new physical system). Better tooling around chroot is absolutely useful.

I guess it's a misnomer to use the word "container", since that usually means things like Linux containers with security isolation almost as good as OS virtualization.


Right, the word "container" is now widely associated with docker, yet I picked a word I know IT people will get and which is generic enough. I'll think of an alternative buzzword ;) Thanks for your support!


The word you're looking for is "chroot".


There is a difference between not bringing in additional security and bringing anti-security. In my eyes, you are doing the latter.

Your default examples elevate privilege without warning the user about this fact anywhere.


Duly noted, I just added a word about it on the GitHub page, and you're right, I should run the example services with a dedicated user as I already do for the nginx process. Thanks for your feedback!


And so it is, I just committed changes so both PM2 and gunicorn are started with a specific user.

