iMil's comments

iMil · 2026-06-13T18:34:54 1781375694

Good call, I really hesitated between the X570 and the X99, are you using P2P?

cybertim · 2026-06-13T18:48:51 1781376531

$ nvidia-smi topo -p2p r
      GPU0  GPU1
GPU0  X     CNS
GPU1  CNS   X

I guess not. I use llama.cpp with:

--spec-draft-n-max 3 --spec-type draft-mtp --split-mode tensor --tensor-split 1,1

and my generation (gen) speed is between 60 and 80 tk/s.

I will test this uncensored model, with ngram added as well, this weekend.

By the way, I also set my power limit to 220 W per card (with nvidia-smi). That will cost you around 1 tk/s but save you a LOT of power and heat :)
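For anyone who wants to script the power limit, here is a minimal sketch. It only builds the command lines and does not run them. It assumes the standard `nvidia-smi -i <index> -pl <watts>` form; the helper name `power_limit_commands` and the GPU indices 0 and 1 are illustrative, not taken from the thread. Applying the limit generally requires root privileges.

```python
import shlex


def power_limit_commands(watts, gpu_indices):
    """Build one `nvidia-smi -i <index> -pl <watts>` command per GPU.

    Hypothetical helper: the commands are returned as argument lists
    and are NOT executed here.
    """
    return [["nvidia-smi", "-i", str(index), "-pl", str(watts)]
            for index in gpu_indices]


# Assumed setup: two cards at indices 0 and 1, capped at 220 W each.
for cmd in power_limit_commands(220, [0, 1]):
    print(shlex.join(cmd))
```

The argument lists can be passed to `subprocess.run` as they are once you are happy with them.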

iMil · 2026-06-13T19:09:48 1781377788

CNS means "Chipset not supported", and I doubt that is the case. Are you sure you are using the patched nvidia module? Run modinfo nvidia to check which one is loaded.
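If you want to check this from a script rather than by eye, here is a rough sketch that parses the matrix printed by `nvidia-smi topo -p2p r`. It assumes the layout shown earlier in the thread (a header row of GPU names, then one row per GPU). The function names are made up for illustration, and anything other than OK between two different GPUs is simply treated as "no P2P".

```python
def parse_p2p_matrix(text):
    """Parse the `nvidia-smi topo -p2p r` matrix into {(row, col): status}."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # The header row is the first line made up only of GPU names.
    header_idx = next(i for i, cells in enumerate(rows)
                      if all(c.startswith("GPU") for c in cells))
    columns = rows[header_idx]
    matrix = {}
    for cells in rows[header_idx + 1:]:
        if not cells[0].startswith("GPU") or len(cells) != len(columns) + 1:
            break  # reached the legend or other trailing output
        for col, status in zip(columns, cells[1:]):
            matrix[(cells[0], col)] = status
    return matrix


def p2p_ok(matrix):
    """True only when every pair of distinct GPUs reports OK."""
    return all(status == "OK"
               for (row, col), status in matrix.items() if row != col)


# Sample copied from the output posted above (CNS on both pairs).
sample = """\
      GPU0  GPU1
GPU0  X     CNS
GPU1  CNS   X
"""
print(p2p_ok(parse_p2p_matrix(sample)))
```

With a single GPU there are no pairs to check, so `p2p_ok` returns True; handle that case separately if it matters to you.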

cybertim · 2026-06-13T19:42:29 1781379749

I'm using Bazzite on my AI rig just because it has the GPU-optimized things set up (also nvidia-open). From what I can see, P2P seems to be available only for the 90 models of the NVIDIA RTX GPU line, not the 80 models, and some versions of the 50xx? (apparently the 5080?). Anyway, I downloaded that uncensored model and tweaked those KV settings etc. I'm still getting 60-80 tk/s, but I'm able to get my context to 180224 now. It used to be 131072, which gave me some trouble, so this is already a win :)

cybertim · 2026-06-17T10:36:49 1781692609

OK, I went down the P2P rabbit hole, and in case anyone reads this: just use Fedora Rawhide and run the install.sh from the repo mentioned in the blog post :) Now I have:

      GPU0  GPU1
GPU0  X     OK
GPU1  OK    X

iMil · 2026-06-13T18:13:00 1781374380

I am actually surprised by the power draw. The box itself idles at 20W, which already amazes me for a Ryzen; when computing, I barely pass the 600W bar, and as I am not really using it to vibecode an entire system, I don't even notice the spikes on the power monitor (Shelly + Home Assistant).

iMil · 2026-06-13T17:06:54 1781370414

2x RTX 5080 would be awesome. You'd only be able to run a q6, which is already pretty good, but more importantly you'd be able to use P2P and run Blackwell at full speed, which I can't.

kcb · 2026-06-14T01:56:12 1781402172

With 2 Blackwells, it would make sense to run NVFP4 quants

iMil · 2026-06-13T16:12:36 1781367156

This one: https://es.aliexpress.com/item/1005010123289822.html?spm=a2g...

iMil · on Jan 16, 2016

sailor's goal is definitely not to bring security; we all know perfectly well that chroot is not a way to isolate a host filesystem from attackers. Instead, sailor provides a convenient way of testing environments without compromising your workstation / dev station filesystem.

geofft · on Jan 16, 2016

Which was, as it happens, the reason chroot was invented (to test the installer of newer versions of UNIX without having a brand-new physical system). Better tooling around chroot is absolutely useful.

I guess it's a misnomer to use the word "container", since that usually means things like Linux containers with security isolation almost as good as OS virtualization.

iMil · on Jan 16, 2016

Right, the word "container" is now widely associated with Docker, yet I picked a word I know IT people will get and which is generic enough. I'll think of an alternative buzzword ;) Thanks for your support!

jhardy54 · on Jan 16, 2016

The word you're looking for is "chroot".

q3k · on Jan 16, 2016

There is a difference between not bringing in additional security and bringing anti-security. In my eyes, you are doing the latter.

Your default examples elevate privileges without warning the user about this anywhere.

iMil · on Jan 16, 2016

Duly noted, I just added a word about it on the GitHub page, and you're right, I should run the example services with a dedicated user as I already do for the nginx process. Thanks for your feedback!

iMil · on Jan 17, 2016

And so it is, I just committed changes so both PM2 and gunicorn are started with a specific user.