UD stands for "Unsloth-Dynamic" which upcasts important layers to higher bits. N...

CamperBob2 · 2026-02-03T17:24:32 1770139472

Please consider authoring a single, straightforward introductory-level page somewhere that explains what all the filename components mean, and who should use which variants.

The green/yellow/red indicators for different levels of hardware support are really helpful, but far from enough IMO.

danielhanchen · 2026-02-03T17:40:51 1770140451

Oh good idea! In general, UD-Q4_K_XL (Unsloth Dynamic 4-bit Extra Large) is what I recommend for most hardware. MXFP4_MOE is also OK.
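As a rough illustration of how a tag like UD-Q4_K_XL breaks down into its parts, here is a minimal Python sketch. The function name `parse_quant_tag`, the regex, and the list of size suffixes are assumptions made for this example, not an official naming spec. It only handles tags shaped like `UD-Q4_K_XL` or `Q6_K`; anything else (for example `MXFP4_MOE`) is reported as unrecognized.

```python
import re
from typing import Optional

# Hypothetical pattern for tags such as "UD-Q4_K_XL" or "Q6_K".
# This is an illustrative assumption, not an official naming grammar.
TAG_RE = re.compile(
    r"^(?:(?P<prefix>UD)-)?"      # optional "UD-" (Unsloth Dynamic)
    r"Q(?P<bits>\d+)"             # nominal bit width, e.g. Q4 -> 4
    r"(?:_(?P<family>K))?"        # optional "_K" marker
    r"(?:_(?P<size>XL|L|M|S))?$"  # optional size suffix
)


def parse_quant_tag(tag: str) -> Optional[dict]:
    """Split a quant tag into its components, or return None if it does not match."""
    m = TAG_RE.match(tag)
    if m is None:
        return None
    return {
        "dynamic": m.group("prefix") == "UD",
        "bits": int(m.group("bits")),
        "k_quant": m.group("family") == "K",
        "size": m.group("size"),
    }


for tag in ["UD-Q4_K_XL", "Q6_K", "MXFP4_MOE"]:
    print(tag, "->", parse_quant_tag(tag))
```

Tags that fall outside this pattern come back as `None` rather than being guessed at, since the thread does not define them.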

Keats · 2026-02-03T19:20:08 1770146408

Is there some indication of how the different quantization bit widths affect performance? I.e., I have a 5090 + 96GB, so I want to get the best possible model, but I don't care about getting 2% better perf if I only get 5 tok/s.

mirekrusin · 2026-02-03T20:24:55 1770150295

It takes download time + 1 minute to test speed yourself, so you can try different quants. It's hard to write down a table because, once the model no longer fits on the GPU, speed depends on your system (RAM clock etc.).

I guess it would make sense to have something like the max context size/quants that fit fully on common configs: single GPU, dual GPUs, unified RAM on Mac, etc.
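A back-of-the-envelope version of that "does it fit" check can be sketched in Python. Everything here is an assumption for illustration: the helper names `weights_gib` and `fits`, the flat `overhead_gib` allowance standing in for context/KV cache and runtime buffers, and the example parameter count and bits-per-weight figures. Real quant files mix bit widths across layers, so the actual file size is the better number when you have it.

```python
def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GiB: params * bits / 8 bytes."""
    return n_params * bits_per_weight / 8 / 2**30


def fits(n_params: float, bits_per_weight: float, mem_gib: float,
         overhead_gib: float = 2.0) -> bool:
    """True if the weights plus a flat overhead allowance fit in mem_gib.

    overhead_gib is a hypothetical placeholder for context and runtime
    buffers; it grows with context size in practice.
    """
    return weights_gib(n_params, bits_per_weight) + overhead_gib <= mem_gib


# Hypothetical example: a 30B-parameter model on a 32 GiB device,
# at several assumed average bits-per-weight values.
for bpw in (4.5, 6.5, 8.5):
    size = weights_gib(30e9, bpw)
    print(f"{bpw} bpw: ~{size:.1f} GiB, fits in 32 GiB: {fits(30e9, bpw, 32)}")
```

This only answers whether the weights fit; it says nothing about speed or quality, which still need to be measured.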

Keats · 2026-02-03T21:06:00 1770152760

Testing speed is easy, yes. I'm mostly wondering about the quality difference between Q6 and Q8_K_XL, for example.

danielhanchen · 2026-02-04T00:57:00 1770166620

I haven't done benchmarking yet (I plan to), but it should be similar to our post on DeepSeek-V3.1 Dynamic GGUFs: https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs

segmondy · 2026-02-03T17:57:00 1770141420

The green/yellow/red indicators are based on what you set for your hardware on Hugging Face.

ranger_danger · 2026-02-03T22:47:35 1770158855

What is your definition of "important" in this context?

danielhanchen · 2026-02-04T00:56:09 1770166569

Oh we wrote about it here: https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs