NVFP4 · W4A4 · vLLM · Blackwell

Rogue Quants

4-bit weight, 4-bit activation (W4A4) NVFP4 quantizations of Qwen3.5-family vision-language models, quantized with GPTQ and an MSE observer, packed in the compressed-tensors format, and tuned for vLLM on NVIDIA Blackwell (GB10, sm_121). The models are near-lossless and deployable on a single GPU; several are also abliterated (uncensored) with Heretic.

🧊 compressed-tensors ⚙️ GPTQ + MSE observer 🚀 vLLM 0.23.0 🖥️ NVIDIA GB10 🔓 Abliterated variants
📚 Browse the NVFP4 Quants collection
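
For readers new to the format, here is a rough sketch of what a 4-bit float with block scaling does to a tensor. It is an illustration under stated assumptions, not the pipeline used to produce these checkpoints: it uses plain absmax scaling rather than GPTQ or an MSE observer, keeps each block scale as a Python float instead of rounding it to FP8, and omits any per-tensor scale. The function names are hypothetical.

```python
# E2M1 magnitudes representable by a 4-bit float (sign handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # NVFP4 shares one scale across 16 consecutive elements


def quantize_block(block):
    """Return (scale, codes) for one block; codes are signed grid values."""
    absmax = max(abs(x) for x in block)
    # Map the largest magnitude onto the top grid value (6.0).
    # Simplification: the scale stays a Python float, not an FP8 value.
    scale = absmax / E2M1_GRID[-1] if absmax > 0 else 1.0
    codes = []
    for x in block:
        # Round the scaled magnitude to the nearest representable value.
        nearest = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        codes.append(nearest if x >= 0 else -nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate real values from a block's scale and codes."""
    return [scale * c for c in codes]


def fake_quantize(values):
    """Quantize then dequantize a flat list, one block at a time."""
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        scale, codes = quantize_block(values[start:start + BLOCK_SIZE])
        out.extend(dequantize_block(scale, codes))
    return out


if __name__ == "__main__":
    weights = [0.01 * (i - 16) for i in range(32)]
    restored = fake_quantize(weights)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"max abs error: {worst:.4f}")
```

Because each block of 16 values gets its own scale, one outlier only coarsens the resolution of its own block and leaves the rest of the tensor alone. This is one reason small-block 4-bit formats can stay close to the original weights.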

The Models