NVIDIA GeForce RTX 4090 (24GB)
$1,599
The RTX 4090 is the gold standard for consumer GPU LLM inference, with 24GB of VRAM and about 1 TB/s (1008 GB/s) of memory bandwidth. It handles 30B models at Q4 and generates an SDXL 1024px image in about 3 seconds.
Specifications
Memory 24GB GDDR6X
Memory Bandwidth 1008 GB/s
GPU Cores 16,384
CPU Cores N/A
TDP 450W
Max Model (Q4) 44B parameters
Max Model (Q8) 22B parameters
Performance Tier High
Category NVIDIA GPU
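The page does not say how the Max Model figures are derived. One back-of-the-envelope calculation that reproduces both numbers is sketched below. It assumes about 2GB of VRAM is held back for the KV cache, activations, and runtime overhead, and that Q4 and Q8 cost exactly 4 and 8 bits per weight. The function name and the `overhead_gb` default are illustrative assumptions, not values stated by the source.

```python
def max_params_billion(vram_gb: float, bits_per_weight: float,
                       overhead_gb: float = 2.0) -> float:
    """Estimate the largest model (in billions of parameters) that fits in VRAM.

    overhead_gb is an assumed reserve for KV cache, activations, and the
    runtime; 2.0 is chosen only because it reproduces the listed figures.
    """
    usable_gb = vram_gb - overhead_gb        # VRAM left for the weights
    bytes_per_param = bits_per_weight / 8    # Q4 -> 0.5 bytes, Q8 -> 1 byte
    # GB divided by bytes-per-parameter gives billions of parameters.
    return usable_gb / bytes_per_param


print(max_params_billion(24, 4))  # 44.0
print(max_params_billion(24, 8))  # 22.0
```

Real quantization formats usually spend slightly more than 4 or 8 bits per weight, and long contexts need more than 2GB of cache, so treat these results as upper bounds rather than guarantees.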
Performance Benchmarks
Llama 8B Q4 (tok/s) 110
SDXL 1024px (seconds) 3
Flux 1024px (seconds) 8
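To put the Llama 8B Q4 figure in context, a rough sanity check is to compare it against a memory-bandwidth ceiling. The sketch below assumes single-stream decoding reads every weight once per generated token, so throughput cannot exceed bandwidth divided by the size of the weights. The function name is hypothetical and the 4 bits per weight for Q4 is an assumption. The 1008 GB/s and 110 tok/s inputs come from the tables above.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, params_billion: float,
                         bits_per_weight: float) -> float:
    """Upper bound on tokens/s if each token requires one full pass over the weights."""
    weights_gb = params_billion * bits_per_weight / 8  # 8B at 4 bits -> 4 GB
    return bandwidth_gb_s / weights_gb


ceiling = decode_ceiling_tok_s(1008, 8, 4)  # bandwidth and model from the spec tables
efficiency = 110 / ceiling                  # 110 tok/s is the listed benchmark
print(f"{ceiling:.0f} tok/s ceiling, {efficiency:.0%} achieved")
# prints: 252 tok/s ceiling, 44% achieved
```

Landing well under the ceiling is expected, since KV cache reads, kernel launch overhead, and sampling also take time. The estimate is mainly useful for spotting benchmark numbers that would be physically implausible.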
Pros
- 24GB VRAM runs 30B+ models at Q4
- Fastest consumer GPU for LLM inference
- Excellent for image and video generation
Cons
- 450W TDP requires beefy PSU
- High price at $1,599
- Large physical size (3-4 slot)