NVIDIA GeForce RTX 4090 (24GB)

$1,599

The RTX 4090 is the gold standard for consumer-GPU LLM inference, with 24GB of VRAM and about 1 TB/s (1008 GB/s) of memory bandwidth. It handles 30B models at Q4 and generates a 1024px SDXL image in about 3 seconds.

Specifications

Memory: 24GB GDDR6X
Memory Bandwidth: 1008 GB/s
GPU Cores: 16,384
CPU Cores: N/A
TDP: 450W
Max Model (Q4): 44B parameters
Max Model (Q8): 22B parameters
Performance Tier: High
Category: NVIDIA GPU
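
The page does not say how the Max Model figures are derived. One simple assumption that reproduces them is to reserve about 2GB of the 24GB for the KV cache and runtime overhead, then divide the rest by the bytes per weight (4 bits for Q4, 8 bits for Q8). The function name and the 2GB reserve below are illustrative assumptions, not the site's actual method:

```python
def max_params_billions(vram_gb: float, bits_per_weight: float,
                        reserved_gb: float = 2.0) -> float:
    """Estimate the largest model (in billions of parameters) whose weights fit in VRAM.

    reserved_gb is an assumed allowance for KV cache and runtime overhead.
    """
    usable_bytes = (vram_gb - reserved_gb) * 1e9
    bytes_per_param = bits_per_weight / 8
    return usable_bytes / bytes_per_param / 1e9


q4 = max_params_billions(24, 4)  # Q4: 4 bits per weight (assumed)
q8 = max_params_billions(24, 8)  # Q8: 8 bits per weight (assumed)
print(f"Q4: {q4:.0f}B, Q8: {q8:.0f}B")  # prints "Q4: 44B, Q8: 22B"
```

Real Q4 formats often store somewhat more than 4 bits per weight, and long contexts need a larger KV cache, so practical limits are likely lower than this ceiling. That is consistent with the 30B figure quoted elsewhere on this page.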

Performance Benchmarks

Llama 8B Q4: 110 tok/s
SDXL 1024px: 3 seconds
Flux 1024px: 8 seconds
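
As a rough sanity check on the LLM figure, single-stream token generation is often treated as memory-bandwidth bound: each generated token reads roughly all of the model's weights once. Under that assumption, and assuming exactly 4 bits per weight for Q4, the listed 1008 GB/s gives a theoretical ceiling that the measured 110 tok/s can be compared against. The function name is hypothetical and the model is a simplification:

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, params_billions: float,
                         bits_per_weight: float) -> float:
    """Upper bound on tokens/s if every token requires one full read of the weights."""
    model_gb = params_billions * bits_per_weight / 8  # weight size in GB
    return bandwidth_gb_s / model_gb


ceiling = decode_ceiling_tok_s(1008, 8, 4)  # 8B model at 4 bits -> about 4 GB of weights
measured = 110                              # Llama 8B Q4 figure listed above
efficiency = measured / ceiling
print(f"ceiling ~{ceiling:.0f} tok/s, measured {measured} tok/s "
      f"({efficiency:.0%} of ceiling)")
# prints "ceiling ~252 tok/s, measured 110 tok/s (44% of ceiling)"
```

The gap between the ceiling and the measured number is expected, since this estimate ignores KV-cache reads, kernel launch overhead, and compute time.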

Pros

  • 24GB VRAM runs 30B+ models at Q4
  • Fastest consumer GPU for LLM inference
  • Excellent for image and video generation

Cons

  • 450W TDP requires a high-wattage PSU
  • High price at $1,599
  • Large physical size (3-4 slot)
