Hardware Recommender
Select a model or filter by your needs. We'll recommend the best hardware at every price point.
Frequently Asked Questions
How much RAM do I need to run a 70B parameter model?
For a 70B model at Q4 quantization (roughly 4.25 bits per weight), the weights alone take approximately 37GB, plus a few GB for the KV cache and runtime buffers. A Mac Mini M4 Pro with 48GB of unified memory, or an RTX 5090 with 32GB of VRAM (offloading the remainder to system RAM), would work.
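To see where the 37GB figure comes from, here is a minimal back-of-the-envelope sketch in Python. The 4.25 bits/weight average (typical of common GGUF Q4 variants) is an assumption; KV cache overhead depends on context length and is not included in the weights figure.

```python
# Back-of-the-envelope memory estimate for a quantized model.
# Assumption (illustrative): Q4 averages ~4.25 bits per weight,
# as in common GGUF Q4 variants. KV cache is context-dependent
# and adds a few GB on top of the weights.

def weights_gb(params_billion: float, bits_per_weight: float = 4.25) -> float:
    """Memory for the quantized weights alone, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

w = weights_gb(70)  # ~37.2 GB for a 70B model at Q4
print(f"weights: ~{w:.1f} GB, plus a few GB for KV cache and buffers")
```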
Is Apple Silicon good for running local LLMs?
Yes. Apple Silicon's unified memory architecture lets the GPU address most of system memory for model inference; by default macOS caps GPU allocations at roughly 75% of RAM, and the cap can be raised. This makes it well suited to larger models: an M4 Pro with 48GB can run a 70B model at Q4.
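A quick way to sanity-check whether a model fits on a given Mac is to compare the estimate above against the GPU-allocatable share of unified memory. This is a sketch, not an exact rule: the 75% default cap below is an approximation of macOS's default behavior and varies by configuration.

```python
# Sketch: does a quantized model fit in a Mac's GPU-allocatable memory?
# Assumption: macOS caps GPU allocations near 75% of unified memory by
# default; the exact cap varies and can be raised via sysctl.

def fits_on_mac(unified_gb: float, model_gb: float,
                gpu_share: float = 0.75) -> bool:
    return model_gb <= unified_gb * gpu_share

print(fits_on_mac(48, 37.2))  # False: ~36 GB default cap is just short,
                              # so raise the cap or use a smaller quant
print(fits_on_mac(64, 37.2))  # True, with comfortable headroom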
What is the best GPU for local LLM inference?
The NVIDIA RTX 5090, with 32GB of GDDR7 at 1792 GB/s of memory bandwidth, offers the best consumer-grade performance. The RTX 4090 remains excellent value with 24GB of VRAM at 1008 GB/s.
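Memory bandwidth matters because generating each token requires streaming the full set of model weights from memory, so a rough upper bound on generation speed is bandwidth divided by model size. The sketch below uses that simplification; real throughput is lower due to compute, KV cache reads, and overhead.

```python
# Rough ceiling on tokens/sec: each generated token streams the full
# weights from memory, so speed <= bandwidth / model size in bytes.
# Real-world throughput is lower; and a 37 GB model does not fully fit
# in 32 GB of VRAM, so offloading lowers the 5090 figure further.

def max_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

print(f"RTX 5090: ~{max_tokens_per_sec(1792, 37.2):.0f} tok/s ceiling")
print(f"RTX 4090: ~{max_tokens_per_sec(1008, 37.2):.0f} tok/s ceiling")
```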