7,438 models · Updated daily

Find Hardware That Runs Any LLM Locally

Search 7,438+ AI models, get instant VRAM calculations, and find hardware recommendations for Llama, Qwen, DeepSeek & more — free, no signup.

32 guides  ·  21 hardware options  ·  Updated 2026  ·  Free, no signup

Top hardware picks for running LLMs locally

Hand-picked across budgets and ecosystems. Affiliate links to Amazon. No extra cost to you.

Intel Arc B580
12 GB · Best budget pick
Buy on Amazon →
Mac mini M4 (24 GB)
24 GB unified · Apple Silicon
Buy on Amazon →
RTX 5070 Ti
16 GB · Newest mid-tier
Buy on Amazon →
RTX 4090
24 GB · 70B-capable champion
Buy on Amazon →

Featured Guides

All 32 guides →
Start Here

New to Local AI? Start Here

Run AI like ChatGPT on your own PC — free, private, offline. Ollama in one click, first model running in 10 minutes. Complete beginner guide.

Read guide →
New

LFM2.5-8B-A1B Hardware Requirements

Liquid AI 8.3B MoE with 1.5B active params and 128K context. Q4 fits in ~6 GB; runs on an RTX 4060 8GB and any 12 GB+ card comfortably. Day-one MLX support.

Read guide →
Trending

DeepSeek Hardware Requirements

What GPU to run DeepSeek R1 7B, 14B, 32B, and 70B locally — per-GPU compatibility table.

Read guide →
New

Qwen3 Hardware Requirements

Run Qwen3 4B–32B locally on any GPU. VRAM requirements, quantization tips, and model picks.

Read guide →
New

Qwen3.5 Hardware Requirements

Qwen3.5-27B needs 15 GB at Q4 (RTX 4090). 35B-A3B MoE needs 20 GB. Thinking mode overhead and Ollama setup.

Read guide →
New

Qwen3.6 Hardware Requirements

Qwen3.6 improves on 3.5 with identical VRAM needs. 27B on RTX 4090, 35B-A3B on Mac Studio M4 Max 64 GB.

Read guide →
New

Kimi K2 Hardware Requirements

Kimi K2 is a 1T MoE model — it cannot run on consumer hardware. What CAN you run: Kimi-VL-A3B, distilled 7B–32B variants.

Read guide →
New

CPU Offloading for LLMs

Run models too large for your VRAM using -ngl layer splitting. Speed expectations, Ollama setup, and when to skip it.

Read guide →
New

Mistral Large Hardware Requirements

Mistral Large 2 (123B) needs 70 GB VRAM. Runs on Mac Studio M4 Max 128 GB. Best consumer alternative: Mistral Small 3.1 on RTX 4090.

Read guide →
New

Why Is My LLM Slow?

6 causes with specific fixes: memory bandwidth, CPU offloading penalty, KV cache bloat, wrong quantization, thermal throttling.

Read guide →
New

GPU VRAM Per Dollar Tier List

Every GPU ranked by VRAM per dollar. Intel Arc B580 leads on cost per GB. RTX 3090 used is the best all-around value.

Read guide →
New

How Much VRAM Do I Need?

Exact VRAM for every model size: 7B needs 5 GB, 13B needs 9 GB, 70B needs 38 GB at Q4. Includes KV cache overhead tables.

Read guide →
New

LLM Inference Speed: Tokens Per Second by GPU

Benchmark tok/s for RTX 4060 through RTX 5090. Why memory bandwidth matters more than CUDA cores.

Read guide →
New

Qwen3-30B-A3B Hardware Requirements

MoE misconception guide: 30B-A3B needs 17 GB VRAM, NOT 3 GB. GPU table, quant options, and 16 GB workaround.

Read guide →
New

Ollama vs LM Studio vs llama.cpp vs vLLM

Complete 4-way comparison. 10-row decision matrix. All three local runners use the same inference engine — speed is identical.

Read guide →
New

Llama 4 Scout Hardware Requirements

109B MoE model needs 57 GB VRAM at Q4. No single consumer GPU fits it — multi-GPU or Mac Studio M4 Max required.

Read guide →
New

Qwen3-235B Hardware Requirements

Largest open-weight model (235B MoE) needs 119 GB VRAM. Only Mac Studio M4 Ultra fits it. Consumer alternatives listed.

Read guide →
New

Local LLM Function Calling and Tool Use

Set up agents that call APIs and run code with Ollama. Best models: Qwen3-14B, Llama 3.1 8B. Python examples included.

Read guide →
New

Codestral Hardware Requirements

Mistral's coding model (22B) needs 13 GB VRAM. Any 16 GB GPU works. FIM support for Continue.dev and Cursor auto-complete.

Read guide →
New

Best GPU for Coding LLMs

GPU picks matched to the best coding models by VRAM tier. RTX 4060 Ti 16GB for Codestral; RTX 4090 for Qwen2.5-Coder-32B.

Read guide →
New

LLMs on Integrated Graphics: Intel & AMD iGPU

AMD Radeon 890M gets 12-18 tok/s on 7B models. Best models for no-GPU laptops and how to enable GPU acceleration.

Read guide →
New

Gemma 4 Hardware Requirements

Gemma 4 27B MoE runs in 8 GB VRAM at Q4 — RTX 4060 gets ~28 tok/s. Best model for 8 GB GPUs in 2026.

Read guide →

What Can I Run on My GPU?

VRAM tier guide: 8GB through 128GB — exact model names, quantization, and speeds.

Read guide →

Best GPU for LLMs

RTX 4060 to RTX 5090 vs Mac Studio — budget tiers compared.

Read guide →

CPU-Only LLM Inference

Run LLMs without a GPU. RAM requirements, speed benchmarks, and best models for CPU inference.

Read guide →
New

RTX 4090 for LLMs

Best single consumer GPU: 24 GB GDDR6X, 1008 GB/s, up to 110 tok/s. Full guide with model table.

Read guide →
New

Best LLMs to Run Locally

Top model picks by VRAM tier — Qwen3 8B, DeepSeek-R1-Distill, Gemma 3, Phi-4, Llama 3.3 70B.

Read guide →
New

Gemma 3 Hardware Requirements

Google's Gemma 3 (4B to 27B). The 27B fits in 16 GB at Q4. GPU picks and Ollama setup.

Read guide →
New

RTX 5070 for LLMs

Blackwell 12 GB GDDR7. Runs Qwen3 14B at Q4 — 33% more bandwidth than RTX 4070 at a lower price.

Read guide →
New

Best LLM for Coding Locally

Qwen3 14B, Codestral 22B, Phi-4 14B — ranked by GPU tier. Ollama commands + Continue.dev, Aider, Cursor setup.

Read guide →
Comparison

Qwen3 14B vs Phi-4 14B

Head-to-head at 12 GB VRAM. Both fit in Q4_K_M — which wins for coding, math, and everyday use?

Read guide →
Tutorial

LLMs on Windows: Complete Setup

Run Ollama and LM Studio on Windows 10/11. NVIDIA, AMD, Intel Arc GPU driver setup. Fix CUDA detection, CPU fallback, port conflicts.

Read guide →
Apple Silicon

Apple Silicon for Local LLMs

How M3 and M4 unified memory actually performs vs discrete GPUs. The Mac Studio M4 Max 64GB fits 70B at Q4_K_M — no $-equivalent NVIDIA setup does.

Read guide →
Budget

Best Budget GPU for LLMs

Sub-$500 picks that actually run modern models. RTX 3060 12GB still leads on $/GB VRAM in 2026.

Read guide →
Model Guide

Llama 4 Hardware Requirements

Llama 4 Scout (109B MoE) runs on 12-16 GB VRAM at Q4. Maverick needs 200+ GB. Which hardware to buy for Meta's latest model.

Read guide →
Tutorial

LM Studio vs Ollama vs Jan

LM Studio for beginners, Ollama for developers and servers, Jan for privacy. Which local AI app should you install first?

Read guide →
Model Guide

Gemma 4 Hardware Requirements

Gemma 4 27B MoE (4B active) fits in 8 GB VRAM at Q4 — RTX 4060 runs it at 28 tok/s. 31B dense needs 24 GB.

Read guide →
GPU Guide

RTX 5060 for LLMs: 8 GB Only

Only 8 GB VRAM — runs 7-8B models well, cannot run 14B. The RTX 5060 Ti 16 GB is the smarter AI buy.

Read guide →
Tutorial

Local AI Coding Assistant Setup

Free GitHub Copilot: VS Code + Continue + Ollama + Qwen3-Coder 14B. Works offline, 100% private, needs 16 GB VRAM.

Read guide →
Comparison

AMD vs NVIDIA for Local LLMs

AMD wins on VRAM per dollar; NVIDIA wins on Windows ease. ROCm 6.3 on Linux is finally production-ready. Full comparison for 2026.

Read guide →
GPU Guide

AMD RX 9060 XT LLM Guide

16 GB GDDR6 — double the VRAM of RTX 5060 for a small premium. Runs Qwen3 14B Q4 at 38 tok/s with ROCm 6.3.

Read guide →
Buying Guide

AMD Strix Halo Mini PC Guide

128 GB unified memory runs Llama 70B at 8-10 tok/s — about half the price of Mac Studio M4 Max 128 GB.

Read guide →
Tutorial

Run Claude Locally? Here's What Works

Claude weights are not public. Best local alternatives: Qwen3 14B matches Claude Haiku (16 GB), Qwen3 72B matches Sonnet (48 GB).

Read guide →
Guide

Best LLMs for 48 GB VRAM

Llama 3.3 70B Q4 fits at 42 GB, runs 20 tok/s on Mac M4 Pro. Qwen3 72B also fits. The full 70B tier breakdown.

Read guide →
Guide

Best LLMs for 24 GB VRAM

Qwen3 32B Q4 (19 GB) runs at 28 tok/s on RTX 4090. The jump from 16 GB that unlocks real 32B quality.

Read guide →
Tutorial

Open WebUI Setup Guide

ChatGPT-style interface for your local Ollama — RAG, voice, web search, multi-user. Docker install in one command.

Read guide →
Buying Guide

Best Budget GPU for LLMs

Entry tier: RTX 3060 12 GB. Budget tier: Intel Arc B580. Mid-budget: AMD RX 9060 XT 16 GB. Best value overall: RTX 5070 Ti.

Read guide →
Advanced

LLM Fine-Tuning Hardware

QLoRA on 7B needs 10 GB VRAM; 14B needs 24 GB. RTX 4090 is the best consumer GPU. AMD lacks tool support — NVIDIA only.

Read guide →
Advanced

Dual GPU Setup for 70B Models

Two RTX 3090s give 48 GB combined VRAM — runs Llama 70B at 8-10 tok/s. NVLink vs PCIe, PSU sizing, Ollama setup.

Read guide →
Tutorial

Local Voice AI Setup

Whisper + Ollama + Kokoro TTS: fully offline voice assistant in 2026. 8 GB VRAM, 2-5s latency, works air-gapped. Setup guide.

Read guide →
Comparison

RTX 4070 Ti Super vs RTX 4080

Both 16 GB, same models, 7% speed difference, a noticeable price gap. In 2026 the RTX 5070 Ti beats both by 33% for similar money.

Read guide →
Comparison

RTX 3090 vs RTX 4070 Ti Super

Used 24 GB vs new 16 GB. 70B models need 24 GB — which is the better buy for local LLMs?

Read guide →
Comparison

RTX 4070 vs 4080 vs 4090

12 GB vs 16 GB vs 24 GB VRAM. 32B models need 24 GB — which GPU should you buy?

Read guide →
New

AI on Your Gaming PC

Your gaming GPU already runs AI. RTX 4060 to RTX 4090 tier table — see exactly what you can run with your GPU right now.

Read guide →
New

Best LLMs for 8 GB VRAM

RTX 3060 / 4060 users: Qwen3 7B Q8 runs at 35 t/s, Phi-4 14B fits at Q4. Full model fit table and best picks.

Read guide →
New

Private Offline AI Setup

Run AI with zero data leaving your PC. Ollama + Qwen3 14B, no internet after setup, works on air-gapped machines.

Read guide →
New

Run ChatGPT Locally — Free

Open-source models now match GPT-4 quality. Free, private, no subscription. Ollama + Open WebUI in under 10 minutes.

Read guide →
New

Best LLMs for 16 GB VRAM

RTX 4060 Ti, 4080, 4070 Ti Super users: Qwen3 14B Q8 fits at 14.8 GB and runs at 30+ t/s. Gemma 3 27B Q4 also fits.

Read guide →
Comparison

RTX 5070 vs RTX 4070 for LLMs

Same 12 GB VRAM, same models. 5070 is 33% faster for less money. Upgrade verdict: buying new — yes; already own 4070 — no.

Read guide →
Comparison

RTX 5080 vs RTX 4090 for LLMs

RTX 5080 is 16 GB, RTX 4090 is 24 GB. Both fast for 7-14B — only the 4090 fits 32B at Q4. Buy the 5080 unless you need 32B+.

Read guide →
Comparison

RTX 4060 vs RTX 4070 for LLMs

8 GB vs 12 GB: the 4070 is nearly 2x faster AND runs 14B models the 4060 cannot. Worth the premium if you run anything above 7B.

Read guide →
New

Best LLMs for 24 GB VRAM

RTX 4090/3090 users: Qwen3 32B Q4 runs at 38 t/s. DeepSeek R1 32B fits. Full model fit table and top picks for the 32B sweet spot.

Read guide →
New

Run LLMs on Mac: Ollama Setup

M1 through M4 all run AI locally. Metal GPU acceleration is automatic. One command install, then ollama run qwen3:8b — done in 5 minutes.

Read guide →
New

M4 Mac Mini vs M4 Pro for LLMs

M4 24 GB vs M4 Pro 48 GB. M4 Pro is 2.3x faster and the only Mac Mini that runs 70B models. Clear buy recommendation inside.

Read guide →
New

Run LLMs on Linux: Ollama Setup

One curl command installs Ollama on Ubuntu, Fedora, or Arch. NVIDIA works instantly. AMD needs ROCm — step-by-step included.

Read guide →
Tutorial

How to Run DeepSeek R1 Locally

Step-by-step Ollama setup for DeepSeek R1 distills (8B–70B). Thinking mode explained, common issues fixed.

Read guide →
Tutorial

How to Run Qwen3 Locally

Ollama setup for Qwen3 0.6B–32B and MoE. Thinking mode per query, LM Studio alternative, Windows/Mac/Linux.

Read guide →
Tutorial

Open WebUI + Ollama Setup

Get a ChatGPT-like browser interface for your local LLMs in 5 minutes. Free, private, no API key required.

Read guide →
Tutorial

How to Run Llama 3 Locally

Run Meta Llama 3.1 8B or Llama 3.3 70B via Ollama. Step-by-step for Windows, Mac, Linux. Hardware picks included.

Read guide →
Tutorial

How to Run Mistral Locally

Run Mistral 7B (4.5 GB), Nemo 12B, or Small 22B via Ollama. Works on any 8 GB+ GPU. Windows, Mac, Linux.

Read guide →
Tutorial

How to Run Gemma 3 Locally

Google's Gemma 3 27B fits in 16 GB VRAM. Multimodal — process images locally. Ollama setup guide.

Read guide →
Tutorial

How to Run Phi-4 Locally

Microsoft's Phi-4 14B needs only 9-10 GB VRAM and beats Llama 3.1 8B on reasoning. Ollama setup guide.

Read guide →
Tutorial

How to Run Llama 4 Scout Locally

Llama 4 Scout needs 58 GB VRAM due to MoE — explains why, dual-GPU llama.cpp setup, and alternatives for 8-24 GB GPUs.

Read guide →
Tutorial

Local AI Coding Assistant Setup

VS Code + Ollama + Continue.dev — free GitHub Copilot alternative. Chat model + FIM autocomplete configured in under 10 minutes.

Read guide →
Comparison

RTX 5070 Ti vs RTX 5080 for LLMs

Both have 16 GB GDDR7 and run identical models. 5080 is 7% faster. Buy the 5070 Ti unless budget is no concern.

Read guide →
Tutorial

LM Studio: Complete Setup Guide

Desktop GUI for running LLMs offline. Download, load a model, and chat in minutes. No terminal needed. GPU acceleration, model browser, OpenAI-compatible API.

Read guide →
New

Best LLMs for 32 GB VRAM (RTX 5090)

RTX 5090 users: Qwen3 32B Q8 runs at 45+ t/s. QwQ-32B reasoning model fits. Full model fit table for the 32 GB tier.

Read guide →
New

Best LLMs for 12 GB VRAM (RTX 4070 / 5070)

Qwen3 14B Q4 runs at 30 t/s on RTX 4070. Phi-4 14B fits. Gemma 3 12B is fastest. The 12 GB sweet spot explained.

Read guide →
Reference

Ollama Commands Cheat Sheet

Every Ollama CLI command in one place: run, pull, list, API endpoints, Modelfile guide, environment variables, one-liners.

Read guide →
New

AMD RX 9070 XT for Local LLMs

16 GB GDDR6, 896 GB/s — matches RTX 5080 LLM throughput. RDNA 4 ROCm 6.2 setup guide and model fit table.

Read guide →
Reality Check

DeepSeek V3: Can You Run It Locally?

DeepSeek V3 (685B) needs 390 GB VRAM — no consumer GPU can run it. Honest analysis and the best consumer alternatives.

Read guide →
New

RTX 5060 Ti for LLMs: 8 GB vs 16 GB

Always buy the 16 GB variant. Qwen3 14B Q4 at 35+ t/s. The RTX 5070 is 2x faster — worth considering.

Read guide →
Reference

LLM System Requirements: CPU, RAM, PSU

GPU VRAM is the bottleneck but you also need 32 GB RAM, NVMe SSD, and a right-sized PSU. Complete spec tables per tier.

Read guide →
Tutorial

Run Qwen3 30B MoE Locally

30B total params, only 3B active — runs at 30 t/s on RTX 4090 while fitting in 20 GB. Thinking mode guide and performance vs dense comparison.

Read guide →
Developer

Ollama Python API Guide

Use Ollama from Python: ollama library, OpenAI-compatible endpoint, streaming, embeddings, async. Working code examples for all patterns.

Read guide →
Guide

How to Run 70B Models Locally

Llama 3.3 70B needs 42 GB VRAM. RTX 4090 alone is not enough. Mac M4 Pro 48 GB is the best consumer option. Dual GPU setup also covered.

Read guide →
Comparison

RTX 5090 vs RTX 4090 for LLMs

5090 is 78% faster and has 32 GB vs 24 GB. Enables Qwen3 32B at Q6. Worth the premium? Speed table, model fit, buy verdict.

Read guide →
Comparison

Mac Mini vs Mac Studio for LLMs

M4 Pro 48 GB runs 70B at 12 t/s. M4 Max 64 GB runs it at 20 t/s. Full comparison and buy recommendation.

Read guide →
New

Best LLMs for 6 GB VRAM

RTX 3060 6GB and GTX 1660 Super: Qwen3 7B Q4 at 30 t/s. Limited to 7-8B but surprisingly capable. Upgrade analysis vs 8 GB included.

Read guide →
Advanced

llama.cpp Guide: Run Without Ollama

Direct GGUF inference with full control. CUDA/Metal/ROCm build, GPU layer flags, server mode, performance tuning. For power users.

Read guide →
Comparison

RTX 5070 vs RTX 5070 Ti for LLMs

The 5070 is faster than the 5070 Ti on shared models — the 5070 Ti just has more VRAM. The price step up buys 14B Q8 capability, not speed.

Read guide →
Tutorial

RAG: Chat With Your Own Documents

Add document search to your local LLM: Open WebUI (no code), AnythingLLM (one app), or Python LangChain. Hardware requirements and embedding model picks.

Read guide →
Use Case

Best LLMs for Writing Locally

Llama 3.3 70B for fiction, Qwen3 14B Q8 for content, Mistral Small 22B for creative. Temperature guide and system prompts for each use case.

Read guide →
FAQ

Is 8 GB VRAM Enough for AI?

Yes — Qwen3 7B at 35 t/s fits comfortably, but 14B models at Q4 do not. Upgrade analysis, Stable Diffusion verdict, best 8 GB GPUs.

Read guide →
Tutorial

Vision/Multimodal LLMs Locally

Gemma 3 4B runs image + text in 4 GB VRAM. Describe images, extract text from screenshots, analyze charts — all private with Ollama.

Read guide →
Advanced

Home Server LLM Setup Guide

Run Ollama 24/7 on a home server. Network access, Tailscale VPN, Open WebUI frontend. Power cost: Mac Mini M4 Pro is 10x cheaper to run than a gaming PC.

Read guide →
Tutorial

Run Ollama in Docker (GPU Support)

NVIDIA GPU passthrough in 1 extra flag. Docker Compose file with Open WebUI included. 3 commands from zero to running model.

Read guide →
Reference

How to Speed Up Ollama

Flash attention, optimized context size, Q4_K_M, KEEP_ALIVE=-1. Full environment variables reference and benchmark commands.

Read guide →
Sort

631 shown

Qwen/Qwen3-VL-2B-Instruct

Qwen/Qwen3-VL-2B-Instruct

Multimodal Text Image
2B 186,571,456 384

google/electra-base-discriminator

google/electra-base-discriminator

other
0.11B 53,732,074 102

BAAI/bge-small-en-v1.5

BAAI/bge-small-en-v1.5

feature-extraction
0.033B 31,815,585 450

BAAI/bge-m3

BAAI/bge-m3

sentence-similarity
0.567B 20,286,678 2,970

openai-community/gpt2

openai-community/gpt2

Text
0.117B 16,125,428 3,226

BAAI/bge-large-en-v1.5

BAAI/bge-large-en-v1.5

feature-extraction
0.335B 14,319,659 657

Qwen/Qwen2.5-7B-Instruct

Qwen/Qwen2.5-7B-Instruct

Text
7B 13,708,672 1,253

deepseek-ai/DeepSeek-V3.2

deepseek-ai/DeepSeek-V3.2

Text
671B 11,276,220 1,427

Qwen/Qwen3-4B-Instruct-2507

Qwen/Qwen3-4B-Instruct-2507

Text
4B 10,340,261 829

BAAI/bge-reranker-v2-m3

BAAI/bge-reranker-v2-m3

text-classification
0.567B 9,843,903 976

meta-llama/Llama-3.1-8B-Instruct

meta-llama/Llama-3.1-8B-Instruct

Text
8B 9,668,434 5,779

Qwen/Qwen2.5-1.5B-Instruct

Qwen/Qwen2.5-1.5B-Instruct

Text
1.5B 9,615,295 683

Qwen/Qwen2.5-3B-Instruct

Qwen/Qwen2.5-3B-Instruct

Text
3B 9,372,290 450

Qwen/Qwen2.5-VL-7B-Instruct

Qwen/Qwen2.5-VL-7B-Instruct

Multimodal Text Image
7B 8,918,399 1,516

BAAI/bge-base-en-v1.5

BAAI/bge-base-en-v1.5

feature-extraction
0.109B 8,098,766 414

facebook/opt-125m

facebook/opt-125m

Text
0.125B 8,008,846 249

google/gemma-4-31B-it

google/gemma-4-31B-it

Multimodal Text Image
31B 7,776,034 2,480

trl-internal-testing/tiny-Qwen2ForCausalLM-2.5

trl-internal-testing/tiny-Qwen2ForCausalLM-2.5

Text
0.001B 7,476,564 6

Qwen/Qwen3.5-9B

Qwen/Qwen3.5-9B

Multimodal Text Image
9B 7,471,967 1,376

openai/gpt-oss-20b

openai/gpt-oss-20b

Text
20B 6,942,836 4,581

google/gemma-4-26B-A4B-it

google/gemma-4-26B-A4B-it

Multimodal Text Image
26B 6,211,893 862

Qwen/Qwen2.5-0.5B-Instruct

Qwen/Qwen2.5-0.5B-Instruct

Text
0.5B 6,191,102 507

meta-llama/Llama-3.2-1B-Instruct

meta-llama/Llama-3.2-1B-Instruct

Text
1B 6,178,296 1,392

Qwen/Qwen3-Embedding-0.6B

Qwen/Qwen3-Embedding-0.6B

feature-extraction
0.6B 5,832,490 1,005

google/gemma-4-E4B-it

google/gemma-4-E4B-it

any-to-any
4B 5,045,802 899

google/vit-base-patch16-224

google/vit-base-patch16-224

image-classification
0.086B 4,783,900 957

dphn/dolphin-2.9.1-yi-1.5-34b

dphn/dolphin-2.9.1-yi-1.5-34b

Text
34B 4,698,540 62

Qwen/Qwen3.5-4B

Qwen/Qwen3.5-4B

Multimodal Text Image
4B 4,668,408 511

Qwen/Qwen3-VL-8B-Instruct

Qwen/Qwen3-VL-8B-Instruct

Multimodal Text Image
8B 4,566,829 887

openai/gpt-oss-120b

openai/gpt-oss-120b

Text
120B 4,147,394 4,753

Qwen/Qwen2-VL-2B-Instruct

Qwen/Qwen2-VL-2B-Instruct

Multimodal Text Image
2B 4,039,269 500

deepseek-ai/DeepSeek-R1

deepseek-ai/DeepSeek-R1

Text
671B 4,023,498 13,317

Qwen/Qwen2-1.5B-Instruct

Qwen/Qwen2-1.5B-Instruct

Text
1.5B 3,972,712 162

Qwen/Qwen3.5-35B-A3B

Qwen/Qwen3.5-35B-A3B

Multimodal Text Image
35B 3,691,140 1,413

google/mobilebert-uncased

google/mobilebert-uncased

other
0.025B 3,679,834 70

Qwen/Qwen2.5-VL-3B-Instruct

Qwen/Qwen2.5-VL-3B-Instruct

Multimodal Text Image
3B 3,561,479 641

unsloth/gemma-4-26B-A4B-it-GGUF

unsloth/gemma-4-26B-A4B-it-GGUF

Multimodal Text Image
26B 3,557,393 647

microsoft/table-transformer-detection

microsoft/table-transformer-detection

object-detection
0.04B 3,463,448 417

meta-llama/Meta-Llama-3-8B

meta-llama/Meta-Llama-3-8B

Text
8B 3,458,214 6,529

mistralai/Mistral-7B-Instruct-v0.3

mistralai/Mistral-7B-Instruct-v0.3

other
7B 3,422,264 2,547

google/vit-base-patch16-224-in21k

google/vit-base-patch16-224-in21k

image-feature-extraction
0.086B 3,399,606 404

Qwen/Qwen3.5-27B

Qwen/Qwen3.5-27B

Multimodal Text Image
27B 3,327,398 966

google/gemma-4-E2B-it

google/gemma-4-E2B-it

any-to-any
2B 3,307,386 562

TinyLlama/TinyLlama-1.1B-Chat-v1.0

TinyLlama/TinyLlama-1.1B-Chat-v1.0

Text
1.1B 3,205,477 1,575

EleutherAI/pythia-160m

EleutherAI/pythia-160m

Text
0.16B 3,152,957 41

Qwen/Qwen3.5-0.8B

Qwen/Qwen3.5-0.8B

Multimodal Text Image
0.8B 3,060,933 515

BAAI/bge-reranker-base

BAAI/bge-reranker-base

text-classification
0.278B 3,042,804 233

llava-hf/llava-1.5-7b-hf

llava-hf/llava-1.5-7b-hf

Multimodal Text Image
7B 2,951,975 359

distilbert/distilgpt2

distilbert/distilgpt2

Text
0.082B 2,736,040 625

google/gemma-3-12b-it

google/gemma-3-12b-it

Multimodal Text Image
12B 2,726,598 712

Qwen/Qwen3-Coder-30B-A3B-Instruct

Qwen/Qwen3-Coder-30B-A3B-Instruct

Code Text
30B 2,719,917 1,039

vikhyatk/moondream2

vikhyatk/moondream2

Multimodal Text Image
1.8B 2,706,903 1,408

moonshotai/Kimi-K2.5

moonshotai/Kimi-K2.5

Multimodal Text Image
1000B 2,659,891 2,775

Qwen/Qwen2.5-14B-Instruct

Qwen/Qwen2.5-14B-Instruct

Text
14B 2,658,219 331

hmellor/tiny-random-LlamaForCausalLM

hmellor/tiny-random-LlamaForCausalLM

Text
0.0011B 2,549,585 0

microsoft/TRELLIS-image-large

microsoft/TRELLIS-image-large

image-to-3d
1.1B 2,477,413 639

microsoft/deberta-v3-base

microsoft/deberta-v3-base

fill-mask
0.184B 2,463,975 418

deepseek-ai/DeepSeek-OCR

deepseek-ai/DeepSeek-OCR

Multimodal Text Image
3B 2,441,303 3,222

Qwen/Qwen2-VL-7B-Instruct

Qwen/Qwen2-VL-7B-Instruct

Multimodal Text Image
7B 2,401,674 1,274

Qwen/Qwen3.6-35B-A3B

Qwen/Qwen3.6-35B-A3B

Multimodal Text Image
35B 2,397,446 1,572

openai-community/gpt2-large

openai-community/gpt2-large

Text
0.774B 2,348,019 348

BAAI/bge-small-zh-v1.5

BAAI/bge-small-zh-v1.5

feature-extraction
0.024B 2,334,778 112

Qwen/Qwen3-VL-4B-Instruct

Qwen/Qwen3-VL-4B-Instruct

Multimodal Text Image
4B 2,299,591 378

Qwen/Qwen3-VL-Embedding-2B

Qwen/Qwen3-VL-Embedding-2B

sentence-similarity
2B 2,298,617 394

google/gemma-3-4b-it

google/gemma-3-4b-it

Multimodal Text Image
4B 2,288,368 1,318

mistralai/Mistral-7B-Instruct-v0.2

mistralai/Mistral-7B-Instruct-v0.2

Text
7B 2,256,011 3,129

Qwen/Qwen2.5-Coder-7B-Instruct

Qwen/Qwen2.5-Coder-7B-Instruct

Code Text
7B 2,191,051 703

Qwen/Qwen3.6-35B-A3B-FP8

Qwen/Qwen3.6-35B-A3B-FP8

Multimodal Text Image
35B 2,153,466 198

nvidia/Gemma-4-31B-IT-NVFP4

nvidia/Gemma-4-31B-IT-NVFP4

Text
31B 2,129,147 445

google/siglip-so400m-patch14-384

google/siglip-so400m-patch14-384

zero-shot-image-classification
0.878B 2,122,773 671

Qwen/Qwen3-0.6B-FP8

Qwen/Qwen3-0.6B-FP8

Text
0.6B 2,116,821 59

meta-llama/Llama-3.2-3B-Instruct

meta-llama/Llama-3.2-3B-Instruct

Text
3B 2,097,830 2,119

unsloth/gemma-4-31B-it-GGUF

unsloth/gemma-4-31B-it-GGUF

Multimodal Text Image
31B 2,039,712 385

stabilityai/stable-diffusion-xl-base-1.0

stabilityai/stable-diffusion-xl-base-1.0

Image
3.5B 2,034,986 7,682

unsloth/Qwen3.6-35B-A3B-GGUF

unsloth/Qwen3.6-35B-A3B-GGUF

Multimodal Text Image
35B 2,001,316 895

google/siglip-base-patch16-224

google/siglip-base-patch16-224

zero-shot-image-classification
0.203B 1,985,237 82

deepseek-ai/DeepSeek-R1-Distill-Llama-8B

deepseek-ai/DeepSeek-R1-Distill-Llama-8B

Text
8B 1,958,814 857

Qwen/Qwen2.5-0.5B

Qwen/Qwen2.5-0.5B

Text
0.5B 1,949,479 399

Qwen/Qwen3-Embedding-8B

Qwen/Qwen3-Embedding-8B

feature-extraction
8B 1,932,849 670

RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic

RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic

Text
1B 1,926,754 4

Qwen/Qwen2.5-14B-Instruct-AWQ

Qwen/Qwen2.5-14B-Instruct-AWQ

Text
14B 1,895,974 35

unsloth/gemma-4-E4B-it-GGUF

unsloth/gemma-4-E4B-it-GGUF

Multimodal Text Image
4B 1,886,371 355

Qwen/Qwen3-ASR-1.7B

Qwen/Qwen3-ASR-1.7B

automatic-speech-recognition
1.7B 1,880,149 776

google/flan-t5-base

google/flan-t5-base

other
0.25B 1,843,624 1,073

Qwen/Qwen3-Embedding-4B

Qwen/Qwen3-Embedding-4B

feature-extraction
4B 1,796,983 261

meta-llama/Llama-2-7b-hf

meta-llama/Llama-2-7b-hf

Text
7B 1,782,446 2,297

Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

text-to-speech
1.7B 1,776,544 1,445

Qwen/Qwen2.5-Coder-7B

Qwen/Qwen2.5-Coder-7B

Code Text
7B 1,750,121 142

Qwen/Qwen3.5-2B

Qwen/Qwen3.5-2B

Multimodal Text Image
2B 1,743,120 266

Qwen/Qwen2-VL-7B-Instruct-AWQ

Qwen/Qwen2-VL-7B-Instruct-AWQ

Multimodal Text Image
7B 1,723,622 49

nvidia/bigvgan_v2_22khz_80band_256x

nvidia/bigvgan_v2_22khz_80band_256x

audio-to-audio
1,697,068 27

microsoft/Phi-3.5-vision-instruct

microsoft/Phi-3.5-vision-instruct

Multimodal Text Image
1,658,912 733

EleutherAI/pythia-70m-deduped

EleutherAI/pythia-70m-deduped

Text
1,653,170 28

microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract

microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract

fill-mask
1,647,113 91

meta-llama/Meta-Llama-3-8B-Instruct

meta-llama/Meta-Llama-3-8B-Instruct

Text
8B 1,629,384 4,500

microsoft/table-transformer-structure-recognition-v1.1-all

microsoft/table-transformer-structure-recognition-v1.1-all

object-detection
1,626,098 83

apple/OpenELM-1_1B-Instruct

apple/OpenELM-1_1B-Instruct

Text
1B 1,606,041 75

microsoft/Phi-4-mini-instruct

microsoft/Phi-4-mini-instruct

Code Text
3.8B 1,564,970 733

Qwen/Qwen3.6-27B-FP8

Qwen/Qwen3.6-27B-FP8

Multimodal Text Image
27B 1,564,153 177

meta-llama/Llama-3.1-8B

meta-llama/Llama-3.1-8B

Text
8B 1,557,284 2,177

deepseek-ai/DeepSeek-OCR-2

deepseek-ai/DeepSeek-OCR-2

Multimodal Text Image
1,512,165 936

meta-llama/Llama-3.2-1B

meta-llama/Llama-3.2-1B

Text
1B 1,509,915 2,377

Qwen/Qwen3-TTS-12Hz-1.7B-Base

Qwen/Qwen3-TTS-12Hz-1.7B-Base

other
1.7B 1,508,494 384

Qwen/Qwen3.5-35B-A3B-FP8

Qwen/Qwen3.5-35B-A3B-FP8

Multimodal Text Image
35B 1,502,655 147

Qwen/Qwen3-Reranker-0.6B

Qwen/Qwen3-Reranker-0.6B

text-ranking
0.6B 1,501,438 344

zai-org/GLM-5-FP8

zai-org/GLM-5-FP8

Text
744B 1,492,611 172

opendatalab/MinerU2.5-2509-1.2B

opendatalab/MinerU2.5-2509-1.2B

Multimodal Text Image
1.2B 1,488,906 356

HuggingFaceTB/SmolLM2-135M-Instruct

HuggingFaceTB/SmolLM2-135M-Instruct

Text
0.135B 1,460,127 312

stable-diffusion-v1-5/stable-diffusion-v1-5

stable-diffusion-v1-5/stable-diffusion-v1-5

Image
1,457,151 1,090

nvidia/DeepSeek-R1-0528-NVFP4-v2

nvidia/DeepSeek-R1-0528-NVFP4-v2

Text
671B 1,449,051 23

microsoft/mdeberta-v3-base

microsoft/mdeberta-v3-base

fill-mask
1,444,943 220

Qwen/Qwen2.5-32B-Instruct-AWQ

Qwen/Qwen2.5-32B-Instruct-AWQ

Text
32B 1,417,403 99

Qwen/Qwen3.5-27B-FP8

Qwen/Qwen3.5-27B-FP8

Multimodal Text Image
27B 1,415,298 132

llamafactory/tiny-random-Llama-3

llamafactory/tiny-random-Llama-3

Text
1,403,809 3

Qwen/Qwen3.5-397B-A17B-FP8

Qwen/Qwen3.5-397B-A17B-FP8

Multimodal Text Image
397B 1,392,529 165

Qwen/Qwen3-VL-Embedding-8B

Qwen/Qwen3-VL-Embedding-8B

sentence-similarity
8B 1,389,404 396

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

Text
30B 1,386,050 725

Qwen/Qwen3-VL-235B-A22B-Instruct

Qwen/Qwen3-VL-235B-A22B-Instruct

Multimodal Text Image
235B 1,385,200 383

nvidia/Kimi-K2.5-NVFP4

nvidia/Kimi-K2.5-NVFP4

Text
1000B 1,353,823 81

Qwen/Qwen2.5-7B-Instruct-AWQ

Qwen/Qwen2.5-7B-Instruct-AWQ

Text
7B 1,338,985 40

Qwen/Qwen3-VL-32B-Instruct

Qwen/Qwen3-VL-32B-Instruct

Multimodal Text Image
32B 1,333,168 198

Qwen/Qwen3-30B-A3B

Qwen/Qwen3-30B-A3B

Text
30B 1,324,358 884

Qwen/Qwen2.5-Coder-32B-Instruct

Qwen/Qwen2.5-Coder-32B-Instruct

Code Text
32B 1,320,337 2,014

Tongyi-MAI/Z-Image-Turbo

Tongyi-MAI/Z-Image-Turbo

Image
1,318,361 4,550

cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit

cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit

Multimodal Text Image
26B 1,312,093 57

google/embeddinggemma-300m

google/embeddinggemma-300m

sentence-similarity
1,307,858 1,622

microsoft/table-transformer-structure-recognition

microsoft/table-transformer-structure-recognition

object-detection
1,302,817 214

google/t5gemma-s-s-prefixlm

google/t5gemma-s-s-prefixlm

Text
1,271,969 3

bigscience/bloomz-560m

bigscience/bloomz-560m

Code Text
1,263,024 137

HuggingFaceTB/SmolLM2-135M

HuggingFaceTB/SmolLM2-135M

Text
0.135B 1,235,151 189

deepseek-ai/DeepSeek-V3

deepseek-ai/DeepSeek-V3

Text
671B 1,207,791 4,064

OpenGVLab/InternVL2-2B

OpenGVLab/InternVL2-2B

Multimodal Text Image
2B 1,190,819 80

BAAI/bge-multilingual-gemma2

BAAI/bge-multilingual-gemma2

feature-extraction
1,174,275 200

microsoft/VibeVoice-Realtime-0.5B

microsoft/VibeVoice-Realtime-0.5B

text-to-speech
0.5B 1,164,266 1,215

cyankiwi/gemma-4-31B-it-AWQ-4bit

cyankiwi/gemma-4-31B-it-AWQ-4bit

Multimodal Text Image
31B 1,159,491 37

kaitchup/Phi-3-mini-4k-instruct-gptq-4bit

kaitchup/Phi-3-mini-4k-instruct-gptq-4bit

Text
1,140,998 2

Qwen/Qwen3-30B-A3B-Instruct-2507

Qwen/Qwen3-30B-A3B-Instruct-2507

Text
30B 1,131,302 805

google/electra-small-discriminator

google/electra-small-discriminator

other
1,125,936 38

mistralai/Mistral-Small-3.2-24B-Instruct-2506

mistralai/Mistral-Small-3.2-24B-Instruct-2506

other
24B 1,122,536 584

unsloth/gemma-4-E2B-it-GGUF

unsloth/gemma-4-E2B-it-GGUF

Multimodal Text Image
2B 1,120,691 170

mistralai/Voxtral-Mini-4B-Realtime-2602

mistralai/Voxtral-Mini-4B-Realtime-2602

automatic-speech-recognition
4B 1,104,493 836

allenai/longformer-base-4096

allenai/longformer-base-4096

other
1,089,592 227

Qwen/Qwen3.6-27B

Qwen/Qwen3.6-27B

Multimodal Text Image
27B 1,070,778 1,077

deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct

deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct

Code Text
1,065,957 590

Qwen/Qwen2.5-Coder-14B-Instruct

Qwen/Qwen2.5-Coder-14B-Instruct

Code Text
14B 1,059,603 151

BAAI/bge-reranker-large

BAAI/bge-reranker-large

feature-extraction
1,048,351 456

Qwen/Qwen2.5-Coder-32B-Instruct-AWQ

Qwen/Qwen2.5-Coder-32B-Instruct-AWQ

Code Text
32B 1,021,659 35

nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1

nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1

Multimodal Text Image
8B 1,020,219 177

Qwen/Qwen3.5-122B-A10B

Qwen/Qwen3.5-122B-A10B

Multimodal Text Image
122B 1,019,315 533

unsloth/Qwen3.5-9B-GGUF

unsloth/Qwen3.5-9B-GGUF

Multimodal Text Image
9B 1,017,755 576

nvidia/bigvgan_v2_44khz_128band_512x

nvidia/bigvgan_v2_44khz_128band_512x

audio-to-audio
1,007,080 69

unsloth/Qwen3.5-35B-A3B-GGUF

unsloth/Qwen3.5-35B-A3B-GGUF

Multimodal Text Image
35B 1,001,497 833

microsoft/deberta-v3-large

microsoft/deberta-v3-large

fill-mask
995,542 276

unsloth/Qwen3.6-27B-GGUF

unsloth/Qwen3.6-27B-GGUF

Multimodal Text Image
27B 983,535 543

Qwen/Qwen3-Reranker-4B

Qwen/Qwen3-Reranker-4B

text-ranking
4B 980,076 132

google/gemma-4-E4B

google/gemma-4-E4B

any-to-any
4B 974,042 221

Qwen/Qwen3-Coder-Next

Qwen/Qwen3-Coder-Next

Code Text
971,785 1,340

microsoft/Florence-2-base

microsoft/Florence-2-base

Multimodal Text Image
947,196 366

microsoft/tapex-base-finetuned-wikisql

microsoft/tapex-base-finetuned-wikisql

table-question-answering
941,819 24

google/owlv2-base-patch16-ensemble

google/owlv2-base-patch16-ensemble

zero-shot-object-detection
936,075 119

BAAI/bge-large-zh-v1.5

BAAI/bge-large-zh-v1.5

feature-extraction
933,058 622

Qwen/Qwen3.5-35B-A3B-GPTQ-Int4

Qwen/Qwen3.5-35B-A3B-GPTQ-Int4

Multimodal Text Image
35B 911,088 80

crynux-network/sdxl-turbo

crynux-network/sdxl-turbo

Image
907,330 3

Qwen/Qwen3-VL-32B-Instruct-FP8

Qwen/Qwen3-VL-32B-Instruct-FP8

Multimodal Text Image
32B 890,370 45

microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224

microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224

zero-shot-image-classification
881,020 399

stabilityai/sdxl-turbo

stabilityai/sdxl-turbo

Image
874,902 2,567

Qwen/Qwen2.5-Coder-14B-Instruct-AWQ

Qwen/Qwen2.5-Coder-14B-Instruct-AWQ

Code Text
14B 840,207 17

lightonai/LightOnOCR-2-1B

lightonai/LightOnOCR-2-1B

Multimodal Text Image
1B 827,042 676

microsoft/Florence-2-large

microsoft/Florence-2-large

Multimodal Text Image
782,673 1,802

crynux-network/stable-diffusion-v1-5

crynux-network/stable-diffusion-v1-5

Image
776,624 1

nvidia/llama-nemotron-embed-1b-v2

nvidia/llama-nemotron-embed-1b-v2

feature-extraction
1B 763,762 55

Qwen/Qwen2.5-Omni-3B

Qwen/Qwen2.5-Omni-3B

any-to-any
3B 761,760 334

black-forest-labs/FLUX.1-dev

black-forest-labs/FLUX.1-dev

Image
741,117 12,741

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF

Multimodal Text Image
27B 740,309 646

microsoft/layoutlmv3-base

microsoft/layoutlmv3-base

other
735,692 482

black-forest-labs/FLUX.1-schnell

black-forest-labs/FLUX.1-schnell

Image
724,196 4,838

microsoft/Phi-3-mini-4k-instruct

microsoft/Phi-3-mini-4k-instruct

Code Text
723,680 1,415

Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4

Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4

Code Text
7B 720,606 13

microsoft/Phi-3.5-mini-instruct

microsoft/Phi-3.5-mini-instruct

Code Text
720,381 975

allenai/unifiedqa-t5-small

allenai/unifiedqa-t5-small

other
718,925 5

HuggingFaceTB/SmolVLM-256M-Instruct

HuggingFaceTB/SmolVLM-256M-Instruct

Multimodal Text Image
707,083 357

Qwen/Qwen3-Omni-30B-A3B-Instruct

Qwen/Qwen3-Omni-30B-A3B-Instruct

any-to-any
30B 702,058 919

nvidia/parakeet-ctc-1.1b

nvidia/parakeet-ctc-1.1b

automatic-speech-recognition
1.1B 701,259 46

moonshotai/Kimi-K2.6

moonshotai/Kimi-K2.6

Multimodal Text Image
699,348 1,182

microsoft/deberta-v3-small

microsoft/deberta-v3-small

fill-mask
696,306 76

Qwen/Qwen3-TTS-12Hz-0.6B-Base

Qwen/Qwen3-TTS-12Hz-0.6B-Base

text-to-speech
0.6B 685,035 232

microsoft/VibeVoice-ASR

microsoft/VibeVoice-ASR

automatic-speech-recognition
682,305 1,107

Qwen/Qwen3-VL-30B-A3B-Instruct

Qwen/Qwen3-VL-30B-A3B-Instruct

Multimodal Text Image
30B 681,180 567

microsoft/trocr-base-printed

microsoft/trocr-base-printed

image-to-text
680,018 206

mistralai/Mixtral-8x7B-Instruct-v0.1

mistralai/Mixtral-8x7B-Instruct-v0.1

other
7B 679,884 4,674

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2

Multimodal Text Image
27B 677,638 119

codgician/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GPTQ-int4

codgician/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GPTQ-int4

Multimodal Text Image
35B 669,051 9

pytorch/gemma-3-27b-it-AWQ-INT4

pytorch/gemma-3-27b-it-AWQ-INT4

Multimodal Text Image
27B 664,173 7

microsoft/phi-4

microsoft/phi-4

Code Text
14B 657,336 2,238

mlx-community/gemma-3-4b-it-qat-4bit

mlx-community/gemma-3-4b-it-qat-4bit

Multimodal Text Image
4B 647,826 8

google/siglip2-so400m-patch14-384

google/siglip2-so400m-patch14-384

zero-shot-image-classification
634,291 80

microsoft/layoutlmv2-base-uncased

microsoft/layoutlmv2-base-uncased

other
634,019 67

CompVis/stable-diffusion-v1-4

CompVis/stable-diffusion-v1-4

Image
620,505 7,004

microsoft/llmlingua-2-xlm-roberta-large-meetingbank

microsoft/llmlingua-2-xlm-roberta-large-meetingbank

token-classification
617,939 28

stabilityai/sd-turbo

stabilityai/sd-turbo

Image
614,172 447

Qwen/Qwen2.5-Coder-1.5B

Qwen/Qwen2.5-Coder-1.5B

Code Text
1.5B 612,131 91

google/siglip2-base-patch16-224

google/siglip2-base-patch16-224

zero-shot-image-classification
610,555 96

microsoft/deberta-large-mnli

microsoft/deberta-large-mnli

text-classification
609,234 31

mistralai/Voxtral-Mini-3B-2507

mistralai/Voxtral-Mini-3B-2507

other
3B 604,616 644

microsoft/wavlm-base-plus

microsoft/wavlm-base-plus

feature-extraction
593,203 36

microsoft/wavlm-large

microsoft/wavlm-large

feature-extraction
578,173 105

google/flan-t5-small

google/flan-t5-small

other
573,015 477

ricdomolm/mini-coder-1.7b

ricdomolm/mini-coder-1.7b

Code Text
1.7B 569,617 2

google/flan-t5-large

google/flan-t5-large

other
561,209 882

mistralai/Mistral-Small-3.1-24B-Instruct-2503

mistralai/Mistral-Small-3.1-24B-Instruct-2503

other
24B 547,039 1,357

microsoft/resnet-18

microsoft/resnet-18

image-classification
540,428 66

Qwen/Qwen2.5-Omni-7B

Qwen/Qwen2.5-Omni-7B

any-to-any
7B 530,843 1,892

nvidia/llama-nemotron-rerank-1b-v2

nvidia/llama-nemotron-rerank-1b-v2

text-ranking
1B 527,253 49

Qwen/Qwen3-ASR-0.6B

Qwen/Qwen3-ASR-0.6B

automatic-speech-recognition
0.6B 521,881 284

bartowski/Qwen2.5-Coder-7B-Instruct-GGUF

bartowski/Qwen2.5-Coder-7B-Instruct-GGUF

Code Text
7B 519,429 46

nvidia/personaplex-7b-v1

nvidia/personaplex-7b-v1

audio-to-audio
7B 515,566 2,477

Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign

Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign

text-to-speech
1.7B 512,247 335

google/madlad400-3b-mt

google/madlad400-3b-mt

translation
3B 475,851 193

allenai/specter2_base

allenai/specter2_base

feature-extraction
475,752 44

google/siglip2-base-patch16-naflex

google/siglip2-base-patch16-naflex

zero-shot-image-classification
466,635 27

TechxGenus/DeepSeek-Coder-V2-Lite-Instruct-AWQ

TechxGenus/DeepSeek-Coder-V2-Lite-Instruct-AWQ

Code Text
463,886 9

Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8

Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8

Code Text
30B 462,439 180

google/bert_uncased_L-2_H-128_A-2

google/bert_uncased_L-2_H-128_A-2

other
459,293 34

nvidia/speakerverification_en_titanet_large

nvidia/speakerverification_en_titanet_large

other
455,257 120

frankjoshua/novaAnimeXL_ilV140

frankjoshua/novaAnimeXL_ilV140

Image
452,434 2

microsoft/deberta-xlarge-mnli

microsoft/deberta-xlarge-mnli

text-classification
450,891 23

NexVeridian/Qwen3-Coder-Next-8bit

NexVeridian/Qwen3-Coder-Next-8bit

Code Text
446,609 3

allenai/scibert_scivocab_uncased

allenai/scibert_scivocab_uncased

other
442,481 172

Qwen/Qwen2.5-Coder-3B-Instruct

Qwen/Qwen2.5-Coder-3B-Instruct

Code Text
3B 438,497 104

QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ

QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ

Code Text
30B 432,910 6

BAAI/bge-base-zh-v1.5

BAAI/bge-base-zh-v1.5

feature-extraction
432,611 105

mistralai/Mistral-Nemo-Instruct-2407

mistralai/Mistral-Nemo-Instruct-2407

other
427,481 1,667

microsoft/swinv2-tiny-patch4-window16-256

microsoft/swinv2-tiny-patch4-window16-256

image-classification
422,392 13

nvidia/canary-1b-flash

nvidia/canary-1b-flash

automatic-speech-recognition
1B 421,671 271

BAAI/bge-small-en

BAAI/bge-small-en

feature-extraction
417,222 92

microsoft/Phi-4-multimodal-instruct

microsoft/Phi-4-multimodal-instruct

automatic-speech-recognition
416,306 1,597

nvidia/parakeet-tdt-0.6b-v3

nvidia/parakeet-tdt-0.6b-v3

automatic-speech-recognition
0.6B 402,965 821

microsoft/unispeech-sat-large-sv

microsoft/unispeech-sat-large-sv

other
398,250 5

Qwen/Qwen3-Coder-Next-FP8

Qwen/Qwen3-Coder-Next-FP8

Code Text
383,125 140

BAAI/bge-base-zh

BAAI/bge-base-zh

feature-extraction
382,793 58

google/siglip2-so400m-patch16-naflex

google/siglip2-so400m-patch16-naflex

zero-shot-image-classification
373,892 66

bigscience/bloom-560m

bigscience/bloom-560m

Code Text
373,325 371

microsoft/markuplm-base

microsoft/markuplm-base

other
363,404 27

mistralai/Devstral-Small-2-24B-Instruct-2512

mistralai/Devstral-Small-2-24B-Instruct-2512

other
24B 352,598 595

stabilityai/stable-video-diffusion-img2vid-xt

stabilityai/stable-video-diffusion-img2vid-xt

image-to-video
339,703 3,286

microsoft/graphcodebert-base

microsoft/graphcodebert-base

fill-mask
335,683 87

Qwen/Qwen2.5-Coder-7B-Instruct-AWQ

Qwen/Qwen2.5-Coder-7B-Instruct-AWQ

Code Text
7B 328,809 22

lightx2v/Qwen-Image-Lightning

lightx2v/Qwen-Image-Lightning

Image
327,382 793

microsoft/resnet-50

microsoft/resnet-50

image-classification
317,641 492

nvidia/segformer-b0-finetuned-ade-512-512

nvidia/segformer-b0-finetuned-ade-512-512

image-segmentation
314,803 185

microsoft/VibeVoice-ASR-HF

microsoft/VibeVoice-ASR-HF

audio-text-to-text
312,971 118

stabilityai/sdxl-vae

stabilityai/sdxl-vae

other
308,472 741

John6666/diving-illustrious-real-asian-v50-sdxl

John6666/diving-illustrious-real-asian-v50-sdxl

Image
301,039 0

John6666/one-obsession-17-red-sdxl

John6666/one-obsession-17-red-sdxl

Image
300,926 3

microsoft/codebert-base

microsoft/codebert-base

feature-extraction
299,840 285

google/siglip2-base-patch16-512

google/siglip2-base-patch16-512

zero-shot-image-classification
291,075 39

diffusers/stable-diffusion-xl-1.0-inpainting-0.1

diffusers/stable-diffusion-xl-1.0-inpainting-0.1

Image
289,628 370

meta-llama/Prompt-Guard-86M

meta-llama/Prompt-Guard-86M

text-classification
280,111 325

microsoft/beit-base-patch16-224

microsoft/beit-base-patch16-224

image-classification
279,556 9

stabilityai/stable-diffusion-3.5-medium

stabilityai/stable-diffusion-3.5-medium

Image
274,413 929

List-cloud/List-3.0-Ultra-Coder-Brain

List-cloud/List-3.0-Ultra-Coder-Brain

Code Text
228B 259,460 9

microsoft/VibeVoice-1.5B

microsoft/VibeVoice-1.5B

text-to-speech
1.5B 257,404 2,353

mistralai/Ministral-3-3B-Instruct-2512

mistralai/Ministral-3-3B-Instruct-2512

other
3B 257,157 227

google/gemma-4-E2B

google/gemma-4-E2B

any-to-any
2B 250,993 242

microsoft/Phi-3-mini-128k-instruct

microsoft/Phi-3-mini-128k-instruct

Code Text
244,726 1,699

google/pegasus-xsum

google/pegasus-xsum

summarization
244,160 219

playgroundai/playground-v2.5-1024px-aesthetic

playgroundai/playground-v2.5-1024px-aesthetic

Image
239,477 763

google/timesfm-2.5-200m-transformers

google/timesfm-2.5-200m-transformers

time-series-forecasting
236,555 75

stabilityai/stable-diffusion-xl-refiner-1.0

stabilityai/stable-diffusion-xl-refiner-1.0

image-to-image
233,405 2,038

google/siglip2-so400m-patch16-384

google/siglip2-so400m-patch16-384

zero-shot-image-classification
230,366 4

microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext

microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext

fill-mask
230,171 323

microsoft/BiomedVLP-CXR-BERT-specialized

microsoft/BiomedVLP-CXR-BERT-specialized

fill-mask
223,803 36

deepseek-ai/deepseek-coder-6.7b-instruct

deepseek-ai/deepseek-coder-6.7b-instruct

Code Text
6.7B 223,342 492

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-4bit

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-4bit

Code Text
30B 223,107 21

Qwen/Qwen2.5-Coder-0.5B-Instruct

Qwen/Qwen2.5-Coder-0.5B-Instruct

Code Text
0.5B 218,993 66

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit

Code Text
30B 214,174 7

unsloth/Qwen3-Coder-Next-GGUF

unsloth/Qwen3-Coder-Next-GGUF

Code Text
213,166 610

ByteDance/SDXL-Lightning

ByteDance/SDXL-Lightning

Image
208,302 2,145

John6666/nova-furry-xl-il-v120-sdxl

John6666/nova-furry-xl-il-v120-sdxl

Image
205,804 5

stabilityai/TripoSR

stabilityai/TripoSR

image-to-3d
205,086 610

cagliostrolab/animagine-xl-4.0

cagliostrolab/animagine-xl-4.0

Image
204,977 413

Qwen/Qwen2.5-Coder-1.5B-Instruct

Qwen/Qwen2.5-Coder-1.5B-Instruct

Code Text
1.5B 203,544 118

stable-diffusion-v1-5/stable-diffusion-inpainting

stable-diffusion-v1-5/stable-diffusion-inpainting

Image
203,215 105

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit

Code Text
30B 201,610 15

mistralai/Ministral-3-14B-Instruct-2512

mistralai/Ministral-3-14B-Instruct-2512

other
14B 201,199 281

mistralai/Mistral-7B-v0.3

mistralai/Mistral-7B-v0.3

other
7B 201,096 575

casperhansen/deepseek-coder-v2-instruct-awq

casperhansen/deepseek-coder-v2-instruct-awq

Code Text
200,912 11

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-6bit

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-6bit

Code Text
30B 198,339 5

h94/IP-Adapter-FaceID

h94/IP-Adapter-FaceID

Image
196,158 1,842

google/siglip2-so400m-patch14-224

google/siglip2-so400m-patch14-224

zero-shot-image-classification
193,872 3

John6666/prefect-illustrious-xl-v3-sdxl

John6666/prefect-illustrious-xl-v3-sdxl

Image
193,683 0

microsoft/harrier-oss-v1-0.6b

microsoft/harrier-oss-v1-0.6b

feature-extraction
0.6B 192,271 211

codellama/CodeLlama-7b-hf

codellama/CodeLlama-7b-hf

Code Text
7B 189,706 376

microsoft/layoutlm-base-uncased

microsoft/layoutlm-base-uncased

other
188,366 62

nvidia/Llama-4-Scout-17B-16E-Instruct-FP8

nvidia/Llama-4-Scout-17B-16E-Instruct-FP8

other
109B 185,913 15

nvidia/segformer-b1-finetuned-ade-512-512

nvidia/segformer-b1-finetuned-ade-512-512

image-segmentation
182,797 15

optimum-intel-internal-testing/tiny-stable-diffusion-torch

optimum-intel-internal-testing/tiny-stable-diffusion-torch

Image
182,391 0

nvidia/parakeet-tdt-0.6b-v2

nvidia/parakeet-tdt-0.6b-v2

automatic-speech-recognition
0.6B 180,157 1,467

google/owlvit-base-patch32

google/owlvit-base-patch32

zero-shot-object-detection
180,032 148

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4

any-to-any
30B 180,012 69

nvidia/audio-flamingo-3-hf

nvidia/audio-flamingo-3-hf

audio-text-to-text
179,072 181

bigcode/tiny_starcoder_py

bigcode/tiny_starcoder_py

Code Text
179,036 74

microsoft/trocr-large-handwritten

microsoft/trocr-large-handwritten

image-to-text
178,706 160

mistralai/Ministral-8B-Instruct-2410

mistralai/Ministral-8B-Instruct-2410

other
8B 176,033 577

optimum-intel-internal-testing/tiny-random-stable-diffusion-xl

optimum-intel-internal-testing/tiny-random-stable-diffusion-xl

Image
173,779 0

nvidia/canary-1b-v2

nvidia/canary-1b-v2

automatic-speech-recognition
1B 173,234 379

microsoft/xclip-base-patch32

microsoft/xclip-base-patch32

video-classification
168,365 109

microsoft/kosmos-2-patch14-224

microsoft/kosmos-2-patch14-224

image-to-text
167,569 184

mistralai/Mixtral-8x7B-v0.1

mistralai/Mixtral-8x7B-v0.1

other
7B 167,480 1,806

microsoft/speecht5_tts

microsoft/speecht5_tts

text-to-speech
162,673 826

google/bigbird-roberta-base

google/bigbird-roberta-base

other
161,995 62

microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank

microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank

token-classification
161,873 50

cagliostrolab/animagine-xl-3.1

cagliostrolab/animagine-xl-3.1

Image
161,771 714

John6666/obsession-illustriousxl-v10-sdxl

John6666/obsession-illustriousxl-v10-sdxl

Image
161,070 1

microsoft/Phi-3-vision-128k-instruct

microsoft/Phi-3-vision-128k-instruct

Code Text Multimodal
160,499 971

unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF

unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF

Code Text
30B 159,918 617

unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit

unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit

Code Text
7B 158,085 12

nvidia/llama-nemotron-rerank-vl-1b-v2

nvidia/llama-nemotron-rerank-vl-1b-v2

text-ranking
1B 157,064 33

microsoft/trocr-base-handwritten

microsoft/trocr-base-handwritten

image-to-text
152,867 493

microsoft/unixcoder-base

microsoft/unixcoder-base

feature-extraction
152,441 68

microsoft/harrier-oss-v1-270m

microsoft/harrier-oss-v1-270m

feature-extraction
151,609 156

nvidia/Llama-3.3-70B-Instruct-NVFP4

nvidia/Llama-3.3-70B-Instruct-NVFP4

other
70B 150,609 42

ggml-org/Qwen3-Coder-30B-A3B-Instruct-Q8_0-GGUF

ggml-org/Qwen3-Coder-30B-A3B-Instruct-Q8_0-GGUF

Code Text
30B 150,295 9

lmstudio-community/Qwen2.5-Coder-14B-Instruct-MLX-4bit

lmstudio-community/Qwen2.5-Coder-14B-Instruct-MLX-4bit

Code Text
14B 149,563 4

optimum-intel-internal-testing/stable-diffusion-3-tiny-random

optimum-intel-internal-testing/stable-diffusion-3-tiny-random

Image
146,050 0

microsoft/deberta-v2-xlarge

microsoft/deberta-v2-xlarge

fill-mask
144,782 23

google/siglip-large-patch16-384

google/siglip-large-patch16-384

zero-shot-image-classification
144,236 11

John6666/amanatsu-illustrious-v11-sdxl

John6666/amanatsu-illustrious-v11-sdxl

Image
143,916 3

microsoft/wavlm-base-plus-sv

microsoft/wavlm-base-plus-sv

other
137,237 54

Wan-AI/Wan2.1-T2V-1.3B-Diffusers

Wan-AI/Wan2.1-T2V-1.3B-Diffusers

Video
1.3B 136,054 123

city96/FLUX.1-dev-gguf

city96/FLUX.1-dev-gguf

Image
135,156 1,311

stabilityai/sd-x2-latent-upscaler

stabilityai/sd-x2-latent-upscaler

other
134,657 189

nvidia/Llama-3.1-8B-Instruct-NVFP4

nvidia/Llama-3.1-8B-Instruct-NVFP4

other
8B 132,981 10

microsoft/trocr-large-printed

microsoft/trocr-large-printed

image-to-text
132,828 179

cyankiwi/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit

cyankiwi/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit

Code Text
30B 130,019 49

google/siglip2-base-patch16-256

google/siglip2-base-patch16-256

zero-shot-image-classification
129,374 8

Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ

Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ

Code Text
1.5B 128,520 4

optimum-intel-internal-testing/tiny-random-flux

optimum-intel-internal-testing/tiny-random-flux

Image
127,812 0

mistralai/Ministral-3-8B-Reasoning-2512-GGUF

mistralai/Ministral-3-8B-Reasoning-2512-GGUF

other
8B 126,816 28

BAAI/bge-base-en

BAAI/bge-base-en

feature-extraction
126,531 61

BAAI/bge-reranker-v2.5-gemma2-lightweight

BAAI/bge-reranker-v2.5-gemma2-lightweight

text-classification
125,879 53

meta-llama/Llama-Prompt-Guard-2-86M

meta-llama/Llama-Prompt-Guard-2-86M

text-classification
124,417 112

microsoft/infoxlm-large

microsoft/infoxlm-large

fill-mask
123,487 14

BSC-LT/salamandra-7b-instruct

BSC-LT/salamandra-7b-instruct

Code Text
7B 122,950 79

lmstudio-community/Qwen2.5-Coder-14B-Instruct-MLX-8bit

lmstudio-community/Qwen2.5-Coder-14B-Instruct-MLX-8bit

Code Text
14B 122,809 2

stabilityai/sd-vae-ft-mse

stabilityai/sd-vae-ft-mse

other
122,180 407

Tencent-Hunyuan/HunyuanDiT-v1.1-Diffusers-Distilled

Tencent-Hunyuan/HunyuanDiT-v1.1-Diffusers-Distilled

Image
121,866 15

Qwen/Qwen2.5-Coder-7B-Instruct-GGUF

Qwen/Qwen2.5-Coder-7B-Instruct-GGUF

Code Text
7B 120,311 238

microsoft/deberta-base

microsoft/deberta-base

fill-mask
119,482 85

microsoft/wavlm-base-plus-sd

microsoft/wavlm-base-plus-sd

other
118,647 12

RunDiffusion/Juggernaut-XL-v9

RunDiffusion/Juggernaut-XL-v9

Image
117,843 326

BAAI/bge-small-zh

BAAI/bge-small-zh

feature-extraction
116,034 27

RedHatAI/DeepSeek-Coder-V2-Lite-Instruct-FP8

RedHatAI/DeepSeek-Coder-V2-Lite-Instruct-FP8

Code Text
114,714 11

nvidia/llama-nemotron-embed-vl-1b-v2

nvidia/llama-nemotron-embed-vl-1b-v2

sentence-similarity
1B 113,731 63

Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8

Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8

Code Text
480B 113,456 151

mistralai/Mistral-Small-24B-Instruct-2501

mistralai/Mistral-Small-24B-Instruct-2501

other
24B 112,422 950

nvidia/music-flamingo-2601-hf

nvidia/music-flamingo-2601-hf

audio-text-to-text
112,090 98

microsoft/Phi-3.5-MoE-instruct

microsoft/Phi-3.5-MoE-instruct

Code Text
111,882 574

nvidia/canary-qwen-2.5b

nvidia/canary-qwen-2.5b

automatic-speech-recognition
2.5B 111,868 420

cagliostrolab/animagine-xl-3.0

cagliostrolab/animagine-xl-3.0

Image
110,466 777

microsoft/mpnet-base

microsoft/mpnet-base

fill-mask
109,390 50

mistralai/Ministral-3-8B-Instruct-2512

mistralai/Ministral-3-8B-Instruct-2512

other
8B 109,361 166

optimum-intel-internal-testing/tiny-random-latent-consistency

optimum-intel-internal-testing/tiny-random-latent-consistency

Image
109,087 0

BAAI/bge-large-en

BAAI/bge-large-en

feature-extraction
107,934 224

nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0

nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0

automatic-speech-recognition
107,765 37

cyankiwi/Qwen3-Coder-Next-AWQ-4bit

cyankiwi/Qwen3-Coder-Next-AWQ-4bit

Code Text
107,630 28

Salesforce/codegen-350M-mono

Salesforce/codegen-350M-mono

Code Text
106,476 101

Laxhar/noobai-XL-1.1

Laxhar/noobai-XL-1.1

Image
105,860 128

lightx2v/Qwen-Image-2512-Lightning

lightx2v/Qwen-Image-2512-Lightning

Image
105,329 207

microsoft/deberta-base-mnli

microsoft/deberta-base-mnli

text-classification
105,099 8

nvidia/segformer-b4-finetuned-ade-512-512

nvidia/segformer-b4-finetuned-ade-512-512

image-segmentation
104,703 4

google/siglip2-giant-opt-patch16-384

google/siglip2-giant-opt-patch16-384

zero-shot-image-classification
102,189 41

xinsir/controlnet-union-sdxl-1.0

xinsir/controlnet-union-sdxl-1.0

Image
100,897 1,727

SimianLuo/LCM_Dreamshaper_v7

SimianLuo/LCM_Dreamshaper_v7

Image
100,476 415

google/siglip2-so400m-patch16-256

google/siglip2-so400m-patch16-256

zero-shot-image-classification
100,148 1

microsoft/swin-base-patch4-window12-384-in22k

microsoft/swin-base-patch4-window12-384-in22k

image-classification
98,988 3

google/siglip2-base-patch16-384

google/siglip2-base-patch16-384

zero-shot-image-classification
98,466 9

Wan-AI/Wan2.2-TI2V-5B-Diffusers

Wan-AI/Wan2.2-TI2V-5B-Diffusers

Video
5B 98,266 132

John6666/hassaku-xl-illustrious-v31-sdxl

John6666/hassaku-xl-illustrious-v31-sdxl

Image
96,947 1

SG161222/Realistic_Vision_V5.1_noVAE

SG161222/Realistic_Vision_V5.1_noVAE

Image
95,128 244

bigcode/starcoder2-3b

bigcode/starcoder2-3b

Code Text
3B 91,185 218

T5B/Z-Image-Turbo-FP8

T5B/Z-Image-Turbo-FP8

Image
5B 91,173 163

BAAI/AltCLIP

BAAI/AltCLIP

zero-shot-image-classification
90,055 32

microsoft/speecht5_asr

microsoft/speecht5_asr

automatic-speech-recognition
88,346 43

microsoft/prophetnet-large-uncased

microsoft/prophetnet-large-uncased

other
88,342 6

Wan-AI/Wan2.2-T2V-A14B-Diffusers

Wan-AI/Wan2.2-T2V-A14B-Diffusers

Video
14B 87,712 131

XLabs-AI/xflux_text_encoders

XLabs-AI/xflux_text_encoders

Code Text
87,644 21

microsoft/codebert-base-mlm

microsoft/codebert-base-mlm

fill-mask
87,123 47

google/bert_for_seq_generation_L-24_bbc_encoder

google/bert_for_seq_generation_L-24_bbc_encoder

other
86,660 1

OnomaAIResearch/Illustrious-xl-early-release-v0

OnomaAIResearch/Illustrious-xl-early-release-v0

Image
85,657 424

Command A (111B)

CohereLabs/c4ai-command-a-03-2025

text-generation
111B 85,000 420

google/siglip2-large-patch16-256

google/siglip2-large-patch16-256

zero-shot-image-classification
82,732 6

codellama/CodeLlama-7b-Instruct-hf

codellama/CodeLlama-7b-Instruct-hf

Code Text
7B 80,710 255

microsoft/speecht5_hifigan

microsoft/speecht5_hifigan

other
79,187 23

nvidia/Alpamayo-1.5-10B

nvidia/Alpamayo-1.5-10B

robotics
10B 79,160 60

SG161222/RealVisXL_V5.0

SG161222/RealVisXL_V5.0

Image
78,974 151

stabilityai/stable-diffusion-3-medium-diffusers

stabilityai/stable-diffusion-3-medium-diffusers

Image
78,858 445

microsoft/layoutlmv3-large

microsoft/layoutlmv3-large

other
77,123 126

Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF

Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF

Code Text
1.5B 76,139 48

ali-vilab/text-to-video-ms-1.7b

ali-vilab/text-to-video-ms-1.7b

Video
1.7B 75,375 658

janhq/Jan-v3-4B-base-instruct-gguf

janhq/Jan-v3-4B-base-instruct-gguf

Code Text
4B 74,296 52

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-FP8

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-FP8

any-to-any
30B 73,464 38

google/siglip2-so400m-patch16-512

google/siglip2-so400m-patch16-512

zero-shot-image-classification
71,880 44

unsloth/ERNIE-Image-Turbo-GGUF

unsloth/ERNIE-Image-Turbo-GGUF

Image
70,920 200

nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4

nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4

other
109B 70,814 31

optimum-intel-internal-testing/tiny-random-sana

optimum-intel-internal-testing/tiny-random-sana

Image
67,596 0

QuantStack/Wan2.2-T2V-A14B-GGUF

QuantStack/Wan2.2-T2V-A14B-GGUF

Video
14B 66,702 251

city96/FLUX.1-schnell-gguf

city96/FLUX.1-schnell-gguf

Image
65,896 320

mistralai/Devstral-Small-2505

mistralai/Devstral-Small-2505

other
24B 65,698 869

ostris/zimage_turbo_training_adapter

ostris/zimage_turbo_training_adapter

Image
65,687 133

nvidia/nemotron-colembed-vl-4b-v2

nvidia/nemotron-colembed-vl-4b-v2

visual-document-retrieval
4B 65,497 36

google/vit-base-patch16-384

google/vit-base-patch16-384

image-classification
65,020 50

google/vit-large-patch16-224-in21k

google/vit-large-patch16-224-in21k

image-feature-extraction
63,719 30

nvidia/segformer-b2-finetuned-ade-512-512

nvidia/segformer-b2-finetuned-ade-512-512

image-segmentation
63,684 6

nvidia/segformer-b5-finetuned-ade-640-640

nvidia/segformer-b5-finetuned-ade-640-640

image-segmentation
62,542 44

nvidia/diar_streaming_sortformer_4spk-v2.1

nvidia/diar_streaming_sortformer_4spk-v2.1

automatic-speech-recognition
60,900 66

microsoft/harrier-oss-v1-27b

microsoft/harrier-oss-v1-27b

feature-extraction
27B 60,656 122

mistralai/Mistral-Small-4-119B-2603

mistralai/Mistral-Small-4-119B-2603

other
119B 59,040 371

unsloth/Qwen-Image-2512-GGUF

unsloth/Qwen-Image-2512-GGUF

Image
58,387 349

mistralai/Voxtral-Small-24B-2507

mistralai/Voxtral-Small-24B-2507

audio-text-to-text
24B 57,560 489

microsoft/trocr-small-handwritten

microsoft/trocr-small-handwritten

image-to-text
57,512 63

Wan-AI/Wan2.1-T2V-14B

Wan-AI/Wan2.1-T2V-14B

Video
14B 55,179 1,493

google/tapas-large-finetuned-sqa

google/tapas-large-finetuned-sqa

table-question-answering
54,811 7

moonshotai/Kimi-Audio-7B-Instruct

moonshotai/Kimi-Audio-7B-Instruct

text-to-speech
7B 54,422 396

microsoft/rad-dino

microsoft/rad-dino

image-feature-extraction
54,290 73

HiDream-ai/HiDream-I1-Fast

HiDream-ai/HiDream-I1-Fast

Image
54,164 104

stabilityai/stable-diffusion-3.5-large

stabilityai/stable-diffusion-3.5-large

Image
52,564 3,454

Manojb/stable-diffusion-2-1-base

Manojb/stable-diffusion-2-1-base

Image
51,787 45

microsoft/deberta-v3-xsmall

microsoft/deberta-v3-xsmall

fill-mask
51,174 48

google/mobilenet_v2_1.0_224

google/mobilenet_v2_1.0_224

image-classification
50,768 42

google/siglip2-large-patch16-384

google/siglip2-large-patch16-384

zero-shot-image-classification
50,328 2

stabilityai/stable-video-diffusion-img2vid

stabilityai/stable-video-diffusion-img2vid

image-to-video
49,521 1,027

microsoft/wavlm-base

microsoft/wavlm-base

feature-extraction
48,356 11

nvidia/Cosmos-Predict2-2B-Video2World

nvidia/Cosmos-Predict2-2B-Video2World

image-to-video
2B 46,300 55

google/medsiglip-448

google/medsiglip-448

zero-shot-image-classification
45,906 137

google/owlv2-large-patch14-ensemble

google/owlv2-large-patch14-ensemble

zero-shot-object-detection
45,271 37

FastVideo/FastWan2.2-TI2V-5B-FullAttn-Diffusers

FastVideo/FastWan2.2-TI2V-5B-FullAttn-Diffusers

Video
5B 44,972 63

nvidia/Alpamayo-R1-10B

nvidia/Alpamayo-R1-10B

robotics
10B 44,589 393

Wan-AI/Wan2.1-T2V-14B-Diffusers

Wan-AI/Wan2.1-T2V-14B-Diffusers

Video
14B 44,306 50

microsoft/swin-base-patch4-window7-224

microsoft/swin-base-patch4-window7-224

image-classification
44,019 26

google/muril-base-cased

google/muril-base-cased

fill-mask
43,765 58

microsoft/BiomedVLP-CXR-BERT-general

microsoft/BiomedVLP-CXR-BERT-general

fill-mask
43,326 45

Command A Reasoning (111B)

CohereLabs/command-a-reasoning-08-2025

text-generation
111B 42,000 210

google/ddpm-cifar10-32

google/ddpm-cifar10-32

unconditional-image-generation
40,885 85

alibaba-pai/Wan2.2-Fun-Reward-LoRAs

alibaba-pai/Wan2.2-Fun-Reward-LoRAs

Video
14B 40,276 68

nvidia/llama-embed-nemotron-8b

nvidia/llama-embed-nemotron-8b

feature-extraction
8B 39,631 160

deepseek-ai/Janus-Pro-7B

deepseek-ai/Janus-Pro-7B

any-to-any
7B 38,771 3,593

Lightricks/LTX-Video-ICLoRA-detailer-13b-0.9.8

Lightricks/LTX-Video-ICLoRA-detailer-13b-0.9.8

Video
13B 38,713 30

zai-org/CogVideoX-5b

zai-org/CogVideoX-5b

Video
5B 38,572 672

magespace/Wan2.2-I2V-A14B-Lightning-Diffusers

magespace/Wan2.2-I2V-A14B-Lightning-Diffusers

Video
14B 37,975 2

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16

any-to-any
30B 37,418 198

microsoft/Multilingual-MiniLM-L12-H384

microsoft/Multilingual-MiniLM-L12-H384

text-classification
34,873 100

microsoft/infoxlm-base

microsoft/infoxlm-base

fill-mask
34,830 8

google/timesfm-2.0-500m-pytorch

google/timesfm-2.0-500m-pytorch

time-series-forecasting
34,309 252

microsoft/MiniLM-L12-H384-uncased

microsoft/MiniLM-L12-H384-uncased

text-classification
34,047 108

BAAI/llm-embedder

BAAI/llm-embedder

feature-extraction
33,550 128

microsoft/trocr-small-printed

microsoft/trocr-small-printed

image-to-text
32,589 48

nvidia/parakeet-ctc-0.6b

nvidia/parakeet-ctc-0.6b

automatic-speech-recognition
0.6B 32,499 26

nvidia/NV-Embed-v2

nvidia/NV-Embed-v2

feature-extraction
31,511 509

nvidia/GR00T-N1.6-3B

nvidia/GR00T-N1.6-3B

robotics
3B 31,077 87

BAAI/bge-reranker-v2-gemma

BAAI/bge-reranker-v2-gemma

text-classification
30,854 84

nvidia/segformer-b5-finetuned-cityscapes-1024-1024

nvidia/segformer-b5-finetuned-cityscapes-1024-1024

image-segmentation
30,033 43

microsoft/swin-tiny-patch4-window7-224

microsoft/swin-tiny-patch4-window7-224

image-classification
27,496 52

microsoft/rad-dino-maira-2

microsoft/rad-dino-maira-2

image-feature-extraction
24,821 24

nvidia/diar_streaming_sortformer_4spk-v2

nvidia/diar_streaming_sortformer_4spk-v2

automatic-speech-recognition
23,894 116

stabilityai/stable-virtual-camera

stabilityai/stable-virtual-camera

image-to-video
22,863 229

tiiuae/Falcon-OCR

tiiuae/Falcon-OCR

image-to-text
22,062 90

city96/Wan2.1-T2V-14B-gguf

city96/Wan2.1-T2V-14B-gguf

Video
14B 21,886 191

stabilityai/stable-audio-open-1.0

stabilityai/stable-audio-open-1.0

text-to-audio
21,867 1,452

nvidia/segformer-b3-finetuned-ade-512-512

nvidia/segformer-b3-finetuned-ade-512-512

image-segmentation
21,735 14

zai-org/CogVideoX-2b

zai-org/CogVideoX-2b

Video
2B 21,534 362

bullerwins/Wan2.2-T2V-A14B-GGUF

bullerwins/Wan2.2-T2V-A14B-GGUF

Video
14B 21,010 69

BAAI/bge-large-zh

BAAI/bge-large-zh

feature-extraction
19,271 344

Wan-AI/Wan2.1-T2V-1.3B

Wan-AI/Wan2.1-T2V-1.3B

Video
1.3B 18,926 448

tiiuae/Falcon-Perception

tiiuae/Falcon-Perception

mask-generation
18,741 118

QuantStack/Wan2.2-TI2V-5B-GGUF

QuantStack/Wan2.2-TI2V-5B-GGUF

Video
5B 17,978 173

deepseek-ai/Janus-Pro-1B

deepseek-ai/Janus-Pro-1B

any-to-any
1B 16,808 476

IPostYellow/TurboWan2.1-T2V-1.3B-Diffusers

IPostYellow/TurboWan2.1-T2V-1.3B-Diffusers

Video
1.3B 16,221 0

Abiray/LTX-2.3-22B-DISTILLED-1.1-GGUF

Abiray/LTX-2.3-22B-DISTILLED-1.1-GGUF

Video
22B 14,738 9

nvidia/GR00T-N1.7-3B

nvidia/GR00T-N1.7-3B

robotics
3B 13,857 24

allenai/specter

allenai/specter

feature-extraction
12,812 65

allenai/MolmoPoint-Vid-4B

allenai/MolmoPoint-Vid-4B

video-text-to-text
4B 12,773 9

google/medasr

google/medasr

automatic-speech-recognition
12,398 309

vrgamedevgirl84/Wan14BT2VFusioniX

vrgamedevgirl84/Wan14BT2VFusioniX

Video
14B 12,140 606

nvidia/omni-embed-nemotron-3b

nvidia/omni-embed-nemotron-3b

sentence-similarity
3B 11,870 119

stabilityai/stable-fast-3d

stabilityai/stable-fast-3d

image-to-3d
11,548 746

alibaba-pai/Wan2.1-Fun-14B-Control

alibaba-pai/Wan2.1-Fun-14B-Control

Video
14B 11,286 56

ByteDance/AnimateDiff-Lightning

ByteDance/AnimateDiff-Lightning

Video
11,153 984

QuantStack/Wan2.1_14B_VACE-GGUF

QuantStack/Wan2.1_14B_VACE-GGUF

Video
14B 10,617 243

google/tipsv2-b14

google/tipsv2-b14

zero-shot-image-classification
10,416 90

stabilityai/stable-video-diffusion-img2vid-xt-1-1

stabilityai/stable-video-diffusion-img2vid-xt-1-1

image-to-video
10,287 999

LiquidAI/LFM2.5-8B-A1B

LiquidAI/LFM2.5-8B-A1B

text-generation
8B 8,854 168

BAAI/bge-code-v1

BAAI/bge-code-v1

sentence-similarity
8,739 52

genmo/mochi-1-preview

genmo/mochi-1-preview

Video
7,930 1,324

moonshotai/MoonViT-SO-400M

moonshotai/MoonViT-SO-400M

image-feature-extraction
7,774 42

meta-llama/Llama-Prompt-Guard-2-22M

meta-llama/Llama-Prompt-Guard-2-22M

text-classification
7,559 41

jayn7/HunyuanVideo-1.5_T2V_720p-GGUF

jayn7/HunyuanVideo-1.5_T2V_720p-GGUF

Video
7,548 9

allenai/specter2_aug2023refresh_base

allenai/specter2_aug2023refresh_base

feature-extraction
7,520 3

google/tipsv2-l14

google/tipsv2-l14

zero-shot-image-classification
7,274 12

allenai/Molmo2-VideoPoint-4B

allenai/Molmo2-VideoPoint-4B

video-text-to-text
4B 7,077 19

Wan-AI/Wan2.2-TI2V-5B

Wan-AI/Wan2.2-TI2V-5B

Video
5B 6,559 584

BAAI/bge-reranker-v2-minicpm-layerwise

BAAI/bge-reranker-v2-minicpm-layerwise

text-classification
6,493 64

calcuis/wan-1.3b-gguf

calcuis/wan-1.3b-gguf

Video
1.3B 5,949 38

cerspense/zeroscope_v2_576w

cerspense/zeroscope_v2_576w

Video
5,841 492

hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-720p_t2v

hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-720p_t2v

Video
5,527 1

mistralai/Voxtral-4B-TTS-2603

mistralai/Voxtral-4B-TTS-2603

text-to-speech
4B 5,005 778

stabilityai/stable-audio-open-small

stabilityai/stable-audio-open-small

text-to-audio
4,941 254

Wan-AI/Wan2.2-T2V-A14B

Wan-AI/Wan2.2-T2V-A14B

Video
14B 4,756 479

QuantStack/Wan2.2-S2V-14B-GGUF

QuantStack/Wan2.2-S2V-14B-GGUF

Video
14B 4,745 75

Skywork/SkyReels-V2-DF-1.3B-540P-Diffusers

Skywork/SkyReels-V2-DF-1.3B-540P-Diffusers

Video
1.3B 4,461 2

google/tipsv2-so400m14

google/tipsv2-so400m14

zero-shot-image-classification
4,206 6

BAAI/Emu3-VisionTokenizer

BAAI/Emu3-VisionTokenizer

feature-extraction
4,184 63

deepseek-ai/Janus-1.3B

deepseek-ai/Janus-1.3B

any-to-any
1.3B 3,858 595

guoyww/animatediff-motion-adapter-v1-5-2

guoyww/animatediff-motion-adapter-v1-5-2

Video
3,687 29

alibaba-pai/Wan2.1-Fun-Reward-LoRAs

alibaba-pai/Wan2.1-Fun-Reward-LoRAs

Video
3,633 60

city96/HunyuanVideo-gguf

city96/HunyuanVideo-gguf

Video
3,596 188

calcuis/ltxv0.9.6-gguf

calcuis/ltxv0.9.6-gguf

Video
3,512 11

Motif-Technologies/Motif-Video-2B

Motif-Technologies/Motif-Video-2B

Video
2B 3,509 97

BAAI/bge-m3-unsupervised

BAAI/bge-m3-unsupervised

sentence-similarity
3,406 18

BAAI/BGE-VL-large

BAAI/BGE-VL-large

sentence-similarity
3,242 22

BestWishYsh/Helios-Distilled

BestWishYsh/Helios-Distilled

Video
14B 3,219 43

samuelchristlie/Wan2.1-T2V-1.3B-GGUF

samuelchristlie/Wan2.1-T2V-1.3B-GGUF

Video
1.3B 2,926 16

allenai/aspire-biencoder-biomed-scib

allenai/aspire-biencoder-biomed-scib

feature-extraction
2,800 0

wanabmeya/clip_vision_h.safetensors

wanabmeya/clip_vision_h.safetensors

Video
2,725 0

nvidia/nemotron-ocr-v2

nvidia/nemotron-ocr-v2

image-to-text
2,706 171

QuantStack/Wan2.2-Fun-A14B-Control-GGUF

QuantStack/Wan2.2-Fun-A14B-Control-GGUF

Video
14B 2,397 35

google/tipsv2-l14-dpt

google/tipsv2-l14-dpt

depth-estimation
2,336 3

jayn7/HunyuanVideo-1.5_T2V_480p-GGUF

jayn7/HunyuanVideo-1.5_T2V_480p-GGUF

Video
2,288 6

BAAI/bge-en-icl

BAAI/bge-en-icl

feature-extraction
2,279 136

Lightricks/LTX-Video-0.9.7-distilled

Lightricks/LTX-Video-0.9.7-distilled

Video
2,242 58

QuantStack/Wan2.2-Fun-A14B-InP-GGUF

QuantStack/Wan2.2-Fun-A14B-InP-GGUF

Video
14B 2,241 16

allenai/MolmoAct-7B-D-LIBERO-Goal-0812

allenai/MolmoAct-7B-D-LIBERO-Goal-0812

robotics
7B 2,137 0

tiiuae/Falcon-Perception-300M

tiiuae/Falcon-Perception-300M

object-detection
2,089 11

Runware/Wan2.2-TI2V-5B

Runware/Wan2.2-TI2V-5B

Video
5B 2,051 0

nvidia/Cosmos-1.0-Diffusion-7B-Text2World

nvidia/Cosmos-1.0-Diffusion-7B-Text2World

Video
7B 2,006 233

google/tipsv2-g14

google/tipsv2-g14

zero-shot-image-classification
1,871 9

guoyww/animatediff-motion-lora-zoom-out

guoyww/animatediff-motion-lora-zoom-out

Video
1,865 8

guoyww/animatediff-motion-lora-zoom-in

guoyww/animatediff-motion-lora-zoom-in

Video
1,848 9

guoyww/animatediff-motion-lora-pan-right

guoyww/animatediff-motion-lora-pan-right

Video
1,823 4

guoyww/animatediff-motion-lora-pan-left

guoyww/animatediff-motion-lora-pan-left

Video
1,815 3

guoyww/animatediff-motion-adapter-v1-5-3

guoyww/animatediff-motion-adapter-v1-5-3

Video
1,810 10

guoyww/animatediff-motion-lora-tilt-down

guoyww/animatediff-motion-lora-tilt-down

Video
1,782 5

guoyww/animatediff-motion-lora-tilt-up

guoyww/animatediff-motion-lora-tilt-up

Video
1,782 2

tencent/HunyuanVideo-1.5

tencent/HunyuanVideo-1.5

Video
1,537 981

BAAI/EVA-CLIP-18B

BAAI/EVA-CLIP-18B

feature-extraction
18B 1,460 16

BAAI/BGE-VL-base

BAAI/BGE-VL-base

sentence-similarity
1,411 28

BAAI/EVA-CLIP-8B

BAAI/EVA-CLIP-8B

feature-extraction
8B 1,346 50

stabilityai/stable-point-aware-3d

stabilityai/stable-point-aware-3d

image-to-3d
1,119 345

nvidia/parakeet-unified-en-0.6b

nvidia/parakeet-unified-en-0.6b

automatic-speech-recognition
0.6B 837 37

google/tipsv2-b14-dpt

google/tipsv2-b14-dpt

depth-estimation
738 11

tiiuae/siglino-0.6B

tiiuae/siglino-0.6B

image-feature-extraction
0.6B 576 13

BAAI/RoboBrain2.0-7B

BAAI/RoboBrain2.0-7B

robotics
7B 526 124

google/tipsv2-g14-dpt

google/tipsv2-g14-dpt

depth-estimation
509 9

deepseek-ai/JanusFlow-1.3B

deepseek-ai/JanusFlow-1.3B

any-to-any
1.3B 478 151

stabilityai/japanese-stable-clip-vit-l-16

stabilityai/japanese-stable-clip-vit-l-16

feature-extraction
466 28

tiiuae/siglino-70M

tiiuae/siglino-70M

image-feature-extraction
318 6

tiiuae/siglino-30M

tiiuae/siglino-30M

image-feature-extraction
307 6

BAAI/BGE-VL-MLLM-S2

BAAI/BGE-VL-MLLM-S2

sentence-similarity
7B 303 17

nvidia/asset-harvester

nvidia/asset-harvester

image-to-3d
284 34

BAAI/Emu3.5-Image

BAAI/Emu3.5-Image

image-text-to-image
281 75

BAAI/bge-reasoner-embed-qwen3-8b-0923

BAAI/bge-reasoner-embed-qwen3-8b-0923

feature-extraction
8B 263 26

google/tipsv2-so400m14-dpt

google/tipsv2-so400m14-dpt

depth-estimation
243 3

BAAI/bge-m3-retromae

BAAI/bge-m3-retromae

feature-extraction
213 18

BAAI/BGE-VL-MLLM-S1

BAAI/BGE-VL-MLLM-S1

sentence-similarity
7B 205 23

HuggingFaceH4/tiny-random-LlamaForSequenceClassification

HuggingFaceH4/tiny-random-LlamaForSequenceClassification

text-classification
181 0

BAAI/RoboBrain2.0-3B

BAAI/RoboBrain2.0-3B

robotics
3B 172 13

moonshotai/Kimi-Audio-7B

moonshotai/Kimi-Audio-7B

text-to-speech
7B 170 78

BAAI/BGE-VL-v1.5-zs

BAAI/BGE-VL-v1.5-zs

sentence-similarity
7B 166 9

BAAI/EVA-CLIP-8B-448

BAAI/EVA-CLIP-8B-448

feature-extraction
8B 155 15

tiiuae/siglino-moe-0.15-0.6B

tiiuae/siglino-moe-0.15-0.6B

image-feature-extraction
0.6B 145 7

BAAI/BGE-VL-Screenshot

BAAI/BGE-VL-Screenshot

sentence-similarity
3B 139 17

tiiuae/siglino-moe-0.3-0.6B

tiiuae/siglino-moe-0.3-0.6B

image-feature-extraction
0.6B 133 7

BAAI/BGE-VL-v1.5-mmeb

BAAI/BGE-VL-v1.5-mmeb

sentence-similarity
7B 114 12

HuggingFaceH4/vsft-llava-1.5-7b-hf-trl

HuggingFaceH4/vsft-llava-1.5-7b-hf-trl

image-to-text
7B 56 19

stabilityai/stable-diffusion-xl-refiner-0.9

stabilityai/stable-diffusion-xl-refiner-0.9

image-to-image
45 334

stabilityai/stable-codec-speech-16k

stabilityai/stable-codec-speech-16k

audio-to-audio
42 24

HuggingFaceH4/Qwen2.5-Math-1.5B-Instruct-PRM-0.2

HuggingFaceH4/Qwen2.5-Math-1.5B-Instruct-PRM-0.2

token-classification
1.5B 39 0

stabilityai/stable-video-diffusion-img2vid-xt-1-1-tensorrt

stabilityai/stable-video-diffusion-img2vid-xt-1-1-tensorrt

image-to-video
32 30

stabilityai/japanese-instructblip-alpha

stabilityai/japanese-instructblip-alpha

image-to-text
20 53

stabilityai/japanese-stable-vlm

stabilityai/japanese-stable-vlm

image-to-text
18 53

BAAI/RoboBrain2.0-32B

BAAI/RoboBrain2.0-32B

robotics
32B 12 44

HuggingFaceH4/Qwen2.5-Math-7B-Instruct-PRM-0.2

HuggingFaceH4/Qwen2.5-Math-7B-Instruct-PRM-0.2

token-classification
7B 10 0

HuggingFaceH4/tiny-random-LlamaForSeqClass

HuggingFaceH4/tiny-random-LlamaForSeqClass

text-classification
8 0

stabilityai/stable-codec-speech-16k-base

stabilityai/stable-codec-speech-16k-base

audio-to-audio
4 3

BAAI/RoboBrain-X0-Preview

BAAI/RoboBrain-X0-Preview

robotics
0 11

stabilityai/stable-zero123

stabilityai/stable-zero123

text-to-3d
0 764

Showing top 631 models. Use search above to find any of the 7,438+ models.

Frequently Asked Questions

How much VRAM do I need to run an LLM locally?

It depends on model size and quantization. The formula is: VRAM (GB) = parameters × bytes_per_param + 1.5 GB overhead. Q4_K_M uses 0.5 bytes/param, Q8 uses 1.0, FP16 uses 2.0. A 7B model needs ~5 GB at Q4_K_M and ~16 GB at FP16. Use the search above to find exact estimates for any model.

What GPU is best for running LLMs locally in 2026?

The NVIDIA RTX 4090 (24 GB) is the best consumer GPU for local LLM inference — it fits 13B–34B models at Q8 and 7B at FP16. The RTX 5090 (32 GB) extends that to 34B at Q8. For larger models like 70B, an Apple Silicon Mac Studio M4 Max (64–128 GB unified memory) is often more practical than a multi-GPU PC setup.

Can I run Llama 3 70B on a consumer GPU?

A single RTX 4090 (24 GB) is not enough for Llama 3 70B — it requires ~37 GB at Q4_K_M. You need a Mac Studio M4 Max with 64 GB+, dual RTX 4090s (48 GB combined via llama.cpp split), or an RTX 5090. The 8B variant runs easily on a single RTX 4060 or Mac mini M4.

What is quantization and why does it matter for GPU selection?

Quantization compresses model weights to fewer bits, reducing VRAM at a small quality cost. Q4_K_M (4-bit) halves VRAM vs FP16 with ~1–3% quality loss — the most popular format for consumer GPUs. Q8 (8-bit) is near-lossless. FP16 gives maximum quality but requires the most VRAM. Choosing the right quantization can mean the difference between a model fitting on your GPU or not.

Can I run LLMs on a Mac?

Yes — Apple Silicon is excellent for local LLM inference. The Mac mini M4 (16 GB) handles 7B–13B models. The Mac Studio M4 Pro (24–48 GB) covers 13B–34B models. The Mac Studio M4 Max (64–128 GB) can run 70B models at Q8 quality. Tools like Ollama, LM Studio, and llama.cpp all support Apple Silicon via Metal.

Is the Intel Arc B580 good for running LLMs?

Yes — the Intel Arc B580 (12 GB GDDR6) is one of the best value GPUs for LLMs in 2026. It gives 12 GB VRAM, enough for 13B models at Q4_K_M. It works with llama.cpp via Vulkan/SYCL, LM Studio, and Jan.ai. It does not support CUDA, so it is slightly slower than NVIDIA on some tasks, but performance per dollar is outstanding for budget builds.

Is the RTX 3090 still worth buying used for LLMs in 2026?

Yes — the RTX 3090 (24 GB VRAM) is still excellent value on the used market. It provides the same 24 GB VRAM as the RTX 4090 at roughly one-third the price, runs 32B models at Q4_K_M comfortably, and is about 20% slower per token. For users who prioritize maximum VRAM per dollar, the 3090 remains the best used-market pick for 24 GB VRAM in 2026.

Is the RTX 4060 Ti 16GB good for LLMs?

Yes — the RTX 4060 Ti 16GB is one of the best value NVIDIA GPUs for local LLMs. 16 GB is enough for 13B models at Q8 (~14 GB) and 20B models at Q4_K_M (~12 GB). It is slower than the RTX 4090 but costs roughly one-third as much. See the full RTX 4060 Ti guide for benchmark speeds.

Is the RTX 4070 Ti Super good for running LLMs?

Yes — the RTX 4070 Ti Super 16GB runs 13B at Q8 and 20B at Q4_K_M with 672 GB/s bandwidth — 2.3x faster than the RTX 4060 Ti 16GB (288 GB/s) at the same VRAM. It is the fastest 16 GB consumer GPU available and sits cleanly between the 4060 Ti and RTX 4090 in price and performance.

Can I run DeepSeek R1 locally?

Yes — but which version matters. The DeepSeek-R1-Distill models (7B, 8B, 14B, 32B, 70B) are standard dense models you can run on consumer hardware. The 7B distill needs ~6 GB at Q4 (RTX 4060 8GB works), the 14B needs ~9 GB at Q4 (Arc B580 12GB), and the 32B needs ~18 GB at Q4 (RTX 4090 24GB). The full DeepSeek R1 671B is a MoE model requiring server-class hardware. See the DeepSeek hardware guide for per-GPU compatibility.

What GPU do I need for Llama 4?

Llama 4 uses Mixture of Experts (MoE) architecture, which means it requires more VRAM than the "17B" name suggests. Llama 4 Scout (17B-16E) has ~109B total parameters across 16 experts — all of which must be loaded. The Q4 GGUF files are ~58–62 GB, requiring a Mac Studio M4 Max 64GB or 128GB, or dual RTX 4090s. For easier local inference, consider Llama 3.1 8B or Llama 3.3 70B instead.

What GPU do I need to run Qwen3 locally?

Qwen3 models are very accessible. Qwen3 4B at Q4_K_M needs ~3 GB (any 8 GB GPU); Qwen3 8B needs ~5 GB at Q4 (RTX 4060); Qwen3 14B needs ~9 GB at Q4 (12 GB GPU) or 15 GB at Q8 (16 GB GPU); Qwen3 32B needs ~18 GB at Q4 (RTX 4090 or Mac mini M4 Pro 48 GB). All Qwen3 models support thinking mode for chain-of-thought reasoning. Install via Ollama: ollama run qwen3:8b.

Is Ollama or LM Studio better for local LLMs?

It depends on your workflow. Ollama is better for developers: it runs as a background API service, works in Docker, and integrates with code. LM Studio is better for beginners: it has a polished GUI, a model browser, and a built-in chat interface. Both use llama.cpp under the hood so speed is identical on the same hardware. Many users install both. See the full comparison guide.

Is the RTX 4090 worth it for LLMs?

Yes, if you want the best single-GPU performance for local AI. The RTX 4090 24GB runs Qwen3 32B at Q4_K_M (~20 GB) at 35-48 tok/s and delivers up to 110 tok/s on 8B models thanks to 1008 GB/s bandwidth. It costs more than the RTX 4080, but the extra 8 GB of VRAM is a meaningful upgrade for running 30-34B models. The main limit: 70B models still do not fit at Q4_K_M.

What GPU do I need for Gemma 3 locally?

Gemma 3 model size determines VRAM needs. Gemma 3 4B fits in any 8 GB GPU (RTX 4060, Arc B580) at ~3 GB Q4_K_M. Gemma 3 12B needs 8 GB+ at Q4 (~8 GB). Gemma 3 27B, the flagship, needs 16 GB at Q4_K_M (~16 GB) — the RTX 4060 Ti 16GB, RTX 4080, or RTX 4090 are all good choices. On Mac, the Mac mini M4 24GB handles Gemma 3 12B comfortably.

What hardware do I need to run Llama 3.3 70B locally?

Llama 3.3 70B is Meta's best open-source model and requires ~43 GB VRAM at Q4_K_M. The easiest option is the Mac Studio M4 Max 64GB, which runs it at 14-20 tok/s. The RTX 5090 32GB fits only the lower-quality Q2_K quantization (~26 GB). Two RTX 4090s (48 GB combined) can run it via llama.cpp tensor splitting. Note: there is no 7B or 13B Llama 3.3 — only the 70B was released. For 24 GB GPUs, Qwen3 32B at Q4_K_M is a strong alternative.

What GPU do I need to run Phi-4 locally?

Phi-4 (14B) needs ~9 GB VRAM at Q4_K_M — any 12 GB GPU like the Intel Arc B580 or RTX 4070 handles it easily. Phi-4-mini (3.8B) needs only 2.5 GB and runs on any hardware including CPU. Despite being only 14B parameters, Phi-4 scores near 70B models on many benchmarks, making it an excellent VRAM-efficient choice for 8–12 GB GPUs.

What is the cheapest way to run 70B LLMs locally?

The cheapest single device that comfortably runs Llama 3.3 70B at Q4_K_M is the Mac mini M4 Pro with 48 GB unified memory. It runs 70B at 10-14 tok/s — acceptable for personal use. The Mac Studio M4 Max 64GB is faster at 14-20 tok/s. Dual RTX 4090s give higher token speed but cost considerably more for the GPUs alone, making the Mac mini M4 Pro the budget pick for 70B inference.

Can I run AI on my gaming PC?

Yes — your gaming GPU is exactly what local AI runs on. The GPU VRAM (not system RAM) determines which models you can run. An RTX 4060 8GB runs Qwen3 7B Q8 at 35 tok/s. An RTX 4080 16GB runs Qwen3 14B Q8 at 30 tok/s. An RTX 4090 24GB runs Qwen3 32B Q4 at 35-48 tok/s. Just install Ollama (free) and run ollama run qwen3:8b — it takes under 5 minutes.

Is local AI private? Does it send data to the cloud?

Local LLMs are 100% private — your data never leaves your computer. Tools like Ollama and LM Studio run the model entirely on your hardware with no network connection required after the initial model download. Compare this to ChatGPT or Claude, which process your data on their servers. Local AI is ideal for confidential documents, medical notes, or any sensitive work.

What LLMs can I run with 8 GB VRAM?

8 GB VRAM is enough for useful AI. The best options: Qwen3 7B Q8_0 (7.2 GB, 35 tok/s) is the top pick for quality. Qwen3 7B Q4_K_M (4.5 GB, 50 tok/s) is faster. Phi-4 14B Q4_K_M is possible at ~8.5 GB but requires exactly matching VRAM. Gemma 3 9B Q4 (5.5 GB) and Mistral 7B Q8 (7.2 GB) are solid alternatives. RTX 4060 and RTX 3060 users have the same 8 GB VRAM — 4060 is about 20% faster.

How do I run LLMs on Linux?

One command installs Ollama: curl -fsSL https://ollama.com/install.sh | sh. After that, run any model with ollama run qwen3:8b. NVIDIA GPUs work automatically after driver install. AMD GPUs need ROCm: sudo apt install rocm-hip-sdk then add yourself to the render and video groups and reboot. Intel Arc uses the i915 kernel driver with OpenCL. Ubuntu 22.04/24.04 has the best support; Fedora and Arch also work well.

Should I buy M4 Mac Mini or M4 Pro Mac Mini for LLMs?

M4 Mac Mini 24 GB is the sweet spot: runs all 7-14B models and is the best value entry point. M4 Pro 24 GB is 2.3x faster due to 273 vs 120 GB/s bandwidth — worth the premium for daily use. M4 Pro 48 GB is the only Mac Mini that fits Llama 3.3 70B. Skip the M4 16 GB — after the OS takes 4-5 GB, only 11 GB remains for models.

Should I buy RTX 5080 or RTX 4090 for local LLMs?

For 7-14B models, buy the RTX 5080: 16 GB GDDR7 runs identical models to the 4090 at similar speed for less money. For 32B models (Llama 3.3 32B Q4_K_M is 18.5 GB), only the RTX 4090 24 GB fits. Neither runs 70B at Q4. Buy the 5080 for everyday use; buy the 4090 only if 32B inference is a specific requirement.

How much does it cost to run LLMs locally?

After the one-time hardware cost, running LLMs locally is free — no subscription, no API fees, no per-token charges. A mid-range setup (such as an RTX 4070) plus the free Ollama software lets you run Qwen3 14B indefinitely. Electricity cost is minimal: roughly a few cents per hour of inference on a mid-range GPU.

Can I run AI models locally on a laptop?

Yes, but with limitations. Laptops with discrete GPUs (RTX 4060 mobile, RTX 4070 mobile) run 7-8B models well at 20-30 t/s. Integrated graphics (Intel Iris, AMD Radeon integrated) can run 7B models at 3-8 t/s via CPU. Apple MacBook Pro with M4 is excellent — the M4 Pro 24GB MacBook Pro runs 14B models at 40+ t/s. Gaming laptops with 8-16 GB VRAM are solid LLM machines.

What is the best free software to run LLMs locally?

Ollama is the most popular choice — one command installs it, another runs a model. It supports every major model and works on Windows, Mac, and Linux. LM Studio is better if you prefer a graphical interface with no terminal. Open WebUI adds a ChatGPT-like browser interface on top of Ollama. All three are free and open source.

How do I know if my GPU is good enough for AI?

Check your GPU's VRAM: 6-8 GB runs 7B models, 12 GB runs 14B models, 16-24 GB runs 32B models. If you have an NVIDIA GPU from the RTX 3000 or 4000/5000 series, or AMD RX 6000/7000/9000 series, or Apple Silicon, you're good. Use the VRAM Calculator on this site to find exactly which models fit your specific GPU.

Want exact VRAM estimates? Use the VRAM Calculator or search any model above.