Enter your specs. Get model recommendations with Ollama install commands.
All models tested with OpenClaw for tool calling reliability. Q4_K_M quantization unless noted.
| Model | Params | VRAM (Q4) | VRAM (Q8) | Tool Calling | Speed | Best For |
|---|---|---|---|---|---|---|
| Qwen 3 8B | 8B | 5 GB | 9 GB | Unreliable | Fast | Testing only |
| Llama 3.1 8B | 8B | 5 GB | 9 GB | Unreliable | Fast | Simple chat |
| Phi-4 14B | 14B | 9 GB | 16 GB | Moderate | Medium | Budget agent |
| Qwen 3.5 27B ★ | 27B | 16 GB | 29 GB | Reliable | Medium | Best free agent |
| Qwen2.5-Coder 32B | 32B | 20 GB | 35 GB | Excellent | Medium | Code + agent |
| DeepSeek V3 | 37B* | 22 GB | 40 GB | Reliable | Slow | Reasoning |
| Llama 3.3 70B | 70B | 40 GB | 75 GB | Excellent | Slow | Power user |
| Mistral Large | 123B | 70 GB | 130 GB | Excellent | Slow | Premium local |
★ Recommended. * MoE active params. VRAM estimates are approximate and vary by context length and implementation.
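The VRAM column is roughly weights (params × bytes per param at the chosen quantization) plus a runtime margin for the KV cache and buffers. A minimal sketch of that arithmetic, where the 1.1× margin is an illustrative assumption rather than an Ollama-documented figure, and results drift from the table at the largest sizes:

```python
# Bytes per parameter for each quantization level (matches the table below).
BYTES_PER_PARAM = {"Q4_K_M": 0.57, "Q5_K_M": 0.69, "Q8_0": 1.0, "FP16": 2.0}

def vram_gb(params_billion: float, quant: str = "Q4_K_M", margin: float = 1.1) -> float:
    """Approximate VRAM in GB for a dense model.

    The 1.1x margin stands in for KV cache and runtime buffers;
    real usage varies with context length and implementation.
    """
    return params_billion * BYTES_PER_PARAM[quant] * margin

for p in (8, 27, 70):
    print(f"{p}B at Q4_K_M: ~{vram_gb(p):.0f} GB")
```

Note this dense-model formula does not apply cleanly to MoE models like DeepSeek V3, where active and total parameter counts differ.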
Quantization levels, for sizing the VRAM estimates above:

| Level | Bytes/Param | Quality | Use When |
|---|---|---|---|
| Q4_K_M | 0.57 | ~95% | Default choice. Best balance of quality and memory. |
| Q5_K_M | 0.69 | ~97% | Slightly better quality, 20% more VRAM. |
| Q8_0 | 1.0 | ~99% | Maximum quality. Double the VRAM of Q4. |
| FP16 | 2.0 | 100% | Full precision. Only if you have VRAM to spare. |
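Putting the two tables together: Ollama selects quantization via the model tag, and the default tag is typically Q4_K_M. Exact tag strings vary per model, so the commands below are illustrative; verify the tag on the model's Ollama library page before pulling.

```shell
# Default tag (usually Q4_K_M for Ollama library models)
ollama pull qwen2.5-coder:32b

# Explicit quantization tag -- check the model's library page,
# since tag naming differs between models
ollama pull qwen2.5-coder:32b-instruct-q8_0

# Start an interactive session with the pulled model
ollama run qwen2.5-coder:32b
```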