Text Generation (LLM)
Supported Formats
Format | Runtime | Best For
Supported Architectures
Architecture | GGUF | SafeTensors
Popular Models
Loading Models
Local GGUF File
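Before pointing a runtime at a local file, it can help to confirm that the file really is GGUF. The sketch below relies only on the GGUF header layout (a 4-byte `GGUF` magic followed by a little-endian 32-bit version). The helper name `read_gguf_version` is hypothetical and is not part of this project's API.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_version(path):
    """Return the GGUF format version, or raise ValueError if the file is not GGUF.

    Hypothetical helper for illustration; it only inspects the 8-byte header.
    """
    with Path(path).open("rb") as f:
        header = f.read(8)  # 4-byte magic + 4-byte little-endian uint32 version
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A truncated download or a SafeTensors file passed by mistake fails here with a clear error instead of failing later inside the loader.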
HuggingFace Model
Pre-download Models
Quantization
Quantization | Size Reduction | Quality | VRAM (7B model)
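As a rough guide to the VRAM column, the memory needed for the weights alone is approximately the parameter count times the bits per weight, divided by 8. The sketch below does only that arithmetic. The function name is hypothetical, and the result is a lower bound: real quantized formats store extra per-block scale data, and the KV cache and activations need memory on top of the weights.

```python
def estimate_weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes).

    Hypothetical helper: ignores quantization metadata, KV cache, and activations.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Weights-only estimate for a 7B-parameter model at common bit widths.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {estimate_weight_memory_gb(7e9, bits):.1f} GB")
```

For a 7B model this prints 14.0 GB at 16 bits, 7.0 GB at 8 bits, and 3.5 GB at 4 bits.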
Gated Models
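Gated repositories on the Hugging Face Hub require an access token from an account that has accepted the model's terms. The sketch below shows one way to attach such a token to a file request using only the standard library. It assumes the token is in the `HF_TOKEN` environment variable, which is the variable `huggingface_hub` reads; this project may expect the token through a different setting. The function name `build_gated_request` is hypothetical.

```python
import os
import urllib.request


def build_gated_request(repo_id, filename, revision="main", token=None):
    """Build (but do not send) an authenticated request for a file in a Hub repo.

    Hypothetical helper for illustration. Falls back to the HF_TOKEN
    environment variable when no token is passed explicitly.
    """
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("No access token found; set HF_TOKEN or pass token=")
    url = f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

Passing the returned request to `urllib.request.urlopen` would perform the download. A 401 or 403 response usually means the token is missing or invalid, or that access to the gated model has not been granted.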
Benchmarking
API Endpoints