# Quickstart
Get your first model endpoint running in under 5 minutes.
## Prerequisites

- predictor.sh CLI installed (see Installation)
- A predictor.sh account
## Step 1: Authenticate
```bash
predictor login
```

This opens your browser for OAuth authentication. Once approved, you're ready to go.
## Step 2: Serve a Model

### Option A: Local Model File
If you have a GGUF model file:
```bash
predictor up ./llama-7b-q4.gguf
```

### Option B: HuggingFace Model
Download and serve a model directly from HuggingFace.
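The exact argument form for HuggingFace models isn't shown in this excerpt; a sketch, assuming `predictor up` accepts a HuggingFace repo identifier (the repo name is only an example; verify the supported syntax with `predictor up --help`):

```bash
# Hypothetical syntax: the repo-identifier form is an assumption, not confirmed by these docs
predictor up TheBloke/Llama-2-7B-GGUF
```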
## Step 3: Use Your Endpoint
Once running, you'll see output that includes your endpoint's public URL.
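An illustrative sketch; the exact lines and the URL shape are assumptions, not copied from the real CLI:

```
✔ Loaded llama-7b-q4.gguf
✔ Endpoint live at https://<your-endpoint>.predictor.sh
```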
### Test with curl
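A minimal sketch, assuming the endpoint exposes an OpenAI-compatible `/v1/chat/completions` route (implied by the SDK section below); substitute the URL printed by `predictor up` and your model name:

```bash
curl https://<your-endpoint>.predictor.sh/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-7b-q4",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```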
### Use with OpenAI SDK
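Because the endpoint speaks the OpenAI API shape, you can point the official Python SDK at it by overriding `base_url`. A minimal sketch; the URL, model name, and whether an API key is enforced are assumptions:

```python
from openai import OpenAI

# Point the SDK at your predictor.sh endpoint instead of api.openai.com.
# base_url and model are placeholders: substitute the values printed by `predictor up`.
client = OpenAI(
    base_url="https://<your-endpoint>.predictor.sh/v1",
    api_key="unused",  # assumption: a self-hosted endpoint may not validate this
)

response = client.chat.completions.create(
    model="llama-7b-q4",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```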
## Step 4: Monitor Your Endpoint
View live stats in the terminal UI, or check logs from the CLI.
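A sketch, assuming a `logs` subcommand exists; this excerpt doesn't show the exact command, so verify with `predictor --help`:

```bash
# Hypothetical subcommand: confirm the real name with `predictor --help`
predictor logs
```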
## Step 5: Shutdown
Press Ctrl+C in the serving terminal, or stop the endpoint from another terminal.
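A sketch, assuming a `down` subcommand mirrors `predictor up`; the real command isn't shown in this excerpt, so verify with `predictor --help`:

```bash
# Hypothetical subcommand: confirm the real name with `predictor --help`
predictor down
```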
Your URL remains reserved for when you come back online.
## Next Steps

- Text Generation Models - Learn about supported LLM formats
- Speech-to-Text - Set up Whisper transcription
- Text-to-Speech - Add voice synthesis
- API Reference - Full API documentation