# Welcome

predictor.sh is a CLI that turns local AI models into secure, internet-accessible API endpoints in seconds.

## What is predictor.sh?

With predictor.sh, you can expose local ML inference to the internet with a single command. Get a stable HTTPS URL, authentication, and observability without any cloud deployment complexity.

```bash
# Local model
predictor up ./llama-7b.gguf

# Or a HuggingFace model
predictor up TheBloke/Llama-2-7B-GGUF

# Your model is now live at https://abc123.predictor.sh
```
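
Once the endpoint is live, it can be called like any OpenAI-style API. The sketch below is a hedged example using curl: the `/v1/chat/completions` path and request shape follow the OpenAI convention the docs reference, and the URL, bearer token, and model name are placeholders to replace with your own values.

```bash
# Hypothetical request following the OpenAI chat-completions convention;
# replace the URL, token, and model name with the values from your deployment.
curl https://abc123.predictor.sh/v1/chat/completions \
  -H "Authorization: Bearer YOUR_PREDICTOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-7b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```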

## Key Features

| Feature | Description |
| --- | --- |
| One Command | Go from local model to public API in seconds |
| Stable URLs | Your endpoint URL survives disconnects and restarts |
| OpenAI Compatible | Drop-in replacement for the OpenAI APIs (see the example below) |
| Multi-Modal | Text, speech-to-text (Whisper), and text-to-speech |
| GPU Accelerated | Native Metal (macOS) and CUDA (NVIDIA) support |
| Built-in Auth | Bearer token authentication out of the box |
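
Because the endpoint is OpenAI-compatible and authenticated with a bearer token, existing OpenAI client code can usually be repointed rather than rewritten. A minimal sketch, assuming an SDK that reads its base URL and key from the standard environment variables (the token value is a placeholder):

```bash
# Repoint an existing OpenAI SDK client at the predictor.sh endpoint.
# OPENAI_BASE_URL / OPENAI_API_KEY are read by recent official OpenAI SDKs;
# the token below is a placeholder for whatever predictor.sh issues.
export OPENAI_BASE_URL="https://abc123.predictor.sh/v1"
export OPENAI_API_KEY="YOUR_PREDICTOR_TOKEN"
# Application code that already uses the OpenAI client now talks to your local model.
```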

## Supported Modalities

| Modality | Formats | APIs |
| --- | --- | --- |
| Text (LLM) | GGUF, SafeTensors, ONNX | OpenAI-compatible |
| Speech-to-Text | Whisper (16 variants) | OpenAI Whisper API (example below) |
| Text-to-Speech | Parler-TTS, Kokoro | OpenAI and ElevenLabs APIs |
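
For speech-to-text, the Whisper-style API means transcription is a multipart file upload. A hedged sketch with curl, assuming the standard `/v1/audio/transcriptions` path of the OpenAI Whisper API; the URL, token, audio file, and model name are placeholders:

```bash
# Hypothetical transcription request following the OpenAI Whisper API shape;
# substitute your own endpoint URL, token, audio file, and Whisper variant.
curl https://abc123.predictor.sh/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_PREDICTOR_TOKEN" \
  -F file=@meeting.wav \
  -F model=whisper-base
```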
