Environment Variables

predictor.sh reads these environment variables:

Reference

| Variable | Description | Required | Default |
| --- | --- | --- | --- |
| HF_TOKEN | HuggingFace access token | For gated models | - |
| CUDA_VISIBLE_DEVICES | GPU selection | No | All GPUs |
| RUST_LOG | Log level | No | info |
| ORT_DYLIB_PATH | ONNX Runtime library path | No | Bundled |


HuggingFace Token

A token is required for gated models such as Llama 3 and Mistral.

Get Your Token

  1. Create a new token with "Read" access at https://huggingface.co/settings/tokens

  2. Accept the model license on its HuggingFace page

Set the Token

```shell
# Export for the current session
export HF_TOKEN=hf_xxxxxxxxxxxxx

# Or set inline for a single command
HF_TOKEN=hf_xxx predictor up --hf meta-llama/Llama-3-8B
```

Persistent Setup

Add the export line to your shell profile (~/.bashrc or ~/.zshrc):
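A minimal sketch for bash (use ~/.zshrc for zsh); the token value is a placeholder:

```shell
# Append the export to your profile; replace the placeholder with your real token
echo 'export HF_TOKEN=hf_xxxxxxxxxxxxx' >> ~/.bashrc

# Apply it to the current session:
# source ~/.bashrc
```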


GPU Selection

Set CUDA_VISIBLE_DEVICES to control which GPUs predictor.sh uses:
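For example (GPU indices follow nvidia-smi ordering; the model name reuses the token example above):

```shell
# Restrict predictor.sh to GPUs 0 and 2 for this session
export CUDA_VISIBLE_DEVICES=0,2

# Or scope the setting to a single run:
# CUDA_VISIBLE_DEVICES=0 predictor up --hf meta-llama/Llama-3-8B
```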

This is a standard NVIDIA environment variable. See CUDA documentation for details.


Debug Logging

Set RUST_LOG to enable detailed logging when troubleshooting. Alternatively, pass the --verbose flag on the command line.
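A sketch assuming predictor.sh follows the standard Rust log-level convention (error, warn, info, debug, trace):

```shell
# Raise the log level from the default (info) to debug
export RUST_LOG=debug

# Then start as usual, or pass the flag instead of the variable:
# RUST_LOG=debug predictor up --hf meta-llama/Llama-3-8B
# predictor up --verbose --hf meta-llama/Llama-3-8B
```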


ONNX Runtime Path

Set ORT_DYLIB_PATH to override the bundled ONNX Runtime library:
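For example (the library path below is illustrative; point it at your actual ONNX Runtime build):

```shell
# Use a system-installed ONNX Runtime instead of the bundled copy
export ORT_DYLIB_PATH=/usr/local/lib/libonnxruntime.so
```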

This is rarely needed. Use only if you require a specific ONNX Runtime version.
