# Environment Variables

predictor.sh reads these environment variables:
## Reference

| Variable | Description | Required | Default |
| --- | --- | --- | --- |
| `HF_TOKEN` | HuggingFace access token | For gated models | - |
| `CUDA_VISIBLE_DEVICES` | GPU selection | No | All GPUs |
| `RUST_LOG` | Log level | No | `info` |
| `ORT_DYLIB_PATH` | ONNX Runtime library path | No | Bundled |
## HuggingFace Token

Required for gated models such as Llama 3 and Mistral.
### Get Your Token

1. Create a new token with "Read" access at https://huggingface.co/settings/tokens
2. Accept the model license on its HuggingFace page
### Set the Token

```bash
# Export for session
export HF_TOKEN=hf_xxxxxxxxxxxxx

# Or set inline
HF_TOKEN=hf_xxx predictor up --hf meta-llama/Llama-3-8B
```

### Persistent Setup
Add to your shell profile (~/.bashrc, ~/.zshrc):
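One way to do this (a sketch; the `hf_…` value is a placeholder for your actual token):

```bash
# Append the export to your profile so new shells pick it up
echo 'export HF_TOKEN=hf_xxxxxxxxxxxxx' >> ~/.bashrc

# Reload the current shell
source ~/.bashrc
```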
## GPU Selection

Control which GPUs predictor.sh uses:
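For example (device indices are illustrative; `CUDA_VISIBLE_DEVICES` follows standard CUDA semantics):

```bash
# Run on GPU 0 only
CUDA_VISIBLE_DEVICES=0 predictor up --hf meta-llama/Llama-3-8B

# Run on GPUs 0 and 2
CUDA_VISIBLE_DEVICES=0,2 predictor up --hf meta-llama/Llama-3-8B

# Hide all GPUs (force CPU)
CUDA_VISIBLE_DEVICES="" predictor up --hf meta-llama/Llama-3-8B
```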
## Debug Logging

Enable detailed logging for troubleshooting:
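A sketch, assuming `RUST_LOG` accepts the standard Rust `env_logger`-style levels (`error`, `warn`, `info`, `debug`, `trace`):

```bash
# Verbose logging for one run
RUST_LOG=debug predictor up --hf meta-llama/Llama-3-8B

# Trace-level logging (very noisy)
RUST_LOG=trace predictor up --hf meta-llama/Llama-3-8B
```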
Alternatively, use the --verbose flag:
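For example (only the `--verbose` flag itself is confirmed here; the rest of the invocation is reused from the example above):

```bash
predictor up --hf meta-llama/Llama-3-8B --verbose
```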
## ONNX Runtime Path

Override the bundled ONNX Runtime library:
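For example (the path is illustrative, and the library filename varies by platform, e.g. `.dylib` on macOS or `.dll` on Windows):

```bash
# Point at a custom ONNX Runtime shared library
export ORT_DYLIB_PATH=/opt/onnxruntime/lib/libonnxruntime.so
predictor up --hf meta-llama/Llama-3-8B
```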
This is rarely needed. Use only if you require a specific ONNX Runtime version.