Retries
Retry failed requests automatically with exponential backoff. Configure which HTTP error codes trigger retries and how many attempts to make.
Fallbacks
Route to a different model when the primary fails. Define a fallback chain across providers for high availability.
Retries
Automatically retry failed requests with exponential backoff.
Quick Start
Configuration
| Parameter | Type | Required | Description |
|---|---|---|---|
| `count` | number | Yes | Max retry attempts (1-5) |
| `on_codes` | number[] | No | HTTP status codes that trigger retries (default: `[429]`) |
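The two parameters above can be assembled and checked before a request is sent. The following is a minimal sketch: `build_retry_config` is a hypothetical helper, not part of any SDK, and only the field names, the 1-5 range, and the `[429]` default are taken from the table. Where this object sits inside a request body is not shown here.

```python
from typing import List, Optional

DEFAULT_ON_CODES = [429]  # documented default for on_codes


def build_retry_config(count: int, on_codes: Optional[List[int]] = None) -> dict:
    """Return a dict holding the two documented retry fields.

    Hypothetical helper for illustration only.
    """
    if isinstance(count, bool) or not isinstance(count, int):
        raise TypeError("count must be an integer")
    if not 1 <= count <= 5:
        raise ValueError("count must be between 1 and 5")
    # Fall back to the documented default when on_codes is omitted.
    codes = list(DEFAULT_ON_CODES) if on_codes is None else list(on_codes)
    for code in codes:
        if not 100 <= code <= 599:
            raise ValueError(f"invalid HTTP status code: {code}")
    return {"count": count, "on_codes": codes}


print(build_retry_config(3))              # {'count': 3, 'on_codes': [429]}
print(build_retry_config(2, [429, 503]))  # {'count': 2, 'on_codes': [429, 503]}
```

Validating on the client side surfaces an out-of-range `count` early instead of as a 400 response.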
Error Codes
| Code | Meaning | Retry? | Common Cause |
|---|---|---|---|
| 429 | Rate limit exceeded | Yes | Too many requests |
| 500 | Internal server error | Yes | Provider issue |
| 501 | Not implemented | Yes | Feature unavailable |
| 502 | Bad gateway | Yes | Network/Gateway issue |
| 503 | Service unavailable | Yes | Provider maintenance |
| 504 | Gateway timeout | Yes | Provider overload |
| 400 | Bad request | No | Invalid parameters |
| 401 | Unauthorized | No | Invalid API key |
| 403 | Forbidden | No | Access denied |
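To make the table concrete, here is a rough sketch of how a retry decision could be modeled on the client side. It is not the gateway's implementation. It assumes two things the table does not state: that `count` means the number of retries after the initial request, and that codes marked "No" are never retried even if they appear in `on_codes`. The names `should_retry` and `send_with_retries` are hypothetical, and the wait between attempts is left out for brevity.

```python
from typing import Callable, Iterable, Tuple

# Codes the table above marks "Yes" in the Retry? column.
RETRYABLE = {429, 500, 501, 502, 503, 504}


def should_retry(status: int, on_codes: Iterable[int] = (429,)) -> bool:
    """True when the status is listed in on_codes and is retryable per the table."""
    return status in set(on_codes) and status in RETRYABLE


def send_with_retries(
    send: Callable[[], int], count: int, on_codes: Iterable[int] = (429,)
) -> Tuple[int, int]:
    """Call send() once, then retry up to `count` times.

    Returns (final_status, retries_used). Assumption: `count` is the number
    of retries after the first request.
    """
    codes = set(on_codes)
    status = send()
    retries = 0
    while retries < count and should_retry(status, codes):
        retries += 1
        status = send()
    return status, retries


# Usage with a stubbed request that fails twice and then succeeds.
statuses = iter([503, 429, 200])
print(send_with_retries(lambda: next(statuses), count=3, on_codes=[429, 503]))  # (200, 2)
```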
Retry Strategies
Backoff Algorithm
Exponential backoff with jitter
- Attempt 1: 1s (±25%)
- Attempt 2: 2s (±25%)
- Attempt 3: 4s (±25%)
- Attempt 4: 8s (±25%)
- Attempt 5: 16s (±25%)
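The schedule above follows a doubling pattern, which can be sketched as below. The function name `backoff_delay`, the 1-second base, and the uniform jitter distribution are assumptions made for illustration. The source only states the nominal delays and the ±25% range.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, jitter: float = 0.25, rng=random) -> float:
    """Delay in seconds before retry `attempt` (1-based).

    Nominal delay is base * 2**(attempt - 1), scaled by a random factor
    in [1 - jitter, 1 + jitter]. Uniform jitter is an assumption.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    nominal = base * 2 ** (attempt - 1)
    return nominal * (1 + rng.uniform(-jitter, jitter))


# With jitter disabled, the nominal schedule from the list above is reproduced.
nominal = [backoff_delay(n, jitter=0.0) for n in range(1, 6)]
print(nominal)       # [1.0, 2.0, 4.0, 8.0, 16.0]
print(sum(nominal))  # 31.0
```

Jitter spreads out retries from many clients so they do not all hit the provider at the same instant after a shared failure.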
Code examples
Best Practices
Production recommendations
Follow this advice for a production setup:
- Use `count: 2-3` for a balance of reliability and speed
- Always include `429` (rate limits) in `on_codes`
- Monitor retry rates to detect systemic issues
- Implement a circuit breaker for persistent failures
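The circuit breaker recommendation can be illustrated with a small client-side sketch. This is one common shape of the pattern, not something the platform provides. The class name, the thresholds, and the injectable `clock` are all hypothetical choices.

```python
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures.

    Once open, requests are blocked until `cooldown` seconds have passed,
    after which a trial request is allowed (the "half-open" state).
    """

    def __init__(self, threshold: int = 3, cooldown: float = 30.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a request may be sent right now."""
        if self.opened_at is None:
            return True
        # Half-open: permit a trial request once the cooldown has elapsed.
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        """Reset the failure count and close the circuit."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure and (re)open the circuit at the threshold."""
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

A caller would check `allow()` before each request, then call `record_success()` or `record_failure()` depending on whether the request still failed after its retries were exhausted.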
Error handling
Troubleshooting
High retry rates
- Check if you’re hitting rate limits frequently
- Verify API keys have sufficient quotas
- Monitor provider status pages for outages
- Reduce retry count for latency-sensitive apps
- Use shorter timeout values with retries
- Consider fallbacks for faster alternatives
- Check if error codes are in the `on_codes` list
- Verify retry count isn’t exhausted
- Review provider-specific error documentation
Monitoring
Track these retry metrics:
Limitations
- Increased latency: Retries add delay (about 31s of cumulative backoff for 5 attempts, before jitter)
- Cost implications: Failed requests may still incur charges
- Rate limit consumption: Each retry counts against quotas
- Limited retries: Maximum 5 attempts to prevent excessive delays
- Non-retryable errors: 4xx client errors other than 429 (such as 400, 401, and 403) are not retried
Advanced Usage
Environment-specific configs:
Fallbacks
Automatically switch to a different model when the primary fails.
Quick Start
Configuration
| Parameter | Type | Required | Description |
|---|---|---|---|
| `fallbacks` | Array | Yes | List of fallback models in order of preference |
| `model` | string | Yes | Model identifier for each fallback |
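As a rough illustration of the shape the table describes, the sketch below builds a `fallbacks` array in which each entry carries a `model` string. The helper `build_fallbacks` and the model identifiers are hypothetical placeholders. Where this object is placed in a request is not covered here.

```python
from typing import Dict, List


def build_fallbacks(models: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Return {"fallbacks": [{"model": ...}, ...]}, keeping the order of preference.

    Hypothetical helper for illustration only.
    """
    if not models:
        raise ValueError("at least one fallback model is required")
    for model in models:
        if not isinstance(model, str) or not model.strip():
            raise ValueError("each model identifier must be a non-empty string")
    return {"fallbacks": [{"model": model} for model in models]}


# Placeholder identifiers; substitute the model IDs available to your workspace.
config = build_fallbacks(["provider-a/model-x", "provider-b/model-y"])
print(config)
# {'fallbacks': [{'model': 'provider-a/model-x'}, {'model': 'provider-b/model-y'}]}
```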
Trigger Conditions
Fallbacks activate on these errors:
| Error Code | Description | Triggers Fallback |
|---|---|---|
| 429 | Rate limit exceeded | Yes |
| 500 | Internal server error | Yes |
| 501 | Not implemented | Yes |
| 502 | Bad gateway | Yes |
| 503 | Service unavailable | Yes |
| 504 | Gateway timeout | Yes |
| 400 | Bad request | No |
| 401 | Unauthorized | No |
| 403 | Forbidden | No |
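The trigger conditions above amount to walking an ordered chain of models until one returns a non-trigger status. The sketch below is a conceptual model of that behavior, not the gateway's implementation. `call_with_fallbacks` and the injected `call` function are hypothetical, and the model names in the usage example are placeholders.

```python
from typing import Callable, List, Tuple

# Codes the table above marks "Yes" in the Triggers Fallback column.
FALLBACK_CODES = {429, 500, 501, 502, 503, 504}


def call_with_fallbacks(
    primary: str,
    fallbacks: List[str],
    call: Callable[[str], Tuple[int, str]],
) -> Tuple[str, int, str]:
    """Try the primary model, then each fallback in order.

    `call(model)` returns (status, body). Returns (model, status, body) for
    the first response that does not trigger a fallback, or for the last
    model tried if every attempt triggers one.
    """
    result = (primary, 0, "")
    for model in [primary, *fallbacks]:
        status, body = call(model)
        result = (model, status, body)
        if status not in FALLBACK_CODES:
            break  # success, or a non-trigger error such as 400/401/403
    return result


# Usage with stubbed responses: the primary and first fallback both fail.
responses = {"primary": (503, ""), "backup-1": (429, ""), "backup-2": (200, "ok")}
print(call_with_fallbacks("primary", ["backup-1", "backup-2"], lambda m: responses[m]))
# ('backup-2', 200, 'ok')
```

Because attempts run one after another, each additional model in the chain adds to worst-case latency.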
Best Practices
Use a maximum of 3 fallback models. Order them by preference or cost, and choose models with similar capabilities.
Code examples
Limitations
- Response consistency: Different models may return varying output styles
- Parameter support: Not all providers support identical parameters
- Cost implications: Failed requests may still incur charges from the primary provider
- Latency impact: Sequential attempts add processing time
- Provider dependencies: Requires API keys for all fallback providers