IllumiChat applies rate limits to protect platform stability and ensure fair usage.
Rate Limit Tiers
| Tier | Limit | Scope |
|---|---|---|
| Authenticated endpoints | Generous limits per user | Per user |
| Widget public endpoints | 5 requests per minute | Per IP address |
| File downloads | 20 requests per minute | Per user |
| SMS webhooks | Governed by Twilio sending rate | Per assistant |
Authenticated endpoint limits are designed to support normal application usage. If you are building a high-throughput integration, see "Requesting Higher Limits" below.
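To stay under a published limit such as the widget tier's 5 requests per minute, a client can throttle itself before sending. The following is a minimal sketch of a sliding-window limiter. The class name `SlidingWindowLimiter` and its methods are illustrative and not part of any IllumiChat SDK. It also assumes a sliding window, which may differ from how the server actually counts requests, and it only tracks requests from one process, whereas the widget limit applies per IP address.

```javascript
// Client-side sliding-window limiter (illustrative; names are assumptions).
class SlidingWindowLimiter {
  constructor(maxRequests, windowMs, now = () => Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, useful for testing
    this.timestamps = []; // send times of requests still inside the window
  }

  // Returns 0 if a request may be sent now, otherwise the milliseconds to wait.
  msUntilAllowed() {
    const cutoff = this.now() - this.windowMs;
    // Drop timestamps that have left the window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.maxRequests) return 0;
    return this.timestamps[0] + this.windowMs - this.now();
  }

  // Records a request if one is allowed; returns true when the caller may send it.
  tryAcquire() {
    if (this.msUntilAllowed() > 0) return false;
    this.timestamps.push(this.now());
    return true;
  }
}

// 5 requests per 60 seconds, matching the widget public endpoint tier.
const widgetLimiter = new SlidingWindowLimiter(5, 60000);
```

Call `widgetLimiter.tryAcquire()` before each request. When it returns `false`, wait for `widgetLimiter.msUntilAllowed()` milliseconds and try again.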
In addition to the rate limits above, widget usage is counted per assistant and is subject to a message quota determined by your subscription plan.
| Detail | Description |
|---|---|
| Tracking | Message count is tracked per assistant |
| Reset cycle | Quotas reset periodically |
| Plan limits | Quota thresholds depend on your workspace subscription tier |
| Enforcement | Requests exceeding the quota receive a 429 response |
Rate Limit Response
When any rate limit is exceeded, the API returns HTTP status 429 with a JSON error body. For example, exceeding the widget limit returns:

    {
      "error": "Rate limit exceeded",
      "code": "rate_limit:widget"
    }

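A client may want to turn this body into a structured value before deciding how to react. The sketch below assumes that the part of `code` after the colon names the limit that was hit. Only `rate_limit:widget` appears in this document, so any other value is a guess. The function name `parseRateLimitError` is hypothetical.

```javascript
// Interprets a 429 response body of the documented shape:
//   { "error": "Rate limit exceeded", "code": "rate_limit:widget" }
// Assumption: the text after the colon in `code` identifies the limit.
function parseRateLimitError(status, body) {
  if (status !== 429) return null; // not a rate limit response

  const code = typeof body?.code === "string" ? body.code : "";
  const separator = code.indexOf(":");

  return {
    message: typeof body?.error === "string" ? body.error : "Rate limit exceeded",
    code,
    // "widget" for "rate_limit:widget"; null when the code has no suffix.
    scope: separator === -1 ? null : code.slice(separator + 1),
  };
}

const info = parseRateLimitError(429, {
  error: "Rate limit exceeded",
  code: "rate_limit:widget",
});
console.log(info.scope); // "widget"
```

Returning `null` for any status other than 429 lets the caller fall through to its normal error handling.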
Handling Rate Limits
Implement Exponential Backoff
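
    // Retry a request with exponential backoff while the API responds with 429.
    async function fetchWithRetry(url, options, maxRetries = 5) {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        const response = await fetch(url, options);
        if (response.status !== 429) return response;

        // Do not wait after the final attempt; fail immediately instead.
        if (attempt === maxRetries - 1) break;

        // Double the delay on each attempt (1s, 2s, 4s, ...), capped at 30s.
        const delayMs = Math.min(1000 * Math.pow(2, attempt), 30000);
        console.warn(`Rate limited. Retrying in ${delayMs}ms...`);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
      throw new Error("Max retries exceeded");
    }
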
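When many clients are rate limited at the same moment, identical backoff delays can make them all retry together. A common refinement, not part of the function above, is to add random jitter. The sketch below uses the "full jitter" variant, which picks a random delay between 0 and the exponential ceiling. The function name `backoffWithJitter` and its parameters are illustrative.

```javascript
// Full-jitter backoff: a random delay in [0, min(baseMs * 2^attempt, maxMs)).
// `random` is injectable so the function can be tested deterministically.
function backoffWithJitter(attempt, baseMs = 1000, maxMs = 30000, random = Math.random) {
  const ceiling = Math.min(baseMs * 2 ** attempt, maxMs);
  return Math.floor(random() * ceiling);
}
```

To try it in `fetchWithRetry`, replace the `delayMs` calculation with `const delayMs = backoffWithJitter(attempt);`.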
Additional Strategies
- Cache responses that do not change frequently (e.g., assistant configurations)
- Batch operations where the API supports it
- Monitor usage to identify opportunities for optimization
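As an illustration of the caching strategy, here is a minimal in-memory cache with a time-to-live. It is a sketch under stated assumptions. `TtlCache` and `getAssistantConfig` are hypothetical names, and `loadConfig` stands in for whatever API call your integration uses to fetch an assistant configuration.

```javascript
// Minimal in-memory TTL cache (illustrative only).
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful for testing
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // evict the expired entry
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Hypothetical helper: only calls `loadConfig` when the cache has no fresh copy.
async function getAssistantConfig(cache, assistantId, loadConfig) {
  const cached = cache.get(assistantId);
  if (cached !== undefined) return cached;
  const config = await loadConfig(assistantId);
  cache.set(assistantId, config);
  return config;
}
```

A short TTL, for example a few minutes, keeps configuration reasonably fresh while removing most repeated requests. The right value depends on how often your assistants change.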
Requesting Higher Limits
If your use case requires higher rate limits, contact the IllumiChat support team with:
- Your workspace ID
- The endpoints you need higher limits for
- Your expected request volume
- A description of your integration
Before requesting higher limits, review your integration for opportunities to reduce request volume through caching, batching, and efficient polling intervals.