IllumiChat applies rate limits to protect platform stability and ensure fair usage.

Rate Limit Tiers

Tier | Limit | Scope
Authenticated endpoints | Generous limits per user | Per user
Widget public endpoints | 5 requests per minute | Per IP address
File downloads | 20 requests per minute | Per user
SMS webhooks | Governed by Twilio sending rate | Per assistant

Authenticated endpoint limits are designed to support normal application usage. If you are building a high-throughput integration, see the guidance below on requesting increased limits.
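
For the fixed numeric tiers, a client can avoid most 429 responses by pacing its own requests. The following is a minimal sketch of a client-side sliding-window limiter. The class name `SlidingWindowLimiter` and its methods are hypothetical and not part of IllumiChat. It also assumes the server counts requests over a rolling window, which this page does not confirm, so treat it as a courtesy throttle and not a guarantee.

```javascript
// Hypothetical helper: a client-side sliding-window limiter.
// Example values (5 requests per 60 s) mirror the widget public endpoint tier.
class SlidingWindowLimiter {
  constructor(maxRequests, windowMs, now = () => Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;        // injectable clock, useful for testing
    this.timestamps = [];  // send times of requests still inside the window
  }

  // Returns true and records the request if it fits in the current window.
  tryAcquire() {
    const t = this.now();
    // Drop timestamps that have aged out of the window.
    while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(t);
    return true;
  }

  // Milliseconds until the next request would be allowed (0 if allowed now).
  msUntilAvailable() {
    const t = this.now();
    const live = this.timestamps.filter((ts) => t - ts < this.windowMs);
    if (live.length < this.maxRequests) return 0;
    return live[0] + this.windowMs - t;
  }
}
```

A caller would check `tryAcquire()` before each request and, when it returns false, wait for `msUntilAvailable()` milliseconds before trying again.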

Widget Quotas

Widget usage is tracked per assistant and is subject to quotas set by your subscription plan.
Detail | Description
Tracking | Message count is tracked per assistant
Reset cycle | Quotas reset periodically
Plan limits | Quota thresholds depend on your workspace subscription tier
Enforcement | Requests exceeding the quota receive a 429 response

Rate Limit Response

When any rate limit is exceeded, the API returns HTTP status 429:
{
  "error": "Rate limit exceeded",
  "code": "rate_limit:widget"
}
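
Client code may want to tell which limit was hit before deciding how to react. The sketch below reads the `code` field from the body shown above. The function name `parseRateLimitError` is hypothetical, and the `rate_limit:<scope>` pattern is an assumption extrapolated from the single `rate_limit:widget` example; other codes may have a different shape.

```javascript
// Assumption: "code" follows a "rate_limit:<scope>" pattern, as in the sample
// body above. Only "rate_limit:widget" is confirmed by this page.
function parseRateLimitError(status, body) {
  if (status !== 429) return null; // not a rate-limit response
  const code = body && typeof body.code === "string" ? body.code : "";
  const prefix = "rate_limit:";
  return {
    message: (body && body.error) || "Rate limit exceeded",
    code,
    // Scope is whatever follows the prefix, or null if the code has another shape.
    scope: code.startsWith(prefix) ? code.slice(prefix.length) : null,
  };
}
```

For the sample body, the returned object has `scope` set to `"widget"`; for any status other than 429, the function returns `null`.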

Handling Rate Limits

Implement Exponential Backoff

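// Retries a request with exponential backoff while the API responds with 429.
async function fetchWithRetry(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    // Any status other than 429 (success or a different error) goes back to the caller.
    if (response.status !== 429) return response;
    // Do not wait after the final attempt; fall through to the error below.
    if (attempt === maxRetries - 1) break;

    // Delay doubles each attempt (1s, 2s, 4s, ...) and is capped at 30 seconds.
    const delayMs = Math.min(1000 * Math.pow(2, attempt), 30000);
    console.warn(`Rate limited. Retrying in ${delayMs}ms...`);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error("Max retries exceeded");
}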
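
When many clients are rate limited at the same moment, identical backoff delays can make them all retry together. A common variation, not part of the function above, is to randomize the delay ("full jitter"). The sketch below assumes the same 1000 ms base and 30000 ms cap; `backoffDelayMs` is a hypothetical helper name.

```javascript
// Hypothetical helper: exponential backoff with "full jitter".
// baseMs and capMs default to the values used in fetchWithRetry above.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  // Upper bound grows exponentially with the attempt number, up to the cap.
  const ceiling = Math.min(baseMs * Math.pow(2, attempt), capMs);
  // Pick a uniformly random whole-millisecond delay in [0, ceiling).
  return Math.floor(random() * ceiling);
}
```

To use it, the `delayMs` calculation in `fetchWithRetry` could be replaced with `backoffDelayMs(attempt)`. The `random` parameter exists only so the function can be tested deterministically.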

Additional Strategies

  • Cache responses that do not change frequently (e.g., assistant configurations)
  • Batch operations where the API supports it
  • Monitor usage to identify opportunities for optimization
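
As an illustration of the caching strategy, here is a minimal sketch of an in-memory cache with a time-to-live. `TtlCache` and `getOrLoad` are hypothetical names, and the right TTL depends on how often your data actually changes.

```javascript
// Hypothetical helper: a small in-memory cache with a time-to-live.
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;            // injectable clock, useful for testing
    this.entries = new Map();  // key -> { value, expiresAt }
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that expires ttlMs milliseconds from now.
  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Returns the cached value for key, or awaits loader() and caches its result.
async function getOrLoad(cache, key, loader) {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await loader();
  cache.set(key, value);
  return value;
}
```

Here `loader` would be whatever function your integration already uses to fetch the data, for example a call that retrieves an assistant configuration. Repeated lookups within the TTL are then served from memory and do not count against any rate limit.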

Requesting Higher Limits

If your use case requires higher rate limits, contact the IllumiChat support team with:
  • Your workspace ID
  • The endpoints you need higher limits for
  • Your expected request volume
  • A description of your integration
Before requesting higher limits, review your integration for opportunities to reduce request volume through caching, batching, and efficient polling intervals.