\begin{article}
LaTeX API Rate Limiting and Retry Logic
Handle rate limits and transient failures in LaTeX API calls — exponential backoff, jitter, request queuing, and plan-aware throttling.

Production systems that compile LaTeX documents at scale will inevitably hit rate limits. Whether you're batch-processing thousands of academic papers, generating invoices on demand, or running a multi-tenant SaaS that calls the FormatEx compile endpoint, unhandled HTTP 429 responses mean lost compilations, cascading failures, and frustrated users. This guide covers every layer of the problem: reading the right response headers, implementing exponential backoff with jitter, building a client-side request queue, and choosing a plan that matches your actual throughput requirements.
Understanding HTTP 429 and the Retry-After Header
When you exceed your plan's rate limit, the FormatEx API returns a 429 Too Many Requests response. The response includes a Retry-After header that tells you the exact number of seconds to wait before retrying. Ignoring this header and immediately retrying is one of the most common bugs in API integrations — it hammers the server, prolongs your suspension window, and in some APIs can trigger permanent bans. For a complete reference of all error codes you may encounter, see the LaTeX API error codes guide.
A typical 429 response looks like this:
HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: application/json

{"error": "rate limit exceeded, retry after 30 seconds"}

Always read Retry-After as the floor: never retry sooner. If the header is absent (which can happen with proxy-level throttling), fall back to a sensible default like 60 seconds.
There are two distinct situations that produce 429s in the FormatEx API:
- Per-minute burst limits: sending too many requests in a short window. Wait the Retry-After duration and retry the same request.
- Monthly compilation quota exhausted: your plan's monthly cap is depleted. A short wait won't help; you need to upgrade your plan or wait for the monthly reset.
Distinguish between these by reading the error body, not just the status code. The monthly quota error message will say so explicitly.
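As a rough sketch of these two rules, the Python helpers below read Retry-After with the 60-second fallback and make a first guess at which kind of 429 was returned. The names parse_retry_after and is_monthly_quota_error are illustrative, and the keyword match on the error message is an assumption: this guide says the quota message is explicit but does not quote it, so adjust the check to the text you actually receive. The HTTP-date branch is included because the HTTP specification also allows Retry-After to carry a date instead of a number of seconds.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DEFAULT_RETRY_AFTER = 60.0  # fallback when the header is missing or unparseable


def parse_retry_after(value, now=None):
    """Return the wait in seconds requested by a Retry-After header value."""
    if value is None or not value.strip():
        return DEFAULT_RETRY_AFTER
    value = value.strip()
    if value.isdigit():
        # delay-seconds form, e.g. "30"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return DEFAULT_RETRY_AFTER
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def is_monthly_quota_error(body):
    """Guess whether a parsed 429 body reports an exhausted monthly quota.

    ASSUMPTION: the message mentions "monthly" or "quota". The real wording
    is not documented here, so treat this as a placeholder check.
    """
    message = str(body.get("error", "")).lower()
    return "monthly" in message or "quota" in message
```

A caller would sleep for parse_retry_after(...) seconds on a burst-limit 429 and stop retrying as soon as is_monthly_quota_error(...) returns True.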
Exponential Backoff with Jitter
Retrying after a fixed delay is better than not retrying, but it still causes thundering-herd problems when many clients hit the same limit simultaneously and all wake up at the same second. Exponential backoff increases the wait time after each consecutive failure. Adding random jitter spreads the retry attempts across time, preventing synchronized retry storms.
The formula for the wait duration before attempt n (zero-indexed) is:
wait = min(base * 2^n, cap) + random(0, jitter_factor * base * 2^n)

Here's a working Python implementation for the FormatEx compile endpoint:
import random
import time

import httpx


def compile_latex(latex_source: str, api_key: str, engine: str = "pdflatex") -> bytes:
    url = "https://api.formatex.io/api/v1/compile"
    headers = {"X-API-Key": api_key, "Content-Type": "application/json"}
    payload = {"latex": latex_source, "engine": engine}

    base_delay = 1.0     # seconds
    max_delay = 64.0     # cap on the exponential term (~1 minute)
    max_attempts = 6
    jitter_factor = 0.5  # up to 50% random jitter

    for attempt in range(max_attempts):
        is_last = attempt == max_attempts - 1
        exponential = base_delay * (2 ** attempt)
        try:
            response = httpx.post(url, json=payload, headers=headers, timeout=120.0)

            if response.status_code == 200:
                return response.content

            if response.status_code == 429:
                if is_last:
                    break  # no point sleeping when no attempt follows
                # Retry-After is a floor: the cap applies only to our own backoff
                retry_after = int(response.headers.get("Retry-After", 0))
                delay = max(retry_after, min(exponential, max_delay))
                delay += random.uniform(0, jitter_factor * exponential)
                print(f"Rate limited. Waiting {delay:.1f}s before attempt {attempt + 2}")
                time.sleep(delay)
                continue

            if response.status_code in (500, 502, 503, 504):
                if is_last:
                    break
                # Transient server errors are also worth retrying
                delay = min(exponential, max_delay)
                delay += random.uniform(0, jitter_factor * delay)
                time.sleep(delay)
                continue

            # Non-retryable errors (400, 401, 403, 422)
            response.raise_for_status()

        except httpx.TimeoutException:
            if is_last:
                raise
            time.sleep(min(exponential, max_delay))

    raise RuntimeError(f"Failed after {max_attempts} attempts")

The key decisions in this implementation:
- Retry-After takes precedence over the exponential delay: we use max(retry_after, exponential) to respect the server's explicit guidance.
- 5xx errors are retried because they indicate transient server-side failures, not client mistakes.
- 4xx errors other than 429 are not retried: a 400 Bad Request means your LaTeX is malformed and retrying won't help.
- Timeouts are retried with backoff because LaTeX compilation is CPU-intensive and timeouts can be transient under load. For more on timeout configuration by engine, see LaTeX compilation timeouts in CI/CD.
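To see what the wait formula above yields in practice, the short sketch below computes the shortest and longest possible wait for each attempt, using the same values as the Python implementation (base 1 s, cap 64 s, jitter factor 0.5, six attempts). The helper names backoff_bounds and total_wait_bounds are illustrative, and the sketch ignores Retry-After, which can only lengthen the waits.

```python
def backoff_bounds(attempt, base=1.0, cap=64.0, jitter_factor=0.5):
    """Return (shortest, longest) wait in seconds after a failed attempt.

    Implements wait = min(base * 2^n, cap) + random(0, jitter_factor * base * 2^n)
    by evaluating the random term at its two extremes.
    """
    exponential = base * (2 ** attempt)
    floor = min(exponential, cap)
    return floor, floor + jitter_factor * exponential


def total_wait_bounds(max_attempts=6, **kwargs):
    """Sum the bounds over the waits that separate max_attempts attempts."""
    waits = [backoff_bounds(n, **kwargs) for n in range(max_attempts - 1)]
    return sum(low for low, _ in waits), sum(high for _, high in waits)


if __name__ == "__main__":
    for n in range(5):  # five waits separate six attempts
        low, high = backoff_bounds(n)
        print(f"after attempt {n + 1}: wait {low:.1f}s to {high:.1f}s")
```

With these defaults the five waits add up to between 31 and 46.5 seconds, so the 64-second cap never binds. Note that the formula caps only the exponential term: for longer schedules the jitter term keeps growing, which may or may not be what you want.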
Node.js and Go Implementations
If your stack is Node.js, here's an equivalent in TypeScript using the native fetch API (built into Node.js 18 and later) with async/await:
async function compileLaTeX(
  source: string,
  apiKey: string,
  engine: "pdflatex" | "xelatex" | "lualatex" | "latexmk" = "pdflatex"
): Promise<ArrayBuffer> {
  const url = "https://api.formatex.io/api/v1/compile";
  const maxAttempts = 6;
  const baseDelay = 1000; // ms
  const maxDelay = 64000; // ms, cap on the exponential term

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Unlike the Python version, a network error or timeout thrown by
    // fetch is not retried here; it propagates to the caller.
    const response = await fetch(url, {
      method: "POST",
      headers: {
        "X-API-Key": apiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ latex: source, engine }),
      signal: AbortSignal.timeout(120_000),
    });

    if (response.ok) {
      return response.arrayBuffer();
    }

    const isRetryable =
      response.status === 429 ||
      response.status === 500 ||
      response.status === 502 ||
      response.status === 503 ||
      response.status === 504;

    if (!isRetryable || attempt === maxAttempts - 1) {
      const body = await response.json().catch(() => ({}));
      throw new Error(`Compile failed [${response.status}]: ${body.error ?? "unknown"}`);
    }

    // Retry-After is a floor; the cap applies only to our own backoff
    const retryAfterSecs = parseInt(response.headers.get("Retry-After") ?? "", 10);
    const retryAfterMs = Number.isFinite(retryAfterSecs) ? retryAfterSecs * 1000 : 0;
    const exponential = baseDelay * Math.pow(2, attempt);
    const jitter = Math.random() * exponential * 0.5;
    const delay = Math.max(retryAfterMs, Math.min(exponential, maxDelay)) + jitter;
    await new Promise((resolve) => setTimeout(resolve, delay));
  }

  throw new Error("Max retry attempts exceeded");
}

For a full production-ready TypeScript integration including Next.js patterns, see LaTeX PDF generation in Node.js and TypeScript. For Go, which is what the FormatEx backend itself is written in, here is the same pattern with net/http:
package latex

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"math"
	"math/rand"
	"net/http"
	"strconv"
	"time"
)

type CompileRequest struct {
	LaTeX  string `json:"latex"`
	Engine string `json:"engine"`
}

func CompileWithRetry(source, apiKey, engine string) ([]byte, error) {
	const (
		maxAttempts = 6
		baseDelay   = time.Second
		maxDelay    = 64 * time.Second
	)

	client := &http.Client{Timeout: 120 * time.Second}
	body, err := json.Marshal(CompileRequest{LaTeX: source, Engine: engine})
	if err != nil {
		return nil, fmt.Errorf("encode request: %w", err)
	}

	for attempt := 0; attempt < maxAttempts; attempt++ {
		// Build a fresh request each time so the body reader starts at zero
		req, err := http.NewRequest(http.MethodPost,
			"https://api.formatex.io/api/v1/compile",
			bytes.NewReader(body))
		if err != nil {
			return nil, fmt.Errorf("build request: %w", err)
		}
		req.Header.Set("X-API-Key", apiKey)
		req.Header.Set("Content-Type", "application/json")

		resp, err := client.Do(req)
		if err != nil {
			// Network errors and client timeouts are retried with backoff
			if attempt == maxAttempts-1 {
				return nil, fmt.Errorf("request failed: %w", err)
			}
			time.Sleep(jitteredDelay(attempt, baseDelay, maxDelay))
			continue
		}

		if resp.StatusCode == http.StatusOK {
			defer resp.Body.Close()
			return io.ReadAll(resp.Body)
		}
		resp.Body.Close()

		retryable := resp.StatusCode == http.StatusTooManyRequests ||
			resp.StatusCode >= 500
		if !retryable || attempt == maxAttempts-1 {
			return nil, fmt.Errorf("compile failed with status %d", resp.StatusCode)
		}

		// Retry-After acts as a floor on the computed backoff
		delay := jitteredDelay(attempt, baseDelay, maxDelay)
		if ra := resp.Header.Get("Retry-After"); ra != "" {
			if secs, err := strconv.Atoi(ra); err == nil {
				serverDelay := time.Duration(secs) * time.Second
				if serverDelay > delay {
					delay = serverDelay
				}
			}
		}
		time.Sleep(delay)
	}

	return nil, fmt.Errorf("exceeded max attempts")
}

// jitteredDelay returns base * 2^attempt plus up to 50% jitter, capped at limit.
func jitteredDelay(attempt int, base, limit time.Duration) time.Duration {
	exp := float64(base) * math.Pow(2, float64(attempt))
	jitter := rand.Float64() * exp * 0.5
	d := time.Duration(exp + jitter)
	if d > limit {
		d = limit
	}
	return d
}

For a dedicated Go client guide with additional patterns, see calling the LaTeX API from Go.
Client-Side Request Queuing
When you're making many compilation requests concurrently — say, bulk PDF generation of 500 documents — throwing all requests at the API simultaneously is a recipe for immediate rate limiting. A client-side queue with controlled concurrency is far more effective than blanket retries.
The idea: maintain a pool of N concurrent workers, each pulling tasks from a queue, with the retry logic from above applied per-task. This prevents burst traffic from triggering rate limits in the first place.
Here's a TypeScript implementation using a simple async queue:
class CompileQueue {
  private queue: Array<() => Promise<void>> = [];
  private running = 0;

  constructor(private concurrency: number) {}

  async add(task: () => Promise<void>): Promise<void> {
    return new Promise((resolve, reject) => {
      this.queue.push(async () => {
        try {
          await task();
          resolve();
        } catch (err) {
          reject(err);
        } finally {
          this.running--;
          this.drain();
        }
      });
      this.drain();
    });
  }

  private drain() {
    while (this.running < this.concurrency && this.queue.length > 0) {
      const task = this.queue.shift()!;
      this.running++;
      task();
    }
  }
}

// Usage: 3 concurrent compilations at most
const queue = new CompileQueue(3);
const documents = [/* array of LaTeX source strings */];

const results = await Promise.allSettled(
  documents.map((source) =>
    queue.add(() =>
      compileLaTeX(source, process.env.FORMATEX_API_KEY!, "pdflatex")
        .then((pdf) => savePDF(pdf))
    )
  )
);

Keep concurrency at 3 to 5 for most plans. For the Scale plan with its higher monthly cap, you can push to 10 to 15 concurrent requests without triggering burst limits.
Choosing the Right Plan to Avoid Rate Limits
The most reliable way to avoid rate limits is to be on the plan that fits your actual usage. Retries and queuing are safety nets — not substitutes for adequate capacity.
| Plan | Monthly Compilations | Engines | Best For |
|---|---|---|---|
| Free | 15 | pdflatex only | Evaluation, hobby projects |
| Developer ($12/mo) | 100 | All 4 | Side projects, small apps |
| Pro ($49/mo) | 500 | All 4 | Production apps, startups |
| Scale ($199/mo) | 2,000 | All 4 | High-volume batch processing |
A few practical rules for choosing:
- Calculate your peak day, not your average. If you process 1,000 documents in the last three days of each month (deadline crunch), you need a plan that can handle 1,000 compilations, not 100.
- Account for retries. Every retry attempt counts as a compilation against your quota. If 5% of your requests fail and each failure is retried up to 3 times, your quota consumption can reach 115% of your document count in the worst case (5% × 3 retries = 15% extra).
- Use xelatex or lualatex only when necessary. These engines are significantly slower than pdflatex. If you only need Unicode support, xelatex is fine; for pure English documents, pdflatex is faster and uses less of your per-request timeout budget. See the complete guide to LaTeX engines to understand the trade-offs.
- Monitor usage in the dashboard. The FormatEx dashboard shows your monthly consumption. Set up alerts at 70% and 90% usage to give yourself time to upgrade before hitting the hard cap on the Free plan.
The hard block on the Free plan means compilations fail with 403 Forbidden (not 429) when the monthly quota is exhausted. Retrying won't help — you need to upgrade or wait for the monthly reset.
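The retry arithmetic in the rules above can be checked with a few lines of Python. The sketch below is only an estimate under stated assumptions: the monthly caps are copied from the plan table, the function names are illustrative, and the expected-value version assumes every attempt fails independently with the same probability, which real traffic may not obey.

```python
# Monthly compilation caps copied from the plan table in this guide.
PLANS = [("Free", 15), ("Developer", 100), ("Pro", 500), ("Scale", 2000)]


def worst_case_compilations(documents, failure_rate, max_retries):
    """Upper bound: every initially failing document uses all its retries."""
    return documents * (1 + failure_rate * max_retries)


def expected_compilations(documents, failure_rate, max_retries):
    """Expected count if each attempt fails independently (an assumption)."""
    attempts_per_doc = sum(failure_rate ** k for k in range(max_retries + 1))
    return documents * attempts_per_doc


def smallest_plan(compilations):
    """Return the first plan whose monthly cap covers the load, or None."""
    for name, cap in PLANS:
        if compilations <= cap:
            return name
    return None


if __name__ == "__main__":
    worst = worst_case_compilations(100, 0.05, 3)
    typical = expected_compilations(100, 0.05, 3)
    print(f"worst case: {worst:.1f}, expected: {typical:.1f}")
    print(f"plan for worst case: {smallest_plan(worst)}")
```

For 100 documents at a 5% failure rate and up to 3 retries, the worst case is about 115 compilations and the independent-failure estimate is about 105. Either way the load no longer fits a 100-compilation cap, which is the practical point of budgeting for retries.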
Testing Your Retry Logic
Don't wait for production incidents to validate your retry logic. Test it explicitly:
import unittest
from unittest.mock import MagicMock, patch

import httpx

# compile_latex is the function defined earlier in this guide; import it
# from wherever it lives in your project.


class TestRetryLogic(unittest.TestCase):
    @patch("time.sleep")  # skip real waits so the test runs instantly
    @patch("httpx.post")
    def test_retries_on_429(self, mock_post, mock_sleep):
        # First two calls return 429, third returns 200 with PDF bytes
        rate_limited = MagicMock()
        rate_limited.status_code = 429
        rate_limited.headers = {"Retry-After": "1"}

        success = MagicMock()
        success.status_code = 200
        success.content = b"%PDF-1.4..."

        mock_post.side_effect = [rate_limited, rate_limited, success]

        result = compile_latex("\\documentclass{article}...", "test-key")

        self.assertEqual(result[:4], b"%PDF")
        self.assertEqual(mock_post.call_count, 3)
        self.assertEqual(mock_sleep.call_count, 2)

    @patch("httpx.post")
    def test_does_not_retry_on_400(self, mock_post):
        bad_request = MagicMock()
        bad_request.status_code = 400
        bad_request.headers = {}
        mock_post.return_value = bad_request
        bad_request.raise_for_status.side_effect = httpx.HTTPStatusError(
            "Bad Request", request=MagicMock(), response=bad_request
        )

        with self.assertRaises(httpx.HTTPStatusError):
            compile_latex("not valid latex at all", "test-key")

        # Called exactly once: no retry on 400
        self.assertEqual(mock_post.call_count, 1)


if __name__ == "__main__":
    unittest.main()

Resilient LaTeX API integrations aren't complicated, but they require intentional design. Read the Retry-After header, back off exponentially with jitter, queue requests to avoid burst limits, and pick a plan that matches your real throughput. Each of these layers reinforces the others.
If you're ready to start compiling LaTeX documents without the infrastructure headache, sign up at formatex.io — the Free plan gives you 15 compilations per month to validate your integration before committing to a paid tier.
Related Articles
- LaTeX API Error Codes: Complete Guide — Understand every status code you may encounter, including 422 compile failures and 429 rate limits, to handle errors correctly in your retry logic
- Bulk LaTeX PDF Generation via API — How to batch thousands of documents with concurrency controls and retry patterns, directly complementing the queuing strategies in this guide
- LaTeX Compilation Timeouts in CI/CD — Diagnose and fix timeout-related failures that interact with retry logic in automated pipelines
- Async LaTeX Compilation and Webhooks — Offload long-running jobs to async queues to avoid hitting synchronous rate limits entirely
- LaTeX API Authentication Best Practices — Secure API key management and rotation to ensure your retry logic always has valid credentials
\end{article}
\related{posts}




