LaTeX API Rate Limiting and Retry Logic

Handle rate limits and transient failures in LaTeX API calls — exponential backoff, jitter, request queuing, and plan-aware throttling.

Jun 14, 2026 · 11 min read

guide api reliability

Production systems that compile LaTeX documents at scale will inevitably hit rate limits. Whether you're batch-processing thousands of academic papers, generating invoices on demand, or running a multi-tenant SaaS that calls the FormatEx compile endpoint, unhandled HTTP 429 responses mean lost compilations, cascading failures, and frustrated users. This guide covers every layer of the problem: reading the right response headers, implementing exponential backoff with jitter, building a client-side request queue, and choosing a plan that matches your actual throughput requirements.

Understanding HTTP 429 and the Retry-After Header

When you exceed your plan's rate limit, the FormatEx API returns a 429 Too Many Requests response. The response includes a Retry-After header that tells you the exact number of seconds to wait before retrying. Ignoring this header and immediately retrying is one of the most common bugs in API integrations — it hammers the server, prolongs your suspension window, and in some APIs can trigger permanent bans. For a complete reference of all error codes you may encounter, see the LaTeX API error codes guide.

A typical 429 response looks like this:

text

HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: application/json

{"error": "rate limit exceeded, retry after 30 seconds"}

Always read Retry-After as the floor — never retry sooner. If the header is absent (which can happen with proxy-level throttling), fall back to a sensible default like 60 seconds.
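A minimal sketch of that header handling, assuming the 60-second fallback suggested above. The helper name `parse_retry_after` and the `DEFAULT_RETRY_AFTER` constant are illustrative, not part of any FormatEx client. The 429 example shows the integer-seconds form; the HTTP-date branch is included only because HTTP permits it, and an intermediary might send it.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

DEFAULT_RETRY_AFTER = 60.0  # assumed fallback, matching the 60-second default suggested above


def parse_retry_after(headers: Mapping[str, str], now: Optional[datetime] = None) -> float:
    """Return the number of seconds to wait before retrying."""
    # Note: a plain dict lookup is case-sensitive; httpx response headers are not.
    raw = headers.get("Retry-After")
    if raw is None:
        return DEFAULT_RETRY_AFTER  # header absent, e.g. proxy-level throttling

    raw = raw.strip()
    if raw.isdigit():
        return float(raw)  # integer-seconds form, e.g. "30"

    # HTTP also allows a date form, e.g. "Wed, 21 Oct 2026 07:28:00 GMT"
    try:
        retry_at = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return DEFAULT_RETRY_AFTER  # unparseable value
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())
```

Treat the returned value as a minimum: any backoff or jitter you add should only lengthen the wait.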

There are two distinct situations that produce 429s in the FormatEx API:

Per-minute burst limits — sending too many requests in a short window. Wait the Retry-After duration and retry the same request.
Monthly compilation quota exhausted — your plan's monthly cap is depleted. A short wait won't help; you need to upgrade your plan or wait for the monthly reset.

Distinguish between these by reading the error body, not just the status code. The monthly quota error message will say so explicitly.
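One way to make that distinction in code is sketched below. The guide does not give the exact wording of the quota message, so the marker strings here ("quota", "monthly") are an assumption; check them against a real response before relying on them. The function name `is_quota_exhausted` is likewise illustrative.

```python
import json

# Assumption: the monthly-quota message mentions "quota" or "monthly".
# The exact wording is not documented in this guide; verify against real responses.
QUOTA_MARKERS = ("quota", "monthly")


def is_quota_exhausted(body_text: str) -> bool:
    """Return True if a 429 body looks like quota exhaustion rather than a burst limit."""
    try:
        message = str(json.loads(body_text).get("error", ""))
    except (ValueError, AttributeError):
        message = body_text  # not JSON, or not a JSON object: fall back to the raw text
    message = message.lower()
    return any(marker in message for marker in QUOTA_MARKERS)
```

A retry loop can call this on each 429 and stop immediately when it returns True, rather than spending its remaining attempts on a request that cannot succeed until the quota resets.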

Exponential Backoff with Jitter

Retrying after a fixed delay is better than not retrying, but it still causes thundering-herd problems when many clients hit the same limit simultaneously and all wake up at the same second. Exponential backoff increases the wait time after each consecutive failure. Adding random jitter spreads the retry attempts across time, preventing synchronized retry storms.

The wait duration after failed attempt n (zero-indexed) is:

text

wait = min(base * 2^n, cap) + random(0, jitter_factor * base * 2^n)
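As a quick check of the formula, here is a direct transcription into Python. The function name `backoff_delay` and its default values (base 1 second, cap 64 seconds, jitter factor 0.5) are illustrative choices, not values mandated by the API.

```python
import random


def backoff_delay(n, base=1.0, cap=64.0, jitter_factor=0.5, rng=None):
    """wait = min(base * 2^n, cap) + random(0, jitter_factor * base * 2^n)"""
    rng = rng or random
    exponential = base * (2 ** n)
    return min(exponential, cap) + rng.uniform(0, jitter_factor * exponential)


# Deterministic part of the schedule (jitter switched off) for attempts 0..7
schedule = [backoff_delay(n, jitter_factor=0.0) for n in range(8)]
# schedule == [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 64.0]
```

Note that, as the formula is written, only the exponential term is capped. The jitter term keeps growing with n, so for long retry sequences you may want to cap it as well.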

Here's a working Python implementation for the FormatEx compile endpoint:

python

import time
import random
import httpx

def compile_latex(latex_source: str, api_key: str, engine: str = "pdflatex") -> bytes:
    url = "https://api.formatex.io/api/v1/compile"
    headers = {"X-API-Key": api_key, "Content-Type": "application/json"}
    payload = {"latex": latex_source, "engine": engine}

    base_delay = 1.0       # seconds
    max_delay = 64.0       # cap at ~1 minute
    max_attempts = 6
    jitter_factor = 0.5    # up to 50% random jitter

    for attempt in range(max_attempts):
        try:
            response = httpx.post(url, json=payload, headers=headers, timeout=120.0)

            if response.status_code == 200:
                return response.content

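            if response.status_code == 429:
                if attempt == max_attempts - 1:
                    break  # out of attempts; fall through to the RuntimeError below
                # Retry-After is a floor; fall back to 60s if it is absent or not integer seconds
                try:
                    retry_after = int(response.headers.get("Retry-After", 60))
                except ValueError:
                    retry_after = 60
                exponential = base_delay * (2 ** attempt)
                # Cap only the exponential term so the server's value is never shortened
                delay = max(retry_after, min(exponential, max_delay))
                delay += random.uniform(0, jitter_factor * exponential)
                print(f"Rate limited. Waiting {delay:.1f}s before attempt {attempt + 2}")
                time.sleep(delay)
                continue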

            if response.status_code in (500, 502, 503, 504):
                # Transient server errors — also worth retrying
                delay = min(base_delay * (2 ** attempt), max_delay)
                delay += random.uniform(0, jitter_factor * delay)
                time.sleep(delay)
                continue

            # Non-retryable errors (400, 401, 403, 422)
            response.raise_for_status()

        except httpx.TimeoutException:
            if attempt == max_attempts - 1:
                raise
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay)

    raise RuntimeError(f"Failed after {max_attempts} attempts")

The key decisions in this implementation:

Retry-After takes precedence over the exponential delay — we use max(retry_after, exponential) to respect the server's explicit guidance.
5xx errors are retried because they indicate transient server-side failures, not client mistakes.
4xx errors other than 429 are not retried: a 400 Bad Request means the request itself is malformed, a 422 means the LaTeX failed to compile, and retrying either won't help.
Timeouts are retried with backoff because LaTeX compilation is CPU-intensive and timeouts can be transient under load. For more on timeout configuration by engine, see LaTeX compilation timeouts in CI/CD.

Node.js and Go Implementations

If your stack is Node.js, here's an equivalent using the native fetch API with async/await:

typescript

async function compileLaTeX(
  source: string,
  apiKey: string,
  engine: "pdflatex" | "xelatex" | "lualatex" | "latexmk" = "pdflatex"
): Promise<ArrayBuffer> {
  const url = "https://api.formatex.io/api/v1/compile";
  const maxAttempts = 6;
  const baseDelay = 1000; // ms
  const maxDelay = 64000; // ms

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
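    let response: Response;
    try {
      response = await fetch(url, {
        method: "POST",
        headers: {
          "X-API-Key": apiKey,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ latex: source, engine }),
        signal: AbortSignal.timeout(120_000),
      });
    } catch (err) {
      // Network errors and timeouts reject the promise; retry them with backoff too
      if (attempt === maxAttempts - 1) throw err;
      const wait = Math.min(baseDelay * 2 ** attempt, maxDelay);
      await new Promise((resolve) => setTimeout(resolve, wait));
      continue;
    }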

    if (response.ok) {
      return response.arrayBuffer();
    }

    const isRetryable =
      response.status === 429 ||
      response.status === 500 ||
      response.status === 502 ||
      response.status === 503 ||
      response.status === 504;

    if (!isRetryable || attempt === maxAttempts - 1) {
      const body = await response.json().catch(() => ({}));
      throw new Error(`Compile failed [${response.status}]: ${body.error ?? "unknown"}`);
    }

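    const retryAfterHeader = response.headers.get("Retry-After");
    const parsed = retryAfterHeader ? Number.parseInt(retryAfterHeader, 10) : NaN;
    // Absent or non-numeric header: wait 60s on a 429, otherwise rely on backoff alone
    const fallbackMs = response.status === 429 ? 60_000 : 0;
    const retryAfterMs = Number.isNaN(parsed) ? fallbackMs : parsed * 1000;
    const exponential = baseDelay * Math.pow(2, attempt);
    const jitter = Math.random() * exponential * 0.5;
    // Cap only the exponential term; Retry-After is a floor and is never shortened
    const delay = Math.max(retryAfterMs, Math.min(exponential, maxDelay)) + jitter;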

    await new Promise((resolve) => setTimeout(resolve, delay));
  }

  throw new Error("Max retry attempts exceeded");
}

For a full production-ready TypeScript integration including Next.js patterns, see LaTeX PDF generation in Node.js and TypeScript. For Go, the language the FormatEx backend itself is written in, the same pattern works with net/http:

go

package latex

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "math"
    "math/rand"
    "net/http"
    "strconv"
    "time"
)

type CompileRequest struct {
    LaTeX  string `json:"latex"`
    Engine string `json:"engine"`
}

func CompileWithRetry(source, apiKey, engine string) ([]byte, error) {
    const (
        maxAttempts = 6
        baseDelay   = time.Second
        maxDelay    = 64 * time.Second
    )

    client := &http.Client{Timeout: 120 * time.Second}
    body, _ := json.Marshal(CompileRequest{LaTeX: source, Engine: engine})

    for attempt := 0; attempt < maxAttempts; attempt++ {
        req, _ := http.NewRequest(http.MethodPost,
            "https://api.formatex.io/api/v1/compile",
            bytes.NewReader(body))
        req.Header.Set("X-API-Key", apiKey)
        req.Header.Set("Content-Type", "application/json")

        resp, err := client.Do(req)
        if err != nil {
            if attempt == maxAttempts-1 {
                return nil, fmt.Errorf("request failed: %w", err)
            }
            time.Sleep(jitteredDelay(attempt, baseDelay, maxDelay))
            continue
        }

        if resp.StatusCode == http.StatusOK {
            defer resp.Body.Close()
            return io.ReadAll(resp.Body)
        }

        resp.Body.Close()

        retryable := resp.StatusCode == 429 ||
            resp.StatusCode >= 500

        if !retryable || attempt == maxAttempts-1 {
            return nil, fmt.Errorf("compile failed with status %d", resp.StatusCode)
        }

        delay := jitteredDelay(attempt, baseDelay, maxDelay)
        if ra := resp.Header.Get("Retry-After"); ra != "" {
            if secs, err := strconv.Atoi(ra); err == nil {
                serverDelay := time.Duration(secs) * time.Second
                if serverDelay > delay {
                    delay = serverDelay
                }
            }
        }
        time.Sleep(delay)
    }

    return nil, fmt.Errorf("exceeded max attempts")
}

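// jitteredDelay returns base * 2^attempt plus up to 50% jitter, capped at maxDelay.
// The parameter is named maxDelay so it does not shadow Go's built-in cap.
func jitteredDelay(attempt int, base, maxDelay time.Duration) time.Duration {
    exp := float64(base) * math.Pow(2, float64(attempt))
    jitter := rand.Float64() * exp * 0.5
    d := time.Duration(exp + jitter)
    if d > maxDelay {
        d = maxDelay
    }
    return d
}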

For a dedicated Go client guide with additional patterns, see calling the LaTeX API from Go.

Client-Side Request Queuing

When you're making many compilation requests concurrently — say, bulk PDF generation of 500 documents — throwing all requests at the API simultaneously is a recipe for immediate rate limiting. A client-side queue with controlled concurrency is far more effective than blanket retries.

The idea: maintain a pool of N concurrent workers, each pulling tasks from a queue, with the retry logic from above applied per-task. This prevents burst traffic from triggering rate limits in the first place.

Here's a TypeScript implementation using a simple async queue:

typescript

class CompileQueue {
  private queue: Array<() => Promise<void>> = [];
  private running = 0;

  constructor(private concurrency: number) {}

  async add(task: () => Promise<void>): Promise<void> {
    return new Promise((resolve, reject) => {
      this.queue.push(async () => {
        try {
          await task();
          resolve();
        } catch (err) {
          reject(err);
        } finally {
          this.running--;
          this.drain();
        }
      });
      this.drain();
    });
  }

  private drain() {
    while (this.running < this.concurrency && this.queue.length > 0) {
      const task = this.queue.shift()!;
      this.running++;
      task();
    }
  }
}

// Usage: 3 concurrent compilations at most
const queue = new CompileQueue(3);

const documents = [/* array of LaTeX source strings */];
const results = await Promise.allSettled(
  documents.map((source) =>
    queue.add(() =>
      compileLaTeX(source, process.env.FORMATEX_API_KEY!, "pdflatex")
        .then((pdf) => savePDF(pdf))
    )
  )
);

Keep concurrency at 3 to 5 for most plans. For the Scale plan with its higher monthly cap, you can push to 10 to 15 concurrent requests without triggering burst limits.
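For Python clients, the same bounded-concurrency idea can be sketched with asyncio.Semaphore. This is an illustration under assumptions: `run_bounded` is a hypothetical helper, and `fake_compile` stands in for an async compile call with retry logic (the synchronous compile_latex shown earlier would need an async client or a thread pool to be used this way).

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_bounded(tasks: Sequence[Callable[[], Awaitable[object]]],
                      concurrency: int = 3) -> List[object]:
    """Run task factories with at most `concurrency` in flight; results keep input order."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(task):
        async with semaphore:
            return await task()

    # return_exceptions=True plays the role of Promise.allSettled:
    # one failed task does not cancel the others.
    return await asyncio.gather(*(guarded(t) for t in tasks), return_exceptions=True)


async def demo():
    in_flight = 0
    peak = 0

    async def fake_compile(i: int) -> int:
        # Hypothetical stand-in for an async compile call with retries
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return i

    results = await run_bounded([lambda i=i: fake_compile(i) for i in range(10)], concurrency=3)
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(f"{len(results)} tasks finished, peak concurrency {peak}")
```

The semaphore caps how many compilations are in flight at once, while gather preserves input order, so results can be matched back to their source documents by index.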

Choosing the Right Plan to Avoid Rate Limits

The most reliable way to avoid rate limits is to be on the plan that fits your actual usage. Retries and queuing are safety nets — not substitutes for adequate capacity.

Plan	Monthly Compilations	Engines	Best For
Free	15	pdflatex only	Evaluation, hobby projects
Developer ($12/mo)	100	All 4	Side projects, small apps
Pro ($49/mo)	500	All 4	Production apps, startups
Scale ($199/mo)	2,000	All 4	High-volume batch processing

A few practical rules for choosing:

Calculate your peak, not your average. If you process 1,000 documents in the last three days of each month (deadline crunch), you need a plan that can handle 1,000 compilations, not 100, plus a client-side queue to keep that burst under the per-minute limit.
Account for retries. Every retry attempt counts as a compilation against your quota. If 5% of your requests fail and each of those uses all 3 retries, your quota consumption can reach roughly 115% of your successful compilation count (1 + 0.05 × 3) in the worst case.
Use xelatex or lualatex only when necessary. These engines are significantly slower than pdflatex. If you only need Unicode support, xelatex is fine; for pure English documents, pdflatex is faster and uses less of your per-request timeout budget. See the complete guide to LaTeX engines to understand the trade-offs.
Monitor usage in the dashboard. The FormatEx dashboard shows your monthly consumption. Set up alerts at 70% and 90% usage to give yourself time to upgrade before hitting the hard cap on the Free plan.

The Free plan enforces its cap as a hard block: once the monthly quota is exhausted, compilations fail with 403 Forbidden rather than 429. Retrying won't help; you need to upgrade or wait for the monthly reset.
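The retry overhead from the rules above is easy to estimate up front. The sketch below assumes, as stated above, that every attempt counts against the quota. It compares two simplified models: independent failures on each attempt, and a worst case in which every failing request uses all of its retries. The function name `quota_multiplier` and the 450-document workload are illustrative.

```python
import math


def quota_multiplier(error_rate: float, max_retries: int) -> tuple:
    """Return (expected, worst_case) attempts consumed per document.

    expected:   each attempt fails independently with probability error_rate.
    worst_case: every request that fails once goes on to use all max_retries retries.
    """
    expected = sum(error_rate ** k for k in range(max_retries + 1))
    worst_case = 1 + error_rate * max_retries
    return expected, worst_case


expected, worst = quota_multiplier(0.05, 3)
# expected is about 1.053; worst is about 1.15 (the 115% case)

# Illustrative workload: 450 documents per month, worst-case retry overhead
needed = math.ceil(450 * worst)  # 518 attempts, which exceeds a 500-compilation plan
```

Under the independent-failure model the overhead is closer to 5%, so the 115% figure is best read as an upper bound to plan against, not a typical value.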

Testing Your Retry Logic

Don't wait for production incidents to validate your retry logic. Test it explicitly:

python

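import unittest
from unittest.mock import patch, MagicMock
import httpx

# Assumes compile_latex from the example above lives in a module named latex_client (hypothetical)
from latex_client import compile_latex

class TestRetryLogic(unittest.TestCase):
    @patch("time.sleep")  # skip real waits so the test runs instantly
    @patch("httpx.post")
    def test_retries_on_429(self, mock_post, mock_sleep):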
        # First two calls return 429, third returns 200 with PDF bytes
        rate_limited = MagicMock()
        rate_limited.status_code = 429
        rate_limited.headers = {"Retry-After": "1"}

        success = MagicMock()
        success.status_code = 200
        success.content = b"%PDF-1.4..."

        mock_post.side_effect = [rate_limited, rate_limited, success]

        result = compile_latex("\\documentclass{article}...", "test-key")
        self.assertEqual(result[:4], b"%PDF")
        self.assertEqual(mock_post.call_count, 3)

    @patch("httpx.post")
    def test_does_not_retry_on_400(self, mock_post):
        bad_request = MagicMock()
        bad_request.status_code = 400
        bad_request.headers = {}

        mock_post.return_value = bad_request
        bad_request.raise_for_status.side_effect = httpx.HTTPStatusError(
            "Bad Request", request=MagicMock(), response=bad_request
        )

        with self.assertRaises(httpx.HTTPStatusError):
            compile_latex("not valid latex at all", "test-key")

        # Should only have called once — no retry on 400
        self.assertEqual(mock_post.call_count, 1)

Resilient LaTeX API integrations aren't complicated, but they require intentional design. Read the Retry-After header, back off exponentially with jitter, queue requests to avoid burst limits, and pick a plan that matches your real throughput. Each of these layers adds compounding reliability.

If you're ready to start compiling LaTeX documents without the infrastructure headache, sign up at formatex.io — the Free plan gives you 15 compilations per month to validate your integration before committing to a paid tier.

LaTeX API Error Codes: Complete Guide — Understand every status code you may encounter, including 422 compile failures and 429 rate limits, to handle errors correctly in your retry logic
Bulk LaTeX PDF Generation via API — How to batch thousands of documents with concurrency controls and retry patterns, directly complementing the queuing strategies in this guide
LaTeX Compilation Timeouts in CI/CD — Diagnose and fix timeout-related failures that interact with retry logic in automated pipelines
Async LaTeX Compilation and Webhooks — Offload long-running jobs to async queues to avoid hitting synchronous rate limits entirely
LaTeX API Authentication Best Practices — Secure API key management and rotation to ensure your retry logic always has valid credentials


