
API Performance: Pooling and Parallelism

Apr 21, 2026 · 3 min read · Architecture, REST APIs, Playwright, Lighthouse, Monorepo, Nx, Security

Overview

Project: Performance and reliability pass across audit-api and job-api, two FastAPI services running on Railway behind a Next.js frontend

Role: Solo Developer

Duration: April 2026

Purpose: Eliminate per-request resource allocation, parallelize independent I/O operations, and harden authentication across both API services in the Nx monorepo

The problems

Both services had the same anti-pattern: creating expensive resources per request and running independent operations sequentially.

audit-api launched a fresh Chromium instance for every scan, waited for networkidle (which hangs on sites with persistent connections), and ran Lighthouse and axe-core sequentially even though they share no state. Some sites blocked the headless browser with bot detection, producing cryptic errors like ERR_HTTP2_PROTOCOL_ERROR.

job-api created a new httpx.AsyncClient per ATS fetch (tearing down and rebuilding TCP connections for every request), created a new Supabase client per dependency injection call, and ran four database operations sequentially per source during polling.

Security review also surfaced two gaps: JWT decode paths did not require exp or sub claims, and the ALLOWED_HOSTS configuration could silently default to permissive behavior.

Audit-api: browser pool and parallel scanning

I replaced per-scan browser launches with a persistent BrowserPool that keeps one Chromium process alive across scans. Each scan gets a fresh browser context for cookie and state isolation, then closes it. An idle timer shuts down the browser after 30 minutes of inactivity. On app shutdown, the FastAPI lifespan handler calls pool.shutdown().
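A minimal sketch of the pattern, assuming a structure like the following (class, method, and attribute names here are illustrative, not the actual audit-api code):

```python
import asyncio


class BrowserPool:
    """Keeps one Chromium process warm across scans; contexts are
    created and discarded per scan for state isolation. Sketch only."""

    IDLE_TIMEOUT = 30 * 60  # seconds of inactivity before shutdown

    def __init__(self):
        self._playwright = None
        self._browser = None
        self._idle_task = None

    async def _get_browser(self):
        # Lazily start Playwright and launch one Chromium process,
        # reused across scans instead of relaunched per request.
        if self._browser is None or not self._browser.is_connected():
            from playwright.async_api import async_playwright
            self._playwright = await async_playwright().start()
            self._browser = await self._playwright.chromium.launch(headless=True)
        return self._browser

    async def scan(self, url: str) -> str:
        browser = await self._get_browser()
        # A fresh context per scan isolates cookies and storage, then
        # is discarded; the browser process itself stays warm.
        context = await browser.new_context()
        try:
            page = await context.new_page()
            await page.goto(url, wait_until="load")
            return await page.title()
        finally:
            await context.close()
            self._reset_idle_timer()

    def _reset_idle_timer(self):
        if self._idle_task is not None:
            self._idle_task.cancel()
        self._idle_task = asyncio.get_running_loop().create_task(self._idle_shutdown())

    async def _idle_shutdown(self):
        await asyncio.sleep(self.IDLE_TIMEOUT)
        await self.shutdown()

    async def shutdown(self):
        # Called from the FastAPI lifespan handler on app shutdown.
        if self._idle_task is not None:
            self._idle_task.cancel()
            self._idle_task = None
        if self._browser is not None:
            await self._browser.close()
            self._browser = None
        if self._playwright is not None:
            await self._playwright.stop()
            self._playwright = None
```

The key design choice is the two-level lifetime: the browser process lives at pool scope, while each scan's state lives only as long as its context.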

Navigation changed from wait_until="networkidle" to wait_until="load" with a 10-second best-effort networkidle follow-up that never blocks. A realistic Chrome User-Agent header reduced bot detection triggers. When sites still blocked the scanner, a pattern-matching function mapped raw Playwright errors to user-friendly messages.
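The navigation strategy and error mapping can be sketched as follows; the patterns and messages here are hypothetical examples, not the real mapping table:

```python
import re

# Hypothetical mapping from raw Playwright/Chromium error strings to
# user-facing messages; patterns and wording are illustrative.
ERROR_PATTERNS: list[tuple[str, str]] = [
    (r"ERR_HTTP2_PROTOCOL_ERROR", "The site appears to block automated scanners."),
    (r"ERR_NAME_NOT_RESOLVED", "The domain could not be resolved."),
    (r"Timeout \d+ms exceeded", "The page took too long to load."),
]


def friendly_error(raw: str) -> str:
    """Map a raw browser error to something a user can act on."""
    for pattern, message in ERROR_PATTERNS:
        if re.search(pattern, raw):
            return message
    return "The scan failed for an unknown reason."


async def navigate(page, url: str) -> None:
    # Wait for "load", which always fires, then give network quiescence
    # a best-effort window that never fails the scan.
    await page.goto(url, wait_until="load")
    try:
        await page.wait_for_load_state("networkidle", timeout=10_000)
    except Exception:
        pass  # persistent connections keep the network busy; proceed anyway
```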

Lighthouse and axe-core now run in parallel via asyncio.gather with return_exceptions=True. Lighthouse runs as a subprocess; axe runs inside the Playwright page. They share nothing. The ScanQueue itself remains serialized because Lighthouse uses global performance marks that collide in concurrent runs.

lighthouse_task = asyncio.create_task(_run_lighthouse(url, device))
axe_task = asyncio.create_task(_run_axe_and_capture(url, device))
 
lighthouse_result, axe_results = await asyncio.gather(
    lighthouse_task, axe_task, return_exceptions=True
)
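Because return_exceptions=True converts failures into returned values rather than raised exceptions, the caller has to type-check each result before using it. A self-contained illustration (function names are invented for the example):

```python
import asyncio


async def ok() -> str:
    return "lighthouse-report"


async def boom() -> str:
    raise ValueError("axe crashed")


async def main() -> None:
    results = await asyncio.gather(ok(), boom(), return_exceptions=True)
    # Failures arrive as exception *values*, not raised exceptions,
    # so each result must be checked before use.
    report, axe = results
    assert report == "lighthouse-report"
    assert isinstance(axe, ValueError)


asyncio.run(main())
```

This is what lets one analyzer fail without tearing down the other: a Lighthouse subprocess crash surfaces as a value the scan handler can report alongside successful axe results.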

Job-api: connection reuse and DB parallelism

A shared httpx.AsyncClient singleton replaced per-request client creation. The pool limits are 20 max connections and 10 keepalive, with a 15-second timeout and automatic redirect following. All six ATS fetchers (Greenhouse, Lever, Ashby, Workday, SmartRecruiters, JSON-LD) share this pool. The client closes on app shutdown.

def get_http_client() -> httpx.AsyncClient:
    global _client
    if _client is None or _client.is_closed:
        _client = httpx.AsyncClient(
            timeout=15.0,
            limits=httpx.Limits(
                max_connections=20, max_keepalive_connections=10
            ),
            follow_redirects=True,
        )
    return _client

The Supabase client follows the same singleton pattern: initialized once at startup via the FastAPI lifespan, shared across all requests.

For the poller, I identified two phases of independent database operations. Phase 1 runs the upsert of new/updated jobs concurrently with fetching existing rows (to find stale jobs). Phase 2 runs the archive of stale jobs concurrently with updating the last_polled_at timestamp:

# Phase 1: upsert + existing query in parallel
upsert_resp, existing_resp = await asyncio.gather(
    asyncio.to_thread(upsert_query.execute),
    asyncio.to_thread(existing_query.execute),
)
 
# Phase 2: archive stale + update timestamp in parallel
await asyncio.gather(
    asyncio.to_thread(archive_query.execute),
    asyncio.to_thread(last_polled_query.execute),
)

The asyncio.to_thread calls are essential because supabase-py is a synchronous client. Without them, await does not yield the event loop and the operations run sequentially despite asyncio.gather.

Security hardening

JWT decode paths across all three auth strategies now require exp and sub claims:

payload = jwt.decode(
    token,
    s.admin_session_secret,
    algorithms=["HS256"],
    options={"require": ["exp", "sub"]},
)

The ALLOWED_HOSTS configuration now raises a RuntimeError at import time if unset, preventing the server from starting with permissive defaults.
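A fail-fast check along these lines (a sketch; the real config module reads the environment at import time rather than through a function):

```python
import os


def load_allowed_hosts(env=None) -> list[str]:
    """Parse ALLOWED_HOSTS, refusing to start the server if it is unset
    rather than falling back to a permissive default."""
    env = os.environ if env is None else env
    raw = env.get("ALLOWED_HOSTS", "").strip()
    if not raw:
        raise RuntimeError(
            "ALLOWED_HOSTS is not set; refusing to start with a permissive default"
        )
    return [h.strip() for h in raw.split(",") if h.strip()]
```

Raising at startup turns a silent misconfiguration into a deploy-time failure, which is the cheapest possible place to catch it.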

Results

Change | Impact
Persistent browser pool | Eliminated 2-4s cold start per scan
load + best-effort networkidle | No more hanging scans on persistent-connection sites
Parallel Lighthouse + axe | Two independent processes overlap instead of running sequentially
Shared httpx client | TCP connections reused across all ATS fetchers
Supabase singleton | One client per app lifetime, not per request
Phased asyncio.gather on DB ops | 4 serial DB round trips reduced to 2 parallel phases
JWT exp/sub requirement | Tokens without expiration or subject rejected

Takeaway

Pool expensive resources at the process level, and isolate per-request state with lightweight contexts. Parallelize I/O operations that do not share mutable state. Verify that async actually means concurrent: a synchronous library wrapped in async def still blocks the event loop unless you use asyncio.to_thread.