Performance & latency expectations
This page publishes the latency we target for the SnapSharp API across typical request shapes. It answers the question every engineer asks before integrating a third-party screenshot service: "How fast is it in practice?"
These numbers are baseline expectations, not a contractual guarantee. SnapSharp offers a formal uptime commitment only on Business and Enterprise plans — see the SLA for binding terms. Latency can spike when target sites are slow, when cold Chromium instances are recycled, or during incidents.
What you should expect
If you make a screenshot request against a reasonably-fast target site from a client in Europe or North America, we aim for:
- Cache-hit requests: sub-200ms at the p95. Effectively instant.
- Cache-miss requests on a 1920×1080 viewport: under 2 seconds at the p50, under 4 seconds at the p95.
- Full-page captures on heavy single-page apps: can reach 10+ seconds at the p95. These are the slowest shape.
If your usage profile consistently exceeds these targets by a wide margin, open a support ticket — we want to know.
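If it helps to automate that check, the published p95 targets can be encoded as client-side alert thresholds. The shape names and the 1.5× "wide margin" multiplier below are illustrative assumptions, not part of the API:

```python
# p95 targets from the bullets above, in milliseconds (illustrative encoding).
P95_TARGET_MS = {
    "cache_hit": 200,
    "cache_miss_1080p": 4000,
    "full_page_heavy": 10000,
}

def exceeds_target(shape: str, observed_p95_ms: float, margin: float = 1.5) -> bool:
    """True when your observed p95 is a wide margin above the published target —
    the point at which it's worth opening a support ticket."""
    return observed_p95_ms > P95_TARGET_MS[shape] * margin

print(exceeds_target("cache_miss_1080p", 4500))  # within 1.5x of the 4 s target -> False
print(exceeds_target("cache_miss_1080p", 9000))  # well beyond it -> True
```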
Measurement methodology
- Where we measure from: synthetic k6 load tests running from a dedicated VPS in Frankfurt (DE). This is close-to-best-case latency — your own clients may see an additional 50-150ms of network RTT depending on geography.
- What we measure: end-to-end HTTP latency from the moment the request leaves the client until the full response body is received. This includes TLS handshake, TTFB, and image body download.
- Cache-hit vs cache-miss: we split the percentiles explicitly. Warming the cache before a benchmark gives unrealistically good numbers and we do not do this in our published figures.
- Sample size: each percentile below is computed from at least 1000 requests sampled across a 1-hour window during typical weekday load.
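As a sketch, the split-percentile computation described above can be reproduced with a nearest-rank percentile over raw samples. The function and the sample latencies are illustrative, not SnapSharp's actual measurement pipeline:

```python
import math

def percentile(samples_ms, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

# Split cache-hit from cache-miss samples BEFORE computing percentiles,
# otherwise fast hits mask the true miss latency.
hits = [28, 31, 30, 45, 150, 33, 29]            # illustrative cache-hit latencies (ms)
misses = [1800, 1750, 2100, 3500, 1900, 1650]   # illustrative cache-miss latencies (ms)

print(percentile(hits, 50), percentile(hits, 95))      # 31 150
print(percentile(misses, 50), percentile(misses, 95))  # 1800 3500
```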
Latency per viewport & size
All measurements below are for `format=png`, `wait_until=load`, against a standard responsive target site. `cache=true` is the default.
| Request shape | Cache | p50 | p95 |
|---|---|---|---|
| 320×240 | hit | ~30 ms | ~150 ms |
| 1920×1080 | hit | ~50 ms | ~200 ms |
| 1920×1080 | miss (fresh capture) | ~1.8 s | ~3.5 s |
| 3840×2160 (4K) | miss | ~3.0 s | ~6.0 s |
| Full-page, viewport 1280×800, ~5000 px tall | miss | ~5.0 s | ~12.0 s |
A cold Chromium process inside our browser pool adds roughly 1-1.5 seconds to the first request that lands on it. After the pool is warm, consecutive cache-miss requests in the middle of the distribution run closer to 1.2 s for a 1920×1080 capture.
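To see why cold starts show up mostly in the tail rather than the median, here is a small simulation. The recycle frequency, warm-capture time, and noise model are all assumptions for illustration, not the real pool settings:

```python
import random

random.seed(7)
WARM_MS = 1200            # assumed warm 1920x1080 cache-miss capture time
COLD_PENALTY_MS = 1300    # assumed cold Chromium startup cost
RECYCLE_EVERY = 50        # assume ~1 in 50 requests lands on a freshly-recycled browser

samples = sorted(
    WARM_MS
    + random.gauss(300, 150)                                # per-request jitter
    + (COLD_PENALTY_MS if i % RECYCLE_EVERY == 0 else 0)    # occasional cold start
    for i in range(1000)
)
p50, p99 = samples[500], samples[990]
# The ~2% of cold requests barely move the median but dominate the far tail.
print(f"p50 ~ {p50:.0f} ms, p99 ~ {p99:.0f} ms")
```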
Latency per country
When you pass the `country` parameter (Growth+ plan), your request is routed through a residential proxy in the requested region. This adds a proxy handshake and a second TLS hop to the target site, so expect extra latency on top of the baseline cache-miss numbers above.
| `country` value | Typical added latency | p95 added latency |
|---|---|---|
| (none — default Frankfurt egress) | 0 | 0 |
| US, GB, DE, FR, NL | +100-300 ms | +500 ms |
| CA | +150-350 ms | +600 ms |
| JP, AU, BR, IN | +400-800 ms | +1200 ms |
So a 1920×1080 cache-miss capture with `country=JP` typically completes in 1.8 s + 0.4-0.8 s ≈ 2.2-2.6 s at the p50, and can reach 3.5 s + 1.2 s ≈ 4.7 s at the p95.
Residential proxy latency is the dominant cost of geo-routing. If you don't need the request to originate from a specific country, omit the country parameter and save the proxy hop entirely.
Cache hit rate
For dashboards and status pages that screenshot the same handful of URLs repeatedly, cache hit rate is the single biggest driver of observed latency.
- Target hit rate post-warm-up for repeated URLs: >50%.
- Default TTL: 3600 seconds (1 hour). Tune with `cache_ttl` up to `86400` (24 hours).
- What breaks cache hits: changing any parameter that is part of the cache key — `width`, `height`, `format`, `dark_mode`, `full_page`, `delay`, custom headers, cookies, etc. If you're debugging a low hit rate, verify your client isn't sending a jitter timestamp or random query parameter.
See the caching guide for the complete cache-key definition and the `X-Cache: HIT` / `X-Cache: MISS` response header semantics.
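The "stable cache key" advice can be sketched as follows. The actual key derivation lives in the caching guide; hashing a canonical JSON encoding of the parameters is an assumption for illustration:

```python
import hashlib
import json

def cache_key(params: dict) -> str:
    """Illustrative cache key: stable hash over the sorted request parameters."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

base = {"url": "https://example.com", "width": 1920, "height": 1080, "format": "png"}
assert cache_key(base) == cache_key(dict(base))   # identical params -> same key -> cache hit

jittered = {**base, "ts": 1699999999}             # a debug timestamp silently breaks every hit
assert cache_key(jittered) != cache_key(base)
print("stable:", cache_key(base))
```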
What slows us down
The numbers above assume the target site cooperates. In practice, most slow captures are slow because of factors we don't fully control:
- Cold Chromium start. The first request to hit a freshly-recycled browser in the pool pays ~1.2 s of startup cost. Browsers are recycled every `BROWSER_RECYCLE_AFTER` requests to avoid memory leaks. This shows up as a thicker tail on the p95.
- React / Vue / Svelte SPAs during hydration. `wait_until=load` fires before client-side routing and lazy components finish. Pages that rely on hydration can flash empty content without a `wait_for` selector.
- Target-site latency. We wait for `networkidle` by default on some endpoints — this is safe but slow when the target has analytics beacons that never quiet down. A slow target site is the single most common cause of 10+ second captures.
- Ad-blocking rules. Enabling `block_ads` or `block_trackers` adds a few milliseconds of request interception per outbound network call. Usually invisible; noticeable only on ad-heavy pages.
- Full-page captures on infinite-scroll sites. We walk the page height before capture. Pages with lazy-loaded content below the fold take proportionally longer. Use `full_page_max_height` to cap this.
How to make it faster
Practical knobs, ordered by impact:
- Turn on caching. `cache=true` is the default, but make sure your cache key is stable — don't send changing timestamps or session tokens in query params.
- Drop to a smaller viewport. `1280×800` captures are noticeably faster than `1920×1080`, which are in turn much faster than 4K.
- Use `wait_until=domcontentloaded` instead of `networkidle` on pages you know don't need full network quiet. This alone can cut 1-3 seconds off slow targets.
- Pre-warm URLs. If you render a monthly report that screenshots the same 20 dashboards, ping them once at the top of the hour; subsequent real requests land on a warm cache and warm browser.
- Skip `full_page` unless you actually need it. Viewport-only captures are consistently 3-5× faster than full-page ones.
- Don't set `country` if you don't need geo-routing. The proxy hop is pure overhead when your target doesn't serve geo-specific content.
- Use `async` endpoints for batch workloads (async screenshot, batch jobs) instead of synchronous requests with long timeouts. Frees your clients to do other work while we render.
Headers you can inspect
Every response exposes timing and cache info so you can measure from your own side:
```
X-Response-Time: 1247    # milliseconds, server-side
X-Cache: HIT | MISS
X-Request-Id: <uuid>     # share this with support if a request is slow
```

Log `X-Response-Time` and `X-Cache` in your own observability stack — compared against your client-side total, you'll see exactly how much of the latency is SnapSharp versus the network.
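A minimal sketch of that client-side comparison, assuming you record your own wall-clock total per request (the helper is illustrative, not part of any SnapSharp SDK):

```python
def latency_breakdown(headers: dict, client_total_ms: float) -> dict:
    """Split a measured request into server-side time (from X-Response-Time)
    and everything else: TLS handshake, network RTT, body download, client overhead."""
    server_ms = float(headers["X-Response-Time"])
    return {
        "cache": headers["X-Cache"],
        "server_ms": server_ms,
        "network_and_client_ms": client_total_ms - server_ms,
    }

# Example with the header values shown above and a hypothetical 1.4 s client-side total:
print(latency_breakdown({"X-Response-Time": "1247", "X-Cache": "MISS"}, 1400.0))
```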
Related
- Caching — cache keys, TTL, `X-Cache` semantics.
- Rate limits & plans — per-minute and monthly request caps.
- Error handling — 429s, 5xx retry strategy.
- SLA — contractual uptime commitment for Business and Enterprise.