
Performance & latency expectations

This page publishes the latency we target for the SnapSharp API across typical request shapes. It answers the question every engineer asks before integrating a third-party screenshot service: "How fast is it in practice?"

These numbers are baseline expectations, not a contractual guarantee. SnapSharp offers a formal uptime commitment only on Business and Enterprise plans — see the SLA for binding terms. Latency can spike when target sites are slow, when cold Chromium instances are recycled, or during incidents.

What you should expect

If you make a screenshot request against a reasonably fast target site from a client in Europe or North America, we aim for:

  • Cache-hit requests: sub-200 ms at the p95. Effectively instant.
  • Cache-miss requests on a 1920×1080 viewport: under 2 seconds at the p50, under 4 seconds at the p95.
  • Full-page captures on heavy single-page apps: can reach 10+ seconds at the p95. These are the slowest shape.

If your usage profile consistently exceeds these targets by a wide margin, open a support ticket — we want to know.
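If you track your own client-side percentiles, the targets above are easy to encode and compare against. A minimal sketch: the thresholds mirror the bullets above, and the 1.5× margin for "exceeds by a wide margin" is our own illustrative choice, not a published cutoff.

```python
# Published p95 targets in milliseconds, from the bullets above.
P95_TARGETS_MS = {
    "cache_hit": 200,          # sub-200 ms at the p95
    "cache_miss_1080p": 4000,  # under 4 s at the p95 on a 1920x1080 viewport
    "full_page_heavy": 10000,  # heavy SPAs can reach 10+ s; treat as a soft ceiling
}

def exceeds_target(shape: str, measured_p95_ms: float, margin: float = 1.5) -> bool:
    """True when your observed p95 overshoots the published target by a
    wide margin (default: 50% over) -- the point at which this page
    suggests opening a support ticket."""
    return measured_p95_ms > P95_TARGETS_MS[shape] * margin
```

For example, a measured 9 s p95 on plain 1080p cache-miss captures clears the 6 s (4 s × 1.5) bar and is worth a ticket, while a 250 ms p95 on cache hits does not.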

Measurement methodology

  • Where we measure from: synthetic k6 load tests running from a dedicated VPS in Frankfurt (DE). This is close-to-best-case latency — your own clients may see an additional 50-150ms of network RTT depending on geography.
  • What we measure: end-to-end HTTP latency from the moment the request leaves the client until the full response body is received. This includes TLS handshake, TTFB, and image body download.
  • Cache-hit vs cache-miss: we split the percentiles explicitly. Warming the cache before a benchmark gives unrealistically good numbers and we do not do this in our published figures.
  • Sample size: each percentile below is computed from at least 1000 requests sampled across a 1-hour window during typical weekday load.
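To reproduce the same statistics on your side, a nearest-rank percentile over your own latency samples is enough. This is a sketch; the page does not state which percentile estimator we use, but with 1000+ samples the choice barely matters.

```python
import math

def percentile(samples_ms, pct):
    """Nearest-rank percentile: sort, then index. Needs no interpolation,
    which keeps it trivially comparable across tools."""
    ordered = sorted(samples_ms)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]

# Ten illustrative end-to-end latencies in milliseconds.
latencies = [120, 95, 310, 150, 180, 90, 2000, 140, 160, 110]
p50 = percentile(latencies, 50)  # median of the sorted samples
p95 = percentile(latencies, 95)  # dominated by the slowest capture
```

Note how a single 2 s outlier moves the p95 but leaves the p50 untouched: this is why we publish both, and why cold-start and slow-target effects show up almost entirely in the p95 column.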

Latency per viewport & size

All measurements below are for format=png, wait_until=load, against a standard responsive target site. cache=true is the default.

Request shape                                  Cache                  p50      p95
320×240                                        hit                    ~30 ms   ~150 ms
1920×1080                                      hit                    ~50 ms   ~200 ms
1920×1080                                      miss (fresh capture)   ~1.8 s   ~3.5 s
3840×2160 (4K)                                 miss                   ~3.0 s   ~6.0 s
Full-page, viewport 1280×800, ~5000 px tall    miss                   ~5.0 s   ~12.0 s

A cold Chromium process inside our browser pool adds roughly 1-1.5 seconds to the first request that lands on it. After the pool is warm, consecutive cache-miss requests in the middle of the distribution run closer to 1.2 s for a 1920×1080 capture.
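A request matching the benchmark shape in the table can be sketched as follows. The parameter names (url, width, height, format, wait_until, cache) are the ones used on this page; the base URL is a hypothetical placeholder, so check the API reference for the real endpoint.

```python
from urllib.parse import urlencode

# Hypothetical endpoint -- substitute the real base URL from the API reference.
BASE = "https://api.snapsharp.example/v1/screenshot"

def screenshot_url(target: str, width: int = 1920, height: int = 1080) -> str:
    """Build a capture request in the benchmarked shape:
    format=png, wait_until=load, cache=true (the documented defaults)."""
    params = {
        "url": target,
        "width": width,
        "height": height,
        "format": "png",
        "wait_until": "load",
        "cache": "true",
    }
    return f"{BASE}?{urlencode(params)}"
```

Passing the defaults explicitly like this keeps the request shape (and therefore the cache key) stable even if the service-side defaults ever change.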

Latency per country

When you pass the country parameter (Growth+ plan), your request is routed through a residential proxy in the requested region. This adds a proxy handshake and a second TLS hop to the target site, so expect extra latency on top of the baseline cache-miss numbers above.

country value                        Typical added latency   p95 added latency
(none — default Frankfurt egress)    0                       0
US, GB, DE, FR, NL                   +100-300 ms             +500 ms
CA                                   +150-350 ms             +600 ms
JP, AU, BR, IN                       +400-800 ms             +1200 ms

So a 1920×1080 cache-miss capture with country=JP typically completes in 1.8 s + 0.4-0.8 s ≈ 2.2-2.6 s at the p50, and can reach 3.5 s + 1.2 s ≈ 4.7 s at the p95.

Residential proxy latency is the dominant cost of geo-routing. If you don't need the request to originate from a specific country, omit the country parameter and save the proxy hop entirely.
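The worked arithmetic above generalizes to a small helper. This is a sketch for capacity planning only; the table in abbreviated form, with the typical added-latency ranges folded onto a baseline p50.

```python
# Typical added proxy latency (low, high) in ms, per country group,
# taken from the table above. Only a subset of groups is shown here.
PROXY_P50_MS = {
    None: (0, 0),        # default Frankfurt egress, no proxy hop
    "US": (100, 300),
    "JP": (400, 800),
}

def estimated_p50_ms(baseline_p50_ms, country=None):
    """Expected p50 range once the residential-proxy hop is added."""
    lo, hi = PROXY_P50_MS.get(country, (0, 0))
    return (baseline_p50_ms + lo, baseline_p50_ms + hi)
```

With the 1.8 s baseline for a 1920×1080 cache miss, `estimated_p50_ms(1800, "JP")` reproduces the 2.2-2.6 s range quoted above.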

Cache hit rate

For dashboards and status pages that screenshot the same handful of URLs repeatedly, cache hit rate is the single biggest driver of observed latency.

  • Target hit rate post-warm-up for repeated URLs: >50%.
  • Default TTL: 3600 seconds (1 hour). Tune with cache_ttl up to 86400 (24 hours).
  • What breaks cache hits: changing any parameter that is part of the cache key — width, height, format, dark_mode, full_page, delay, custom headers, cookies, etc. If you're debugging a low hit rate, verify your client isn't sending a jitter timestamp or random query parameter.

See the caching guide for the complete cache-key definition and the X-Cache: HIT / X-Cache: MISS response header semantics.
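A common fix for a low hit rate is normalizing the target URL before sending it. The sketch below drops volatile query parameters and sorts the rest; the `VOLATILE` name list is illustrative (your client may use different jitter parameter names), not part of the SnapSharp cache-key definition.

```python
from urllib.parse import urlencode, parse_qsl, urlsplit, urlunsplit

# Illustrative set of volatile params that commonly poison cache keys.
VOLATILE = {"_t", "ts", "timestamp", "nonce", "cachebust"}

def stable_target_url(url: str) -> str:
    """Drop jitter params and sort the rest, so identical targets always
    produce an identical url= value -- and therefore an identical cache key."""
    scheme, netloc, path, query, frag = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(query) if k not in VOLATILE)
    return urlunsplit((scheme, netloc, path, urlencode(kept), frag))
```

Run every target through this before building the screenshot request; two requests for the same page a minute apart will then hit the same cache entry even if the upstream URL carried a timestamp.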

What slows us down

The numbers above assume the target site cooperates. In practice, most slow captures are slow because of factors we don't fully control:

  • Cold Chromium start. The first request to hit a freshly recycled browser in the pool pays ~1.2 s of startup cost. Browsers are recycled every BROWSER_RECYCLE_AFTER requests to avoid memory leaks. This shows up as a thicker tail on the p95.
  • React / Vue / Svelte SPAs during hydration. wait_until=load fires before client-side routing and lazy components finish. Pages that rely on hydration can flash empty content without a wait_for selector.
  • Target-site latency. We wait for networkidle by default on some endpoints — this is safe but slow when the target has analytics beacons that never quiet down. A slow target site is the single most common cause of 10+ second captures.
  • Ad-blocking rules. Enabling block_ads or block_trackers adds a few milliseconds of request interception per outbound network call. Usually invisible; noticeable only on ad-heavy pages.
  • Full-page captures on infinite-scroll sites. We walk the page height before capture. Pages with lazy-loaded content below the fold take proportionally longer. Use full_page_max_height to cap this.

How to make it faster

Practical knobs, ordered by impact:

  1. Turn on caching. cache=true is the default, but make sure your cache key is stable — don't send changing timestamps or session tokens in query params.
  2. Drop to a smaller viewport. 1280×800 captures are noticeably faster than 1920×1080, which are in turn much faster than 4K.
  3. Use wait_until=domcontentloaded instead of networkidle on pages you know don't need full network quiet. This alone can cut 1-3 seconds off slow targets.
  4. Pre-warm URLs. If you render a monthly report that screenshots the same 20 dashboards, ping them once at the top of the hour; subsequent real requests land on a warm cache and warm browser.
  5. Skip full_page unless you actually need it. Viewport-only captures are consistently 3-5× faster than full-page ones.
  6. Don't set country if you don't need geo-routing. The proxy hop is pure overhead when your target doesn't serve geo-specific content.
  7. Use async endpoints for batch workloads (async screenshot, batch jobs) instead of synchronous requests with long timeouts. Frees your clients to do other work while we render.
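Several of these knobs compose into a single "fast profile" parameter set. A sketch, using the parameter names from elsewhere on this page; the helper name and the specific values are our illustrative choices:

```python
def fast_profile(target: str) -> dict:
    """Request parameters tuned for latency: smaller viewport (knob 2),
    earlier wait condition (knob 3), no full-page walk (knob 5),
    no geo proxy (knob 6), caching left on (knob 1)."""
    return {
        "url": target,
        "width": 1280,
        "height": 800,
        "wait_until": "domcontentloaded",
        "full_page": "false",
        "cache": "true",
        # deliberately no "country": the proxy hop is pure overhead here
    }
```

Start from this profile and relax individual settings (e.g. back to wait_until=load) only when a page visibly needs it.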

Headers you can inspect

Every response exposes timing and cache info so you can measure from your own side:

X-Response-Time: 1247     # milliseconds, server-side
X-Cache: HIT | MISS
X-Request-Id: <uuid>      # share this with support if a request is slow

Log X-Response-Time and X-Cache in your own observability stack — compared against your client-side total, you'll see exactly how much of the latency is SnapSharp versus the network.
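Splitting your client-side total against the server-side header is one subtraction. A minimal sketch, assuming you already capture the wall-clock total and the response headers in your client:

```python
def network_overhead_ms(client_total_ms: float, headers: dict) -> float:
    """Client-observed total minus the server-side X-Response-Time gives
    the portion of latency spent on the network (DNS, TLS, transfer)
    between your client and SnapSharp."""
    server_ms = int(headers.get("X-Response-Time", 0))
    return client_total_ms - server_ms
```

If this gap is consistently large while X-Response-Time stays small, the latency lives in your network path, not in rendering, and a support ticket with X-Request-Id will confirm it.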

Related pages

  • Caching — cache keys, TTL, X-Cache semantics.
  • Rate limits & plans — per-minute and monthly request caps.
  • Error handling — 429s, 5xx retry strategy.
  • SLA — contractual uptime commitment for Business and Enterprise.