python · automation · flask · django · github-actions

Automating Screenshots in Python Web Apps and CI/CD Pipelines

SnapSharp Team·March 5, 2026·4 min read

Taking a screenshot is easy. Taking screenshots reliably, at scale, inside a running web app or CI pipeline — that's a different problem. This guide covers production patterns: Flask and Django integrations, scheduled jobs, async batch processing, and GitHub Actions workflows.

Prerequisites

Install the SDK and set your key:

pip install snapsharp
export SNAPSHARP_API_KEY=sk_live_your_key_here

Pattern 1: Flask route that serves screenshots on demand

A common pattern is a Flask endpoint that accepts a URL and returns the screenshot directly — useful for preview thumbnails in dashboards or email campaigns.

# app.py
from flask import Flask, request, send_file, jsonify
from snapsharp import SnapSharp, AuthError, RateLimitError
import io
import os

app = Flask(__name__)
snap = SnapSharp(os.environ["SNAPSHARP_API_KEY"])

@app.route("/preview")
def preview():
    url = request.args.get("url")
    if not url:
        return jsonify({"error": "url is required"}), 400

    try:
        image_bytes = snap.screenshot(
            url,
            width=1280,
            height=720,
            format="jpeg",
            quality=85,
            cache=True,
            cache_ttl=3600,  # 1 hour cache — same URL returns instantly
        )
    except RateLimitError as e:
        return jsonify({"error": "rate_limit", "retry_after": e.retry_after}), 429
    except AuthError:
        return jsonify({"error": "invalid_api_key"}), 500
    except Exception as e:
        return jsonify({"error": str(e)}), 500

    return send_file(
        io.BytesIO(image_bytes),
        mimetype="image/jpeg",
        download_name="preview.jpg",
    )

if __name__ == "__main__":
    app.run(debug=True)

Usage:

GET /preview?url=https://example.com  → returns JPEG image bytes

The cache=True parameter means repeated requests for the same URL hit Redis on the SnapSharp side — your Flask app doesn't pay for a browser render on every request.

Pattern 2: Django background task with Celery

For apps that can't block the HTTP response, offload the screenshot to a Celery worker.

# tasks.py
from celery import shared_task
from snapsharp import SnapSharp
from .models import SitePreview
import os

snap = SnapSharp(os.environ["SNAPSHARP_API_KEY"])

@shared_task(bind=True, max_retries=3, default_retry_delay=30)
def capture_site_preview(self, site_id: int, url: str) -> None:
    try:
        image_bytes = snap.screenshot(
            url,
            width=1280,
            full_page=False,
            format="webp",
            quality=80,
        )
        SitePreview.objects.filter(id=site_id).update(
            screenshot=image_bytes,
            status="ready",
        )
    except Exception as exc:
        raise self.retry(exc=exc)

# views.py
from django.http import JsonResponse
from .tasks import capture_site_preview
from .models import Site

def add_site(request):
    url = request.POST["url"]
    site = Site.objects.create(url=url, status="pending")
    capture_site_preview.delay(site.id, url)
    return JsonResponse({"id": site.id, "status": "pending"})

The task retries up to 3 times with a 30-second delay — handles transient network issues or browser pool saturation.
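Celery's default_retry_delay is a fixed pause. Outside Celery, the same idea can be sketched as a small exponential-backoff helper (a generic sketch, not part of the snapsharp SDK):

```python
import random
import time

def with_backoff(fn, attempts=3, base_delay=30.0, jitter=0.25):
    """Call fn(); on failure, retry with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the last error
            delay = base_delay * (2 ** attempt)  # 30s, 60s, 120s, ...
            time.sleep(delay * (1 + random.uniform(-jitter, jitter)))
```

Inside Celery itself, `self.retry(exc=exc, countdown=...)` accepts a computed delay, so the backoff can live in the task.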

Pattern 3: Scheduled monitoring with APScheduler

Run screenshot jobs on a schedule without a full task queue setup:

# monitor.py
from apscheduler.schedulers.blocking import BlockingScheduler
from snapsharp import SnapSharp
from datetime import datetime
from pathlib import Path
import os

snap = SnapSharp(os.environ["SNAPSHARP_API_KEY"])

SITES = [
    {"name": "homepage",  "url": "https://yourapp.com"},
    {"name": "pricing",   "url": "https://yourapp.com/pricing"},
    {"name": "dashboard", "url": "https://yourapp.com/dashboard"},
]

def snapshot_all() -> None:
    date_str = datetime.now().strftime("%Y-%m-%d_%H-%M")
    out = Path("snapshots") / date_str
    out.mkdir(parents=True, exist_ok=True)

    for site in SITES:
        try:
            image = snap.screenshot(site["url"], width=1440, cache=False)
            path = out / f"{site['name']}.png"
            path.write_bytes(image)
            print(f"✓ {site['name']} → {path}")
        except Exception as e:
            print(f"✗ {site['name']}: {e}")

scheduler = BlockingScheduler()
scheduler.add_job(snapshot_all, "cron", hour=9, minute=0)   # 09:00 daily
scheduler.add_job(snapshot_all, "cron", hour=18, minute=0)  # 18:00 daily
scheduler.start()

Run it as a long-lived process:

python monitor.py

Pattern 4: Async batch with aiohttp

When you have hundreds of URLs, synchronous requests run one at a time. Use aiohttp to capture concurrently while staying within your rate limit:

# batch_async.py
import asyncio
import aiohttp
import os
from pathlib import Path

API_KEY = os.environ["SNAPSHARP_API_KEY"]
BASE_URL = "https://api.snapsharp.dev/v1/screenshot"
CONCURRENCY = 5  # Stay within rate limit (30 req/min on Starter = 0.5/s)

async def fetch_screenshot(
    session: aiohttp.ClientSession,
    sem: asyncio.Semaphore,
    url: str,
    output: Path,
) -> None:
    async with sem:
        params = {"url": url, "width": 1280, "format": "jpeg", "quality": 80}
        headers = {"Authorization": f"Bearer {API_KEY}"}
        for attempt in range(2):  # retry once after a rate-limit pause
            async with session.get(BASE_URL, params=params, headers=headers) as resp:
                if resp.status == 200:
                    output.write_bytes(await resp.read())
                    print(f"✓ {url}")
                    return
                if resp.status == 429 and attempt == 0:
                    retry = int(resp.headers.get("Retry-After", 60))
                    print(f"Rate limited — sleeping {retry}s")
                    await asyncio.sleep(retry)
                    continue
                print(f"✗ {url} → HTTP {resp.status}")
                return

async def batch(urls: list[tuple[str, Path]]) -> None:
    sem = asyncio.Semaphore(CONCURRENCY)
    connector = aiohttp.TCPConnector(limit=CONCURRENCY)
    async with aiohttp.ClientSession(connector=connector) as session:
        tasks = [fetch_screenshot(session, sem, url, path) for url, path in urls]
        await asyncio.gather(*tasks)

if __name__ == "__main__":
    out = Path("output")
    out.mkdir(exist_ok=True)
    targets = [
        ("https://example.com", out / "example.jpg"),
        ("https://github.com", out / "github.jpg"),
        ("https://stripe.com", out / "stripe.jpg"),
        # ... hundreds more
    ]
    asyncio.run(batch(targets))

For very large batches (500+ URLs), use the Batch endpoint instead — it processes in parallel server-side and returns a ZIP.
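As a rough sketch of that flow — note the /v1/batch path, the request fields, and the ZIP response shape here are assumptions; check the Batch endpoint reference for the actual contract:

```python
import io
import json
import urllib.request
import zipfile
from pathlib import Path

BATCH_URL = "https://api.snapsharp.dev/v1/batch"  # hypothetical endpoint path

def build_batch_request(urls: list[str], api_key: str, width: int = 1280) -> urllib.request.Request:
    """Build a single POST request submitting all URLs at once."""
    body = json.dumps({"urls": urls, "width": width, "format": "jpeg"}).encode()
    return urllib.request.Request(
        BATCH_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def unpack_batch_zip(zip_bytes: bytes, out_dir: Path) -> list[str]:
    """Extract the returned ZIP archive and list the screenshot filenames."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        zf.extractall(out_dir)
        return zf.namelist()
```

The win over Pattern 4 is a single round trip and server-side parallelism; your process just submits, waits, and unzips.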

Pattern 5: GitHub Actions — screenshot on deploy

Capture your site after every production deploy to detect visual regressions:

# .github/workflows/screenshot-after-deploy.yml
name: Post-deploy screenshot

on:
  workflow_run:
    workflows: ["Deploy"]
    types: [completed]

jobs:
  screenshot:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Python deps
        run: pip install snapsharp requests

      - name: Capture screenshots
        env:
          SNAPSHARP_API_KEY: ${{ secrets.SNAPSHARP_API_KEY }}
        run: python scripts/capture_deploy_screenshots.py

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: deploy-screenshots-${{ github.run_number }}
          path: screenshots/
          retention-days: 30

# scripts/capture_deploy_screenshots.py
from snapsharp import SnapSharp
from pathlib import Path
import os

snap = SnapSharp(os.environ["SNAPSHARP_API_KEY"])
out = Path("screenshots")
out.mkdir(exist_ok=True)

PAGES = [
    ("home",     "https://yourapp.com"),
    ("pricing",  "https://yourapp.com/pricing"),
    ("docs",     "https://yourapp.com/docs"),
    ("sign-up",  "https://yourapp.com/sign-up"),
]

for name, url in PAGES:
    image = snap.screenshot(url, width=1440, height=900, cache=False)
    (out / f"{name}.png").write_bytes(image)
    print(f"Captured {name}")

Screenshots are uploaded as workflow artifacts — viewable in the GitHub UI for 30 days. Perfect for before/after comparisons.
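For the before/after comparison, a crude first pass is a byte-level diff between the previous run's artifacts and this one (a sketch; nondeterministic rendering such as timestamps, ads, or animations can flip byte comparisons, so a perceptual diff like Pillow's ImageChops.difference is more robust):

```python
import hashlib
from pathlib import Path

def changed_screenshots(before_dir: Path, after_dir: Path) -> list[tuple[str, str]]:
    """Flag screenshots that are new or differ byte-for-byte from the last run."""
    changes = []
    for after in sorted(after_dir.glob("*.png")):
        before = before_dir / after.name
        if not before.exists():
            changes.append((after.name, "new"))
        elif hashlib.sha256(before.read_bytes()).digest() != hashlib.sha256(after.read_bytes()).digest():
            changes.append((after.name, "modified"))
    return changes
```

In CI you would download the previous run's artifact into before_dir, then fail the job (or post a comment) when the list is non-empty.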

Pattern 6: Django management command

For one-off bulk jobs (initial import, backfill), a management command beats a script:

# management/commands/capture_previews.py
from django.core.management.base import BaseCommand
from snapsharp import SnapSharp
from yourapp.models import Page
import os

class Command(BaseCommand):
    help = "Capture screenshots for all published pages"

    def add_arguments(self, parser):
        parser.add_argument("--force", action="store_true", help="Re-capture even if preview exists")

    def handle(self, *args, **options):
        snap = SnapSharp(os.environ["SNAPSHARP_API_KEY"])
        qs = Page.objects.filter(published=True)
        if not options["force"]:
            qs = qs.filter(screenshot="")

        self.stdout.write(f"Capturing {qs.count()} pages...")

        for page in qs:
            try:
                img = snap.screenshot(page.url, width=1280, format="webp", quality=80)
                page.screenshot = img
                page.save(update_fields=["screenshot"])
                self.stdout.write(f"  ✓ {page.url}")
            except Exception as e:
                self.stderr.write(f"  ✗ {page.url}: {e}")

python manage.py capture_previews
python manage.py capture_previews --force  # Re-capture all

Choosing the right pattern

Scenario → Pattern
Preview endpoint in a web app → Flask/Django route with cache=True
Non-blocking screenshot on user action → Celery task
Daily monitoring → APScheduler cron
Bulk import (500+ URLs) → Async aiohttp or Batch endpoint
Visual regression after deploy → GitHub Actions
One-off backfill → Django management command

Frequently Asked Questions

Should I use Celery or APScheduler for scheduled screenshots in Django?

Use Celery if you already have a broker (Redis/RabbitMQ) and need retry logic, priorities, or distributed workers. Use APScheduler for simpler setups where a single long-running process is acceptable and you don't need distributed coordination. For ad-hoc one-offs, a Django management command beats both.

How do I handle rate limits in async Python screenshot batches?

Cap concurrency with asyncio.Semaphore below your plan's per-minute rate limit (e.g. 5 concurrent for Starter's 30/min). On HTTP 429, honor the Retry-After header with await asyncio.sleep(retry). Catch RateLimitError at the SDK level — it exposes retry_after as a number.
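A minimal sync wrapper around that retry_after logic might look like this (a sketch using a local stand-in exception; with the real SDK you would catch snapsharp.RateLimitError instead):

```python
import time

class RateLimited(Exception):
    """Local stand-in for snapsharp.RateLimitError."""
    def __init__(self, retry_after: float):
        self.retry_after = retry_after

def call_with_rate_limit_retry(fn, max_attempts: int = 3):
    """Run fn(), sleeping retry_after seconds whenever it is rate limited."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited as e:
            if attempt == max_attempts - 1:
                raise  # still limited after all attempts
            time.sleep(e.retry_after)
```

Usage: `call_with_rate_limit_retry(lambda: snap.screenshot(url))`.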

Can I stream the screenshot response to avoid buffering in memory?

Yes — use aiohttp's streaming API (response.content.iter_chunked()) to write to disk as bytes arrive. In Flask, wrap a generator in Response(generate(), mimetype="image/png") to stream to the client rather than buffering (send_file needs a path or file-like object, not a generator). This matters for full_page=true on long pages, where payloads can reach several MB.
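The aiohttp side can be sketched as a chunked writer (assumes a ClientResponse-like object exposing content.iter_chunked()):

```python
import asyncio
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # write in 64 KB chunks instead of buffering the payload

async def stream_to_file(resp, path: Path) -> int:
    """Write the response body to disk as it arrives; return bytes written."""
    written = 0
    with path.open("wb") as f:
        async for chunk in resp.content.iter_chunked(CHUNK_SIZE):
            f.write(chunk)
            written += len(chunk)
    return written
```

Inside Pattern 4, you would call `await stream_to_file(resp, output)` in place of `output.write_bytes(await resp.read())`.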

What's the difference between cache=True and cache_ttl?

cache=True enables SnapSharp's server-side Redis cache (on by default). cache_ttl (in seconds, max 3600) controls how long the cached response survives. Same URL + params = instant cache hit, skipping the browser entirely. For preview endpoints, set cache_ttl=3600; for monitoring runs where you need fresh captures, set cache=False.

Does the Python SDK support async calls natively?

The official snapsharp package is sync. For async, either (a) wrap calls in asyncio.to_thread(snap.screenshot, url) or (b) call the HTTP API directly with aiohttp as shown in Pattern 4 — this avoids thread pool overhead and gives you fine-grained concurrency control.
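Option (a) in practice — slow_capture here is a stand-in for the blocking snap.screenshot call:

```python
import asyncio
import time

def slow_capture(url: str) -> bytes:
    """Stand-in for the blocking snap.screenshot(url) call."""
    time.sleep(0.01)  # simulate network + render time
    return b"bytes-for-" + url.encode()

async def capture_all(urls: list[str]) -> list[bytes]:
    # Each sync call runs in the default thread pool, so captures overlap
    # instead of serializing. Requires Python 3.9+ for asyncio.to_thread.
    return await asyncio.gather(*(asyncio.to_thread(slow_capture, u) for u in urls))
```

Run with `asyncio.run(capture_all(urls))`. The thread pool caps concurrency (default ≈ CPU count + 4), so very large batches are still better served by option (b).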

How do I store screenshots directly in S3 from Python?

Two options: (1) pass upload_to_s3: true with your S3 credentials configured in the SnapSharp dashboard — the API uploads for you and returns an X-S3-URL header, or (2) download the bytes and use boto3 to upload yourself. Option 1 avoids the double hop through your server. See S3 storage setup.


Related: Screenshot API reference · Python SDK · Batch screenshots · Website Screenshots in Python (3 methods)
