
Django + Celery Screenshot Pipeline Tutorial — Async Captures at Scale

SnapSharp Team·April 26, 2026·4 min read

Django apps that need screenshots usually fall into one of three patterns: a model field that should auto-populate when saved (post thumbnails), a periodic job that captures a list of URLs (monitoring), or a user-triggered async task ("generate preview" button). Celery handles all three. SnapSharp removes the Chromium-in-a-container half of the puzzle.

This tutorial wires them together. By the end you'll have a Django model with a screenshot_url field that auto-fills, a Celery task with retries and exponential backoff, and a periodic job that refreshes captures every 24 hours.

Prerequisites

  • Python 3.11+ and pip.
  • A Django 5 project with Celery configured (if you're starting fresh, pip install django celery redis).
  • A free SnapSharp API key from snapsharp.dev/sign-up.
  • Redis or RabbitMQ for the Celery broker.

Step 1: install the SDK

The Python SDK is on PyPI as snapsharp. It uses requests under the hood.

pip install snapsharp

Add the API key to your env (.env, direnv, systemd, whatever your team uses):

SNAPSHARP_API_KEY=sk_live_YOUR_API_KEY

Configure Django to read it. In settings.py:

import os

SNAPSHARP_API_KEY = os.environ['SNAPSHARP_API_KEY']
SNAPSHARP_TIMEOUT = int(os.environ.get('SNAPSHARP_TIMEOUT', '60'))

If the env var is missing, Django will fail at import time — better than a runtime KeyError two hours into a deploy.

Step 2: a service module for SnapSharp

Centralize all SnapSharp calls in one place. Create myapp/services/snapsharp.py:

from typing import Optional
from django.conf import settings
from snapsharp import SnapSharp, SnapSharpError

_client: Optional[SnapSharp] = None


def get_client() -> SnapSharp:
    global _client
    if _client is None:
        _client = SnapSharp(
            api_key=settings.SNAPSHARP_API_KEY,
            timeout=settings.SNAPSHARP_TIMEOUT,
        )
    return _client


def capture_screenshot(url: str, **opts) -> bytes:
    client = get_client()
    return client.screenshot(url, **opts)


def capture_og(template: str, variables: dict, **opts) -> bytes:
    client = get_client()
    return client.og_image(template=template, variables=variables, **opts)

Lazy initialization avoids reaching out to SnapSharp during manage.py imports (e.g., when running migrations).

Step 3: a Celery task with retries

Create myapp/tasks.py:

import logging
from celery import shared_task
from django.core.files.base import ContentFile
from django.utils import timezone
from snapsharp import SnapSharpError

from .services.snapsharp import capture_screenshot
from .models import Site

logger = logging.getLogger(__name__)


@shared_task(
    bind=True,
    autoretry_for=(SnapSharpError,),
    retry_backoff=True,
    retry_backoff_max=300,
    retry_jitter=True,
    max_retries=5,
)
def capture_site_screenshot(self, site_id: int) -> str:
    site = Site.objects.get(pk=site_id)
    logger.info('Capturing screenshot for site %s (%s)', site_id, site.url)

    try:
        image_bytes = capture_screenshot(
            site.url,
            width=1280,
            height=720,
            format='png',
            block_ads=True,
            full_page=False,
        )
    except SnapSharpError as exc:
        # Don't retry on permanent failures
        if exc.status in (400, 401, 403, 404):
            logger.warning('Permanent failure for %s: %s', site.url, exc)
            site.screenshot_status = 'failed'
            site.save(update_fields=['screenshot_status'])
            return 'permanent_failure'
        raise

    filename = f'screenshots/site-{site_id}.png'
    site.screenshot.save(filename, ContentFile(image_bytes), save=False)
    site.screenshot_status = 'ready'
    site.last_captured_at = timezone.now()
    site.save(update_fields=['screenshot', 'screenshot_status', 'last_captured_at'])

    logger.info('Saved screenshot for site %s', site_id)
    return filename

Two important details:

  1. autoretry_for=(SnapSharpError,) retries on transient SnapSharp errors (5xx, 429). Combined with retry_backoff=True, Celery uses exponential backoff with jitter — playing nicely with rate limits.
  2. We manually check exc.status for permanent failures (400/401/403/404) and stop retrying. There's no point retrying a 400 invalid URL.
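For intuition about what that retry policy does, here's a small standalone sketch approximating the delay formula Celery applies with these settings (backoff_delay is our own illustrative name, not a Celery API; Celery's internal helper behaves similarly, doubling the delay per retry and capping it at retry_backoff_max):

```python
import random


def backoff_delay(retries: int, factor: int = 1, maximum: int = 300,
                  jitter: bool = False) -> int:
    """Approximate Celery's exponential backoff: factor * 2**retries seconds,
    capped at `maximum`; with jitter, a random delay up to that value."""
    countdown = min(maximum, factor * (2 ** retries))
    if jitter:
        countdown = random.randrange(countdown + 1)
    return max(0, countdown)


# Deterministic schedule for the 5 retries configured above (jitter off):
schedule = [backoff_delay(r) for r in range(5)]
print(schedule)  # [1, 2, 4, 8, 16] seconds between attempts
```

With retry_jitter=True each delay becomes a random value between 0 and the computed cap, which spreads retries out instead of having many failed tasks hammer the API in lockstep.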

Step 4: the Site model

myapp/models.py:

from django.db import models


class Site(models.Model):
    SCREENSHOT_STATUS_CHOICES = [
        ('pending', 'Pending'),
        ('ready', 'Ready'),
        ('failed', 'Failed'),
    ]

    url = models.URLField(unique=True)
    title = models.CharField(max_length=200)
    screenshot = models.ImageField(upload_to='screenshots/', blank=True, null=True)
    screenshot_status = models.CharField(
        max_length=10,
        choices=SCREENSHOT_STATUS_CHOICES,
        default='pending',
    )
    last_captured_at = models.DateTimeField(blank=True, null=True)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    def __str__(self):
        return f'{self.title} ({self.url})'

Run migrations:

python manage.py makemigrations
python manage.py migrate

Step 5: trigger captures via signals

Django signals let us hook into save events without scattering task.delay() calls across the codebase. Create myapp/signals.py:

from django.db.models.signals import post_save
from django.dispatch import receiver

from .models import Site
from .tasks import capture_site_screenshot


@receiver(post_save, sender=Site)
def schedule_screenshot(sender, instance, created, **kwargs):
    """Queue a screenshot capture whenever a Site is created or its URL changes."""
    if created:
        capture_site_screenshot.delay(instance.id)
        return

    # Optional: re-capture if the URL changed. This assumes a FieldTracker
    # from django-model-utils declared as `tracker` on the Site model.
    if instance.tracker.has_changed('url'):
        capture_site_screenshot.delay(instance.id)

Wire it up in myapp/apps.py:

from django.apps import AppConfig


class MyappConfig(AppConfig):
    default_auto_field = 'django.db.models.BigAutoField'
    name = 'myapp'

    def ready(self):
        from . import signals  # noqa: F401

Now any code that creates a Site (admin, API, manage.py shell, fixture loading) automatically schedules a screenshot. The model layer doesn't need to know about Celery.

Step 6: a periodic task to refresh

Stale screenshots are a real problem. Sites change weekly. Use Celery Beat to refresh every 24 hours:

# myapp/tasks.py (add this)
from datetime import timedelta

from django.utils import timezone


@shared_task
def refresh_stale_screenshots():
    """Re-capture screenshots older than 24 hours."""
    cutoff = timezone.now() - timedelta(days=1)
    stale = Site.objects.filter(
        screenshot_status='ready',
        last_captured_at__lt=cutoff,
    )
    count = 0
    for site in stale.iterator():
        capture_site_screenshot.delay(site.id)
        count += 1
    return f'queued {count} captures'

Register the schedule in celery.py:

from celery import Celery
from celery.schedules import crontab

app = Celery('myproject')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

app.conf.beat_schedule = {
    'refresh-screenshots-daily': {
        'task': 'myapp.tasks.refresh_stale_screenshots',
        'schedule': crontab(hour=3, minute=0),  # 03:00 UTC
    },
}

Run Celery Beat alongside your worker:

celery -A myproject worker --loglevel=info
celery -A myproject beat --loglevel=info

In production, supervisor / systemd / Docker handles process management.

Step 7: serving screenshots in views

A simple list view that displays site screenshots:

# myapp/views.py
from django.views.generic import ListView
from .models import Site


class SiteListView(ListView):
    model = Site
    template_name = 'myapp/site_list.html'
    context_object_name = 'sites'
    paginate_by = 20

    def get_queryset(self):
        return Site.objects.filter(screenshot_status='ready').order_by('-last_captured_at')

The template:

{% extends "base.html" %}
{% block content %}
<div class="grid">
  {% for site in sites %}
    <article>
      {% if site.screenshot %}
        <img src="{{ site.screenshot.url }}" alt="{{ site.title }}" loading="lazy" />
      {% endif %}
      <h3>{{ site.title }}</h3>
      <a href="{{ site.url }}" target="_blank" rel="noreferrer">{{ site.url }}</a>
    </article>
  {% endfor %}
</div>
{% endblock %}

If you're using S3 or Cloudflare R2 for storage (recommended for production), site.screenshot.url returns a public CDN URL automatically via django-storages.

Step 8: production patterns

Django admin integration

Show captures in the admin list view:

# myapp/admin.py
from django.contrib import admin
from django.utils.html import format_html
from .models import Site
from .tasks import capture_site_screenshot


@admin.register(Site)
class SiteAdmin(admin.ModelAdmin):
    list_display = ('title', 'url', 'screenshot_preview', 'screenshot_status', 'last_captured_at')
    list_filter = ('screenshot_status',)
    actions = ['recapture_screenshots']

    def screenshot_preview(self, obj):
        if obj.screenshot:
            return format_html('<img src="{}" style="max-height:60px" />', obj.screenshot.url)
        return '—'

    @admin.action(description='Re-capture selected screenshots')
    def recapture_screenshots(self, request, queryset):
        for site in queryset:
            capture_site_screenshot.delay(site.id)
        self.message_user(request, f'Queued {queryset.count()} captures')

Now staff can re-trigger captures from the admin without touching code.

Storing in S3

Add django-storages[boto3] and configure in settings.py:

INSTALLED_APPS += ['storages']

# Django 4.2+: configure via STORAGES (DEFAULT_FILE_STORAGE was removed in 5.1)
STORAGES = {
    'default': {'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage'},
    'staticfiles': {'BACKEND': 'django.contrib.staticfiles.storage.StaticFilesStorage'},
}
AWS_STORAGE_BUCKET_NAME = os.environ['AWS_STORAGE_BUCKET_NAME']
AWS_S3_REGION_NAME = 'eu-west-1'
AWS_QUERYSTRING_AUTH = False  # public URLs

Now screenshot.save() writes to S3, and screenshot.url returns a CDN-friendly URL.

Per-task timeout

Long pages with full_page=True can take 10+ seconds. Set a generous SDK timeout but a strict Celery soft limit so workers don't hang:

@shared_task(
    bind=True,
    soft_time_limit=90,
    time_limit=120,
    autoretry_for=(SnapSharpError,),
    retry_backoff=True,
    max_retries=5,
)
def capture_site_screenshot(self, site_id: int):
    ...

If the task exceeds 90s, Celery raises SoftTimeLimitExceeded inside the task so you can clean up. At 120s, the hard limit kills the worker process running the task outright.

Common pitfalls

Pitfall 1: blocking on screenshots in views. Don't call SnapSharp synchronously from a Django view. A 3-second screenshot blocks a worker thread that could be serving 100 other requests. Always queue via Celery.

Pitfall 2: rate limits on bulk imports. Importing 5,000 sites at once triggers 5,000 Celery tasks that all hit SnapSharp. The free tier (5 req/min) chokes. Throttle with apply_async(countdown=...) or use Celery's rate limiting: @shared_task(rate_limit='30/m').
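The countdown math for staggering a bulk import is simple to sketch; staggered_countdowns below is an illustrative helper (not part of Celery or SnapSharp), and the 30/m rate is just an example:

```python
def staggered_countdowns(n_tasks: int, per_minute: int = 30) -> list[int]:
    """Delay (in seconds) for each task so the queue drains at `per_minute`."""
    spacing = 60 / per_minute  # seconds between task starts
    return [int(i * spacing) for i in range(n_tasks)]


countdowns = staggered_countdowns(5, per_minute=30)
print(countdowns)  # [0, 2, 4, 6, 8]
# usage with Celery (illustrative):
#   for i, site in enumerate(sites):
#       capture_site_screenshot.apply_async((site.id,), countdown=countdowns[i])
```

The worker-side alternative, rate_limit='30/m' on the task decorator, is simpler but applies per worker process, so total throughput scales with your worker count.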

Pitfall 3: signal loops. post_save fires on every save, including the one inside the Celery task; update_fields does not suppress the signal, it is only passed along in the handler's kwargs. The handler above is safe because it only queues a capture on creation or a URL change, but an unconditional handler would schedule a new capture on every save, forever. Guard on created, a changed field, or inspect kwargs.get('update_fields').

Pitfall 4: storing binary in DB. Don't use a BinaryField for screenshots. Use ImageField with file storage (local, S3, R2). Postgres rows full of megabytes wreck performance.

Pitfall 5: signal during fixtures/migrations. When loading fixtures or running migrations, the post_save signal fires and tries to queue Celery tasks against a broker that may not be running. Guard with raw=True check: if kwargs.get('raw'): return.

Final code

Five files:

  • myapp/services/snapsharp.py — service wrapper.
  • myapp/tasks.py — Celery tasks with retry logic.
  • myapp/models.py — Site model with screenshot field.
  • myapp/signals.py — auto-trigger on save.
  • myapp/admin.py — admin integration.

About 150 lines total — most of it boilerplate Celery error handling, not SnapSharp logic.

Conclusion

Django + Celery is the gold-standard pattern for async background work in Python, and SnapSharp slots in cleanly. The SDK call is one line; everything else is the usual Celery hygiene — retries, timeouts, idempotency. Pair it with django-storages for S3 and you have a production thumbnail pipeline that scales.

Next steps: explore the screenshot API reference, read the FastAPI background task tutorial, or look at automating screenshots with Python.

