
Runbook: Source Rate-Limiting (HTTP 429)

Impact: Data Latency & Potential Blocking

This alert fires when a specific data source is rate-limiting our requests. If not addressed, this can lead to significant data delays from that source and could escalate to a temporary or permanent IP block.

Triage Checklist (5 Minutes)

Your goal in the first five minutes is to establish the scope of the rate-limiting and assess the immediate impact.

  1. Identify the Failing Source(s): Check the service logs to see which domain(s) are returning 429 status codes (a rough per-domain count is sketched after this checklist).

    docker compose logs --tail=200 scraper | grep "429"
    

  2. Check Backoff Metrics: Examine the Prometheus metrics to see if the built-in backoff mechanism is handling the issue. A retry rate that stays elevated or keeps climbing indicates the default retry logic is not keeping up.

    # Hypothetical PromQL query
    rate(scraper_http_retries_total{status_code="429"}[5m])
    

  3. Inspect Retry-After Header: Check the logs for a Retry-After header from the source. This is a clear directive that we must respect.

    # Look for log entries like: "Received 429 from example.com, respecting Retry-After header of 60 seconds"
    docker compose logs scraper | grep "Retry-After"
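
If the grep in step 1 turns up several domains, a rough per-domain count helps scope the problem. The pipeline below is a sketch: it assumes the source domain appears somewhere on each 429 log line, so adjust the pattern to the actual log format.

    # Rough count of 429 log lines per domain (sketch; tune the pattern to the real log format)
    docker compose logs --tail=2000 scraper | grep "429" \
      | grep -oE '[A-Za-z0-9.-]+\.[A-Za-z]{2,}' \
      | sort | uniq -c | sort -rn | head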
    


Remediation Steps

Follow these steps to mitigate the issue. Disabling the offending profile (Step 1) is the quickest way to stop the errors; if the source is too important to pause, reduce its scraping frequency instead (Step 2).

Step 1: Temporarily Disable the Profile

If a single profile is causing a high volume of 429 errors, the safest immediate action is to disable it. This stops all requests to the source and allows the situation to cool down.

  1. Edit the Profile: Open the relevant JSON file in scraper/profiles/ (e.g., scraper/profiles/problem-source.json).

  2. Set enabled to false (a jq one-liner for this edit is sketched after these steps):

    scraper/profiles/problem-source.json
    {
      "name": "problem-source",
      "enabled": false, // <-- Change this to false
      ...
    }
    

  3. Reload Profiles: Apply the change without restarting the service by calling the reload endpoint.

    curl -X POST http://localhost:9001/profiles/reload
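
If you prefer to make the edit from the command line, a jq one-liner can flip the flag before reloading. This is a sketch only: it assumes jq is installed, that the profile file is plain JSON (no comments), and that enabled is a top-level key, as in the example above.

    # Flip "enabled" to false in place (writes to a temp file first, then replaces the original)
    jq '.enabled = false' scraper/profiles/problem-source.json > /tmp/problem-source.json \
      && mv /tmp/problem-source.json scraper/profiles/problem-source.json
    # Then reload the profiles as in step 3 above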
    

Step 2: Adjust Scraping Frequency

If the source is critical, a less drastic measure is to reduce the scraping frequency (a quick way to review every profile's schedule is sketched after these steps).

  1. Edit the Profile's schedule: In the profile's JSON file, change the cron string to be less frequent. For example, change from every 5 minutes to every 30 minutes.

    scraper/profiles/problem-source.json
    {
      ...
      "schedule": "*/30 * * * *", // Changed from "*/5 * * * *"
      ...
    }
    

  2. Reload Profiles: Apply the change by calling the reload endpoint.

    curl -X POST http://localhost:9001/profiles/reload
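
To see which profiles are polling most aggressively before settling on a new interval, you can list every profile's schedule in one pass. This sketch assumes jq is installed and that each profile JSON has top-level name and schedule fields, as in the example above.

    # Print "name<TAB>schedule" for every profile to spot overly aggressive cron entries
    jq -r '"\(.name)\t\(.schedule // "n/a")"' scraper/profiles/*.json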
    

Step 3: Long-Term Fix (Post-Incident)

To prevent recurrence, a permanent change to the provider or profile may be needed.

  • Prefer RSS/API: If the profile uses a generic HTML scraper, investigate if the source provides an RSS feed or a public API. These are almost always more efficient and less likely to be rate-limited.
  • Implement Provider-Specific Backoff: For high-value sources, consider creating a dedicated provider class that implements more sophisticated, source-specific rate-limiting logic. A sketch of the core retry logic follows below.
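
As an illustration of the logic a provider-specific backoff would implement, the loop below honours a Retry-After header when the source sends one and otherwise falls back to exponential backoff. It is a sketch only, written against a hypothetical URL and assuming Retry-After is given in seconds; the real implementation would live in the provider class.

    # Sketch: retry a fetch, honouring Retry-After and falling back to exponential backoff
    url="https://example.com/feed"   # hypothetical source URL
    delay=5
    for attempt in 1 2 3 4 5; do
      status=$(curl -s -o /dev/null -D /tmp/headers.txt -w '%{http_code}' "$url")
      [ "$status" != "429" ] && break
      # Prefer the server's Retry-After value (assumed to be seconds); otherwise use our own delay
      retry_after=$(grep -i '^retry-after:' /tmp/headers.txt | tr -d '\r' | awk '{print $2}')
      sleep "${retry_after:-$delay}"
      delay=$((delay * 2))           # exponential backoff for the next attempt
    done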