Runbook: Source Rate-Limiting (HTTP 429)¶
Impact: Data Latency & Potential Blocking
This alert fires when a specific data source is rate-limiting our requests. If not addressed, this can lead to significant data delays from that source and could escalate to a temporary or permanent IP block.
Triage Checklist (5 Minutes)¶
Your immediate goal is to identify the scope of the rate-limiting and assess the immediate impact.
-
Identify the Failing Source(s): Check the service logs to identify which domain(s) are returning
429status codes. -
Check Backoff Metrics: Examine the Prometheus metrics to see if the built-in backoff mechanism is handling the issue. A continuously rising count indicates the default retry logic is insufficient.
-
Inspect
Retry-AfterHeader: Check the logs for aRetry-Afterheader from the source. This is a clear directive that we must respect.
Remediation Steps¶
Follow these steps to mitigate the issue. Start with the least impactful change.
Step 1: Temporarily Disable the Profile¶
If a single profile is causing a high volume of 429 errors, the safest immediate action is to disable it. This stops all requests to the source and allows the situation to cool down.
-
Edit the Profile: Open the relevant JSON file in
scraper/profiles/(e.g.,scraper/profiles/problem-source.json). -
Set
enabledtofalse: -
Reload Profiles: Apply the change without restarting the service by calling the reload endpoint.
Step 2: Adjust Scraping Frequency¶
If the source is critical, a less drastic measure is to reduce the scraping frequency.
-
Edit the Profile's
schedule: In the profile's JSON file, change the cron string to be less frequent. For example, change from every 5 minutes to every 30 minutes. -
Reload Profiles: Apply the change by calling the reload endpoint.
Step 3: Long-Term Fix (Post-Incident)¶
To prevent recurrence, a permanent change to the provider or profile may be needed.
- Prefer RSS/API: If the profile uses a generic HTML scraper, investigate if the source provides an RSS feed or a public API. These are almost always more efficient and less likely to be rate-limited.
- Implement Provider-Specific Backoff: For high-value sources, consider creating a dedicated provider class that implements more sophisticated, source-specific rate-limiting logic.