Render RCA - 2023-12-11
Starting at 2023-12-11 20:33 UTC, a Render customer in the Oregon region was the target of a Distributed Denial of Service (DDoS) botnet attack. Render uses Cloudflare for DDoS protection, and Cloudflare immediately began blocking some, but not all, of the malicious traffic. The remaining traffic overloaded some components of our infrastructure. The attack ended 15 minutes later, and within 8 minutes of that, all of our systems had self-recovered and were serving traffic as normal.
This outage impacted infrastructure that serves roughly a quarter of our customers in the Oregon region. The web services for these customers were unable to receive traffic; requests to them failed with 5xx HTTP errors. The total degraded service window was 23 minutes (2023-12-11 20:33 UTC to 2023-12-11 20:56 UTC), and the acute period, when no traffic was being served, lasted 11 minutes (2023-12-11 20:39 UTC to 2023-12-11 20:50 UTC).
Our Cloudflare configuration did not fully mitigate a DDoS attack against one of our customers. Additionally, parts of our infrastructure were not able to handle the dramatically increased load.
In the short term, we used Cloudflare's filtering tool to guard against a resumption of the same attack.
We are also working to improve how our systems respond to this class of problem more generally. Namely, we will be: