The internal service that handles Rules, Hooks, Custom Database Scripts, and Extensions in our US region experienced a spike in errors between 18:35 UTC and 18:50 UTC. Authentication requests may have been affected if you were using Rules, Hooks, or Custom Database Scripts, and Extensions may have been unavailable or experienced increased latency during this period. The incident was caused by all traffic shifting to a previous cluster version that had lower capacity.
We apologize for any issues you might have encountered, and thank you for your continued trust in Auth0.
Our internal service for handling extensibility features, including Rules, Hooks, Custom Database Scripts, and Extensions, has a routing layer that handles traffic shifting and blue-green deployments. During a deployment, it is common for two clusters of this service to sit behind the routing layer at the same time, potentially with different capacities.
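As a rough illustration of the mechanism described above, the sketch below shows a minimal weighted router splitting traffic between two clusters during a blue-green deployment. The cluster names, weights, and `TrafficRouter` class are hypothetical and are not Auth0's actual implementation or configuration.

```python
import random

class TrafficRouter:
    """Minimal sketch of a routing layer in front of two deployment clusters."""

    def __init__(self, weights):
        # weights: mapping of cluster name -> share of traffic (sums to 1.0)
        self.weights = weights

    def pick_cluster(self):
        # Choose a cluster at random, proportional to its configured weight.
        clusters = list(self.weights)
        shares = [self.weights[c] for c in clusters]
        return random.choices(clusters, weights=shares, k=1)[0]

    def shift_all_traffic(self, cluster):
        # Remediation step: route 100% of traffic to a single cluster.
        self.weights = {cluster: 1.0}


# During a deployment, both clusters may receive a share of traffic:
router = TrafficRouter({"previous-cluster": 0.1, "latest-cluster": 0.9})

# Remediation as in this incident: shift everything to the fully scaled cluster.
router.shift_all_traffic("latest-cluster")
assert router.pick_cluster() == "latest-cluster"
```

The failure mode in this incident corresponds to the weights collapsing onto the lower-capacity cluster; the fix corresponds to the `shift_all_traffic` call pointing back at the cluster with pre-scaled capacity.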
At 18:32 UTC, when we deployed a new release of the routing layer, traffic shifted from the latest version of the extensibility service cluster, which had more capacity, to the previous one with less capacity.
Operators were alerted to the problem, and once it was identified, traffic was immediately shifted back to the latest extensibility service cluster, which had pre-scaled capacity. Once traffic was 100% back on the latest cluster, the incident was fully resolved.
18:32 UTC - Rollout of internal extensibility service routing layer performed.
18:35 UTC - Automated alerts started firing.
18:44 UTC - Operator shifted traffic back to the latest extensibility service cluster.
18:52 UTC - Service was back to nominal state.