Multiple services affected: service degradation
- Started: 2026-03-05 16:35 UTC
- Resolved: 2026-03-05 19:30 UTC
- Duration: 175 minutes
- Date: 2026-03-05
Incident Timeline
We are investigating reports of degraded performance for Actions.
Actions is experiencing degraded availability. We are continuing to investigate.
Webhooks is experiencing degraded availability. We are continuing to investigate.
We are observing delays in queuing Actions workflow runs. We're still investigating the causes of these delays.
We have applied mitigations for connection failures across backend resources and are observing recovery in queuing of Actions workflow runs.
We are back to queuing Actions workflow runs at nominal rates and are monitoring as the runs queued during the incident clear.
The queue of requested Actions jobs continues to make progress. Job delays are now approximately 6 minutes and continuing to decrease.
Actions is now fully recovered.
Actions is operating normally.
Webhooks is operating normally.
On Mar 5, 2026, between 16:24 UTC and 19:30 UTC, Actions was degraded. During this time, 95% of workflow runs failed to start within 5 minutes, with an average delay of 30 minutes, and 10% of workflow runs failed with an infrastructure error. This was due to Redis infrastructure updates that were being rolled out to production to improve our resiliency. These changes introduced an incorrect configuration change into our Redis load balancer, causing internal traffic to be routed to an incorrect host and leading to two incidents.

We mitigated this incident by correcting the misconfigured load balancer. Actions jobs were running successfully starting at 17:24 UTC; the remaining time before we closed the incident was spent working through the backlog of queued jobs.

We immediately rolled back the updates that were a contributing factor and have frozen all changes in this area until the follow-up work is complete. We are improving our automation to ensure incorrect configuration changes cannot propagate through our infrastructure, and we are adding alerting to catch misconfigured load balancers before they cause an incident. Additionally, we are updating the Redis client configuration in Actions to improve resiliency to brief cache interruptions.
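
To illustrate the kind of pre-deployment guard described above, the sketch below rejects a load-balancer configuration whose backends point outside an allowlisted Redis pool, so a change like the one in this incident would fail validation before it could propagate. The configuration shape, host names, and `validate_backends` helper are all hypothetical; this is a minimal sketch of the idea, not our actual deployment tooling.

```python
# Hypothetical pre-deploy check: reject a load-balancer config whose
# backends point outside the expected Redis pool. Host names, config
# shape, and the EXPECTED_REDIS_POOL allowlist are illustrative only.
import sys

EXPECTED_REDIS_POOL = {
    "redis-cache-01.internal",
    "redis-cache-02.internal",
    "redis-cache-03.internal",
}

def validate_backends(config: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the change is safe."""
    errors = []
    for pool_name, backends in config.get("pools", {}).items():
        for host in backends:
            if host not in EXPECTED_REDIS_POOL:
                errors.append(
                    f"pool {pool_name!r}: backend {host!r} is not in the "
                    "expected Redis pool"
                )
    return errors

if __name__ == "__main__":
    # A change like the one in this incident: one backend routed to the
    # wrong host. The check fails before the config can be applied.
    proposed = {"pools": {"actions-cache": [
        "redis-cache-01.internal",
        "mysql-replica-07.internal",  # wrong host
    ]}}
    problems = validate_backends(proposed)
    for p in problems:
        print(f"ERROR: {p}", file=sys.stderr)
    sys.exit(1 if problems else 0)
```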
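
The summary also mentions updating the Redis client configuration in Actions to better tolerate brief cache interruptions. Below is a minimal sketch of that kind of hardening using the open-source redis-py client; the Actions client and its actual settings are not public, so the host name, timeouts, retry counts, and backoff values here are assumptions.

```python
# A minimal sketch of a Redis client hardened against brief interruptions,
# using redis-py. All values are illustrative assumptions, not the
# settings used by Actions.
from redis import Redis
from redis.backoff import ExponentialBackoff
from redis.exceptions import ConnectionError, TimeoutError
from redis.retry import Retry

client = Redis(
    host="redis-cache-01.internal",  # hypothetical host
    port=6379,
    socket_connect_timeout=1.0,  # fail fast on an unreachable host
    socket_timeout=1.0,          # bound per-command latency
    # Retry transient failures with capped exponential backoff instead of
    # surfacing every brief blip to the caller.
    retry=Retry(ExponentialBackoff(cap=0.5, base=0.05), retries=3),
    retry_on_error=[ConnectionError, TimeoutError],
    health_check_interval=30,    # proactively detect stale connections
)
```

The design intent is that short timeouts plus a small, bounded number of retries let a caller ride out a brief cache interruption without stacking unbounded latency behind a misbehaving backend.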