major impact GitHub ✓ Resolved

Disruption with Copilot Coding Agent Sessions

Started
2026-03-20 00:58 UTC
Resolved
2026-03-20 01:58 UTC
Duration
60 minutes
Date
2026-03-20

Incident Timeline

investigating 2026-03-20 00:58 UTC

We are investigating reports of impacted performance for some GitHub services.

investigating 2026-03-20 01:00 UTC

We are seeing widespread issues starting and viewing Copilot Agent sessions. We understand the cause and are working on remediation.

investigating 2026-03-20 01:26 UTC

We are rolling out our mitigation and are seeing recovery.

resolved 2026-03-20 01:58 UTC

On March 19, 2026, between 01:05 UTC and 02:52 UTC, and again on March 20, 2026, between 00:42 UTC and 01:58 UTC, the Copilot Coding Agent service was degraded and users were unable to start new Copilot Agent sessions or view existing ones. During the first incident, the average error rate was ~53% and peaked at ~93% of requests to the service. During the second incident, the average error rate was ~99% and peaked at ~100% of requests with significant retry amplification. Both incidents were caused by the same underlying system authentication issue that prevented the service from connecting to its backing datastore.

We mitigated each incident by rotating the affected credentials, which restored connectivity and returned error rates to normal. The mitigation time was 01:24. The second occurrence was due to an incomplete remediation of the first.

We are implementing automated monitoring for credential lifecycle events and improving operational processes to reduce our time to detection and mitigation of issues like this one in the future.
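The summary mentions significant retry amplification during the second incident. The report does not describe how clients retried, so the following is only a minimal sketch of one common way to limit amplification: capped exponential backoff with full jitter. The function name `backoff_delays` and the `base` and `cap` values are hypothetical and chosen for illustration; they are not taken from GitHub's systems.

```python
import random


def backoff_delays(max_attempts, base=0.5, cap=30.0, rng=None):
    """Return the wait (in seconds) before each retry attempt.

    Uses capped exponential backoff with full jitter: the upper bound
    doubles on every attempt up to `cap`, and the actual wait is drawn
    uniformly between 0 and that bound so that many clients failing at
    the same moment do not all retry in lockstep.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_attempts):
        # Upper bound for this attempt: base * 2^attempt, never above cap.
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))
    return delays


if __name__ == "__main__":
    # Hypothetical usage: print the waits a client would apply across 6 retries.
    for i, delay in enumerate(backoff_delays(6), start=1):
        print(f"retry {i}: wait {delay:.2f}s")
```

Spreading retries out this way keeps a failing dependency from receiving a synchronized burst of repeated requests while it recovers.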

โ† All GitHub incidents Other incidents on 2026-03-20