Event Processing Issues
Incident Report for DigitalOcean

The Incident

At 08:57 UTC on October 13th, 2017, our Support team began seeing errors when starting the Droplet console, attaching or detaching volumes, and completing login device verification. Our teams determined that these errors were the result of a failing auditing subsystem. Our engineers worked to restore normal operation with a patch ensuring that an internal auditing failure would no longer cause the entire operation to be rejected. At 11:30 UTC, the fix was deployed to our production systems, and the Support team then confirmed that the affected operations were working again.

Timeline of Events

08:57 UTC - Support team detects operation failures

09:40 UTC - We identify that the internal auditing subsystem is failing

10:30 UTC - We implement a patch to resolve the issue

11:30 UTC - Patch deployment is completed

11:33 UTC - Support team confirms affected actions are now operational

Future Measures

We are working on increasing the reliability of the aforementioned auditing subsystem. Meanwhile, we have developed a patch to ensure users’ requests will succeed regardless of the operational status of the auditing subsystem.
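
As an illustration of the behavior that patch is meant to guarantee, here is a minimal sketch of a "best effort" audit call. It is not our production code: the function name `run_with_best_effort_audit` and its parameters are hypothetical, and it assumes that a missed audit record is acceptable as long as the failure is logged for follow-up.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("audit")


def run_with_best_effort_audit(
    operation: Callable[[], T],
    record_audit: Callable[[str], None],
    action_name: str,
) -> T:
    """Run the user's operation, then try to audit it without failing the request."""
    # Errors raised by the operation itself still propagate to the caller.
    result = operation()
    try:
        record_audit(action_name)
    except Exception:
        # Assumption: losing one audit record is preferable to rejecting the request.
        logger.exception("audit write failed for %s; continuing", action_name)
    return result
```

With this shape, an outage in the auditing subsystem produces a logged error instead of a rejected console, volume, or login request, while a failure in the operation itself is still reported to the user.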

In Conclusion

We’re disappointed that we let a non-essential subsystem affect user operations, and we apologize for the inconvenience and frustration it has caused.

Posted Oct 18, 2017 - 20:16 UTC

Resolved
Our engineering team has resolved the issue impacting event processing. If you continue to experience issues with event processing, please open a ticket with our Support team and we will be happy to assist you.
Posted Oct 13, 2017 - 12:26 UTC
Monitoring
We have isolated the cause of the event processing issue and are currently monitoring the situation. Events should be processing as normal; if you're still experiencing issues, please open a ticket with our Support team.
Posted Oct 13, 2017 - 11:41 UTC
Update
Our engineering team is continuing to work through actions we believe will resolve the event processing issue. We appreciate your patience and will provide additional updates soon.
Posted Oct 13, 2017 - 11:22 UTC
Identified
Our engineering team has isolated the issue impacting event processing and is actively working to resolve it. We apologize for the issue and will provide an update soon.
Posted Oct 13, 2017 - 10:12 UTC
Update
Our engineering team continues to investigate the event processing issues, which are causing delays with console access, creates, and volume- and DNS-related events, as well as login issues during the device verification process. We will provide additional updates as more information becomes available.
Posted Oct 13, 2017 - 09:36 UTC
Update
Our engineering team is still investigating the event processing issues. During this time you may experience delays with console access, creates, and volume- and DNS-related events. You may also experience login issues during the device verification process. We will continue to update you, and we apologize for the inconvenience.
Posted Oct 13, 2017 - 08:33 UTC
Investigating
Our engineering team is actively investigating issues with event processing. During this time you may experience delays with creates, destroys, and power events. We will keep you updated as we learn more. We apologize for any inconvenience this causes.
Posted Oct 13, 2017 - 07:35 UTC
This incident affected: Regions (Global) and Services (Event Processing).