On the 8th and 9th of August, two separate but simultaneous issues with the DigitalOcean platform caused a number of channels for submitting support tickets to become unavailable:
1) Emailing of tickets through email@example.com
For a period of nearly 22 hours, emails to email@example.com did not correctly flow through to our ticketing system.
2) The support portal at https://cloudsupport.digitalocean.com
The portal experienced a significant performance degradation to the point where it was unusable, and for a period of three and a half hours it was redirected to https://www.digitalocean.com/company/contact, our publicly available contact form.
The combination of these issues made it difficult for users to open tickets and reach our support team in a timely manner.
Timeline of Events
19:32 UTC (August 8) - The inbox that DigitalOcean uses to process support emails reaches capacity, preventing all tickets emailed to email@example.com from reaching our third-party ticketing system
13:29 UTC (August 9) - First known customer report of the issue
16:44 UTC - Issue escalated by DigitalOcean support to our operations team
17:00 UTC - A separate problem is identified with our support portal, causing a significant performance degradation
17:27 UTC - DigitalOcean engineering begins investigation into the issue
18:08 UTC - DigitalOcean engineering formally identifies this as a major incident, a StatusPage is posted, and incident response begins
18:13 UTC - Our support portal is redirected to point at https://www.digitalocean.com/company/contact, our public support contact form
18:31 UTC - A deletion task is run against the inbox
18:53 UTC - The inbox is expanded to allow email to resume flowing, and the deletion task completes at around the same time
19:06 UTC - Tickets are confirmed flowing again
19:24 UTC - The backlog of email appears to resume flowing
19:50 UTC - StatusPage set to monitoring
20:23 UTC - cloudsupport.digitalocean.com restored to full availability, and the redirect is removed
20:25 UTC - StatusPage set to resolved, after Support confirms no ongoing issues
In mid-July of last year, DigitalOcean switched its primary support system from an in-house solution to a third-party solution. Along with this came a number of integration challenges and technical knowledge gaps. The email issue identified above was caused by the use of an unmonitored email inbox for programmatic flow of email. The engineering team responsible for this identified this as a regressive production email strategy long before this incident occurred. There is work in flight to replace the email inbox being used with an end-to-end system built for programmatic email flow that is highly observable. Once that work is complete, there should never be a mailbox capacity issue with submission of support tickets via email again.
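As an illustration of the kind of check an unmonitored inbox lacks, here is a minimal sketch of a capacity alert. It is not DigitalOcean's implementation: the MailboxUsage shape, the classify_usage function, and the 80% and 95% thresholds are all hypothetical, and the usage figures are assumed to come from whatever quota reporting the mail provider exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MailboxUsage:
    """Snapshot of a mailbox's storage usage (hypothetical shape)."""
    used_bytes: int
    quota_bytes: int


def classify_usage(usage: MailboxUsage,
                   warn_at: float = 0.80,
                   critical_at: float = 0.95) -> str:
    """Return "ok", "warning", or "critical" based on how full the mailbox is.

    The default thresholds are illustrative assumptions, not values from
    the incident report.
    """
    if usage.quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    ratio = usage.used_bytes / usage.quota_bytes
    if ratio >= critical_at:
        return "critical"
    if ratio >= warn_at:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Example: a 50 GiB mailbox with 48 GiB used is 96% full.
    sample = MailboxUsage(used_bytes=48 * 1024**3, quota_bytes=50 * 1024**3)
    print(classify_usage(sample))  # prints "critical"
```

Run on a schedule and wired to paging, a check along these lines would raise a warning well before the inbox stops accepting mail, rather than relying on customer reports to surface the problem.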
https://cloudsupport.digitalocean.com is hosted by a third-party vendor and therefore falls largely outside our operational control. This is well known to us as an issue that we would like to fix. A longer-term roadmap item for the engineering teams that work on our support experience is to replace cloudsupport.digitalocean.com, which is a frontend to the third-party vendor’s ticketing system, with a frontend that we can place under DigitalOcean’s operational control.
We know our users rely on us for responsive support, and we apologize for the inconveniences and delays this issue caused.