Around 20:15 UTC a core component of our network failed, taking down a significant part of our storage cluster and, with it, our platform. By 20:30 UTC things were back up.
As far as we can see, everything is back to performing as usual.
The impact seems to have been as follows:
- Hosting: some websites went down from ~20:15 to ~20:30 UTC, and may have performed more slowly than usual for a few minutes after that.
- Email: emails sent/received during that period might have been delayed.
- Cloud: some VPSs couldn't read from or write to their disks from ~20:15 to ~20:30 UTC.
While it is still not clear to us what caused the hardware failure behind this incident, we have identified a way to prevent the same impact should it happen again. We will implement this early tomorrow.
We will post an RFO (Reason For Outage) once we have fully investigated the incident and planned all the remediation steps.