CloudBees Status: Resolved: Database service outage

Friday, 21 March 2014

An issue at our infrastructure provider this evening disrupted slave-less CloudBees database clusters during the nightly database backup routine. Apps connecting to the affected databases were likely to experience database connection hangs and timeouts. Apps that are not coded with proper timeouts and retries may need to be restarted to re-establish their connections to the database.
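As a rough illustration of the timeouts-and-retries pattern mentioned above, here is a minimal Python sketch. The helper name `call_with_retries` and its parameters are hypothetical and not part of any CloudBees API. It assumes the wrapped operation opens its connection with a driver-level timeout, so that a hung connection raises an error instead of blocking forever, and that the driver's errors are (or have been mapped to) Python's built-in `ConnectionError` or `TimeoutError`.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run operation(); on a connection or timeout error, back off and retry.

    Hypothetical helper for illustration. `operation` is assumed to open a
    fresh database connection with its own connect/read timeout configured.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts:
                raise  # out of retries: surface the error to the caller
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

An app would pass in a function that connects and runs its query, so each retry starts from a new connection rather than reusing one that may be hung. The retry count and delays are placeholders to tune per application.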

This problem did not affect customers who are paying for dedicated databases with slave configurations or customers using our recently added ClearDB multi-tenant database clusters, which are configured with redundancy by default.

We had never encountered this problem before, and identifying and recovering from the underlying cause took longer than we aim for. We apologize for the downtime and are working to identify ways to avoid this specific problem and to improve our recovery time.