Production API outage
Incident Report for Medable
Postmortem
Cause

A routine update to our deployment layer caused network connectivity issues between nodes. As a result, services that were functioning normally appeared down to the load balancing and proxy services, which ultimately made the API inaccessible externally.

Resolution

After identifying the cause, we re-established internal connectivity between nodes, which restored the load balancing and proxy services.

Prevention

We have put new maintenance processes in place for the affected nodes. Preventive maintenance will first take place on non-production nodes, where updates can be tested and verified before they can affect production services. In production, these updates will then be applied during scheduled maintenance so that their impact can be closely monitored.

Posted Aug 29, 2017 - 22:14 UTC

Resolved
The outage has been resolved. All production services are operational. We will report back with an analysis of the situation shortly.
Posted Aug 29, 2017 - 13:35 UTC
Investigating
Medable is experiencing an outage in the production API environment. The issue is currently under investigation.
Posted Aug 29, 2017 - 10:35 UTC