On September 13th 2017, between 15:31 and 16:18 UTC, Voice API requests originating from the US and EMEA regions received HTTP 500 responses. The outage also affected inbound calls.
At 16:18 UTC a failover to one of our redundant data centers was performed, so that investigative and resolution work in the US data center could continue with minimal customer impact. Voice API services were run from the redundant data center until September 14th 2017 at 14:20 UTC. During this time customers might have experienced increased latency and occasional API request failures.
On September 14th 2017 at 14:20 UTC the issue was fully resolved with service re-established from the US data center.
Due to an increase in Voice traffic in the US data center, one of the database cluster nodes ran out of available memory. During the automatic traffic rebalancing process that followed, Voice API performance was impacted, resulting in HTTP 500 responses.
While the Voice API was run from the redundant data center, customers with IP whitelisting in place for Voice API callbacks, NCCO requests and audio file download requests would not have received these callbacks. The relevant IP addresses had not been published before the incident and would therefore not have been present in customer firewall rules.
A program of work has been identified that includes: