At 12:14 UTC on 10th October 2019, we began to experience a WhatsApp outage. Outbound WhatsApp messages failed with "Undeliverable" delivery receipts, and we stopped processing inbound messages, so they began to queue on WhatsApp's side. By 12:54 we had made changes that restored service for some of our users. The issue was fully resolved at 13:34: outbound WhatsApp messages began to be delivered again, and the inbound messages that had been queuing were processed and delivered.
Deleting the legacy WhatsApp cluster management infrastructure accidentally removed shared permission objects. Without these permission objects, our current WhatsApp cluster management stopped serving traffic.
We're implementing stricter processes that should prevent changes like this from causing outages in the future.