Issues with API (US Region)
Incident Report for CometChat
Postmortem

Around 12:35pm MST on January 27, 2021, some customers began experiencing occasional errors and increased latency while using CometChat. Around 12:45pm MST, errors increased rapidly and CometChat became unusable for most customers with apps hosted in our US region.

Around 12:45pm MST, we began migrating our customers to a separate database cluster. From that point, some customers started seeing improvements. By 1:35pm MST, the migration was complete and all customers were able to use CometChat again.

A root cause analysis revealed that our backup policies coincided with an infrastructure issue on our cloud vendor's end. As a result, our I/O operations were suspended for an extended period of time. Our cloud vendor's internal monitoring tools detected this behavior, which eventually caused the underlying hosts to be replaced. While the hosts were being replaced, a backlog of transactions built up, which ultimately led to the outage.

Our current priority is working alongside our cloud vendor and putting safeguards in place to prevent similar problems from happening again. We're truly sorry for the disruption.

Posted Jan 28, 2021 - 17:33 MST

Resolved
Our Engineering team has resolved the issue. If you continue to experience problems, please contact us. We apologize for any inconvenience.
Posted Jan 27, 2021 - 14:43 MST
Monitoring
Our Engineering team has implemented a fix to resolve the issue and is monitoring the situation. We will post an update as soon as the issue is fully resolved.
Posted Jan 27, 2021 - 13:35 MST
Identified
Our Engineering team has identified the cause of the issue. We will post an update as soon as additional information is available.
Posted Jan 27, 2021 - 12:35 MST
This incident affected: CometChat v2 (Client API (US)) and CometChat APIs (Rest API (US)).