Connection to cache service failed

Incident Report for Visma Cloud Services

Postmortem

Root cause

After analyzing both issues, we found the root cause. One of the API servers appeared to be in an unhealthy state and used a low CPU. Unfortunately, the load balancing system sent the customer requests to the unhealthy server.

Impact: 

In both incidents, some of our API customers experienced instability in our system.

Actions: 

  • We increased the API server capacity
  • Changed the load balancer configuration. 
  • Added extra alerts to be able to act more quickly

Current status:

We are continuously monitoring the servers now and they appear to be healthy since the changes

Posted Mar 29, 2023 - 13:57 CEST

Resolved

Two incidents happened on 16th March and 27th March for the same reason. Several of our API customers were affected. They are getting a "Connection to cache service failed" error. These incidents happened due to a problem in load balancing.
Posted Mar 27, 2023 - 14:30 CEST