On Friday, September 11th 2020 at 10:00, several of our customers experienced timeout errors. A server problem had occurred during the night, and the root cause was that one of our IIS server nodes was down and could not handle requests. The node's health check reported incorrect values, indicating that the node was up and running even though it was not. Once we identified the root cause, we resolved the issue by restarting the server. The incident was closed at 15:56.
To prevent this from happening again, we have created tasks to improve the health checks of the nodes.