Info: Report on the outage of 13/05/2020
Dear customers,
Since we strive to maintain open communication with our customers, we owe you a short report on last week's outage.
On Tuesday 12/05/2020, at the end of the day, we saw an issue with a card in our POP in Germany on our European fiber ring. Luckily, all traffic was rerouted via the other side of the ring. One of our network engineers left for Germany early the next morning, Wednesday 13/05/2020, to replace the defective card.
Between Tuesday night and Wednesday morning, 13/05/2020, Eurofiber carried out planned fiber maintenance in Antwerp. To be clear, our fibers were not involved in this maintenance. But at 5h30 a Eurofiber engineer accidentally jammed one of our fiber cables in a cabinet door. This caused a second cut/outage on our fiber ring. This double cut/outage disconnected our Belgian network from our locations in Amsterdam and Germany. As a big part of our traffic is transmitted to Amsterdam and Frankfurt, we no longer had enough capacity in Belgium to handle peak traffic.
At 9h06, the Eurofiber engineers located the issue with our jammed fiber and solved it. At that moment we were back online, but issues occurred when we started to rebuild our (fiber) wavelengths towards Amsterdam and Germany. As traffic grew, the internal links inside our Belgian network became saturated. This caused time-outs to websites for many of our DSL customers.
Our network engineer arrived at our German site at 12h20. The card was replaced within half an hour. At 13h01 we were able to get the wavelengths up and running as they should, and our European ring was fully redundant again. A little over an hour later, at 14h15, our BGP communications with the outside world went back to normal, after having been disturbed by the large number of attempts to get the fiber wavelengths back up.
What will we do to prevent this in the future?
We will install a backup for the out-of-band management interface that manages our traffic transport equipment, so that even when there are 2 cuts on the ring, we can still reach this equipment from outside our own network. This will result in a shorter repair time and a better overview in case of issues like this.
Once again, we apologize for any inconvenience caused.