Data Center Equipment Failure: Cost and Prevention Tips
Last summer, Sears filed a lawsuit against two companies. The big-box retailer said that a data center failure took its e-commerce website offline for an extended period, costing it a considerable amount of money.
Crain’s reported that Sears blamed the equipment manufacturer and the data center maintenance firm for the $1.58 million it lost in consumer revenues. The first outage lasted five hours, and fixing the underlying facility issue took even longer, leaving the data center running on its backup generators for more than a week.
The outage was caused by a failed uninterruptible power supply (UPS) in the data center. While only one of the four UPSs stopped working in the first outage, three of the four failed a few days later, causing a second service interruption that cost Sears $630,000. The company also had to pay rental fees for a replacement backup generator after the first one failed.
All told, the retailer lost roughly $2.2 million in revenues across the two data center outages, and repairs cost another $2.8 million. This case is just one of many illustrating the problems that follow a data center outage. Not only did the data center operator disrupt the business processes of its client, but the organization now faces legal action.
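As a quick check on the figures reported above, the losses can be tallied in a few lines. This is only a sketch using the dollar amounts quoted in this article; the variable names are illustrative, and the generator rental fees are left out because no amount is given.

```python
# Figures as quoted in this article, in US dollars.
first_outage_loss = 1_580_000   # revenue lost in the first outage
second_outage_loss = 630_000    # revenue lost in the second outage
repair_cost = 2_800_000         # reported cost of repairs

# Generator rental fees are excluded: the article gives no amount for them.
lost_revenue = first_outage_loss + second_outage_loss
total_impact = lost_revenue + repair_cost

print(f"Lost revenue: ${lost_revenue:,}")
print(f"Lost revenue plus repairs: ${total_impact:,}")
```

On these numbers, the lost revenue comes to about $2.21 million and the combined impact to about $5 million before any rental or legal costs.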
It is critical that data center operators work to bolster their facility reliability and come as close to 100 percent uptime as they can.
How Can Operators Prevent Outages?
According to data center research from the Ponemon Institute, three of the most common causes of data center outages are cyberattacks, human error, and UPS failure, as happened at the facility serving Sears. However, there are ways that facility managers can mitigate the risks associated with these three pain points.
When it comes to UPS equipment failure, redundancy is key. Individual components will inevitably fail in many data centers, so operators should allocate resources to redundant equipment that keeps a single failure from turning into an outage. Although some operators may balk at this capital investment, it pales in comparison to what the company would have to spend in the event of an outage.
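To see why a spare unit matters, consider a rough availability estimate. The sketch below assumes a hypothetical setup in which the load needs three working UPS units, each unit is available 99 percent of the time, and units fail independently of one another. None of these numbers come from the Sears case, and the function name is illustrative.

```python
from math import comb

def k_of_n_availability(n: int, k: int, unit_availability: float) -> float:
    """Probability that at least k of n independent units are working."""
    a = unit_availability
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k, n + 1))

# Assumed values for illustration only.
UNIT_AVAILABILITY = 0.99   # hypothetical per-UPS availability
UNITS_NEEDED = 3           # hypothetical number of units the load requires

no_spare = k_of_n_availability(3, UNITS_NEEDED, UNIT_AVAILABILITY)   # N
one_spare = k_of_n_availability(4, UNITS_NEEDED, UNIT_AVAILABILITY)  # N+1

print(f"No spare (3 of 3):  {no_spare:.4%}")
print(f"One spare (3 of 4): {one_spare:.4%}")
```

Under these assumptions, one spare unit raises availability from roughly 97.03 percent to roughly 99.94 percent. The independence assumption is optimistic: when several units fail from a shared cause, as when three of four UPSs failed within days in the case above, the real benefit of redundancy is smaller than this estimate suggests.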
Human error, another main cause of service disruptions, can be reduced through increased data center automation. This approach takes a significant amount of human interaction out of the equation, so that routine actions are carried out by equipment that responds the same way every time.
Lastly, malicious attacks, particularly distributed denial-of-service (DDoS) attacks, can result in a data center outage. Operators can reduce this risk by strengthening security measures within the data center, paying special attention to protecting the client experience.
Harnessing A Modular Approach
While these are the top three causes of outages, they are not the only issues that can result in service disruption. Sometimes, the facility simply does not have the internal capacity to support increasing customer needs. As systems struggle to direct and process rising data traffic levels, servers can overheat or simply fail.
Data center operators should also plan better for growth in internal traffic, and a modular data center approach can help in this regard. Instead of investing in new hardware as well as a new brick-and-mortar structure to house it, the data center manager can use a modular data center to save on real estate and operating costs.
A modular data center can also be added to a data center campus to provide extra capacity for service spikes and failover redundancy. If the primary facility experiences an outage, the most important traffic can be migrated to the modular data center, keeping critical processes running. Combined with a data center sustainability plan, a modular data center can be a worthwhile investment.
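The idea of migrating the most important traffic first can be sketched as a simple priority-based plan. Everything in this example is hypothetical: the service names, priorities, load figures, and the plan_failover function are invented for illustration and do not describe any particular product or facility.

```python
from dataclasses import dataclass

@dataclass
class Service:
    name: str
    priority: int    # lower number = more important
    load_kw: float   # capacity the service needs after migration

def plan_failover(services: list[Service], spare_capacity_kw: float) -> list[str]:
    """Pick services to migrate, most important first, while capacity lasts."""
    migrated = []
    remaining = spare_capacity_kw
    for svc in sorted(services, key=lambda s: s.priority):
        if svc.load_kw <= remaining:
            migrated.append(svc.name)
            remaining -= svc.load_kw
    return migrated

# Hypothetical workloads and a hypothetical 80 kW modular unit.
services = [
    Service("analytics", priority=3, load_kw=30.0),
    Service("checkout", priority=1, load_kw=40.0),
    Service("search", priority=2, load_kw=50.0),
]
plan = plan_failover(services, spare_capacity_kw=80.0)
print(plan)
```

In this example the checkout service moves first, the search service is skipped because it no longer fits, and analytics fills the remaining capacity. A real plan would also need to account for network, storage, and data replication, which this sketch ignores.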