Much like any war, managing data centers is filled with long stretches of boredom punctuated by a few moments of sheer terror. To make responding to unexpected events inside the data center more routine, IT organizations need an incident response plan that becomes almost second nature to carry out.
At the Data Center World conference in National Harbor, Maryland, this September, Greg Ramsey, a strategist who works within Dell data center environments, will talk about the importance of a disciplined approach to managing IT incidents. It is a classic case where an ounce of prevention is worth much more than any proverbial pound of IT cure.
For example, Ramsey said, keeping track of which systems have been affected by a particular issue makes it simpler for the IT organization to identify patterns that not only help resolve issues faster, but in many cases prevent the issue from occurring a second or third time altogether. In addition, while unavoidable issues may still occur, having an incident response plan is critical to making sure the individual with the right IT skill is available to address the problem.
In a world where IT systems are more distributed than ever, Ramsey noted, it can be difficult to fix an issue if the organization doesn’t know who has the required IT expertise and, just as importantly, where those people are physically located in relation to the actual problem.
“It’s really about getting the right ticket to the right person as fast as possible,” he said. “You need to be able to get somebody to fix the problem where it geographically lies.”
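As a rough illustration of that routing idea, the sketch below picks an engineer for a ticket by required skill first and by physical site second. It is only a minimal example under stated assumptions: the `Engineer` record, the `route_ticket` function, and the sample names and sites are all hypothetical and do not describe any system Ramsey or Dell uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engineer:
    # Hypothetical roster record: who they are, what they know, where they sit.
    name: str
    skills: frozenset
    site: str


def route_ticket(required_skill, incident_site, engineers):
    """Return an engineer who has the skill, preferring one at the incident site.

    Returns None when nobody on the roster has the required skill.
    """
    qualified = [e for e in engineers if required_skill in e.skills]
    if not qualified:
        return None
    # False sorts before True, so engineers at the incident site come first;
    # the name is a tie-breaker that keeps the result deterministic.
    return min(qualified, key=lambda e: (e.site != incident_site, e.name))


# Illustrative roster only; names, skills, and sites are made up.
ENGINEERS = [
    Engineer("alice", frozenset({"storage", "network"}), "austin"),
    Engineer("bob", frozenset({"storage"}), "dublin"),
    Engineer("carol", frozenset({"network"}), "dublin"),
]

if __name__ == "__main__":
    chosen = route_ticket("storage", "dublin", ENGINEERS)
    print(chosen.name if chosen else "no qualified engineer")
```

A real roster would also need on-call schedules and current workload, which this sketch leaves out.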
Having an incident response plan, Ramsey added, is also the key to any use of IT automation to solve a problem. While someone can write a script to automatically fix an issue, the core information needed to apply that script in a way that does more good than harm usually comes from the knowledge base that forms the foundation of any incident response system.
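One way to picture the relationship between a fix script and the knowledge base is a lookup that decides whether automation should run at all. The sketch below is an assumption-laden example, not a description of any real product: the `KNOWLEDGE_BASE` entries, the field names, and the `decide_action` function are hypothetical.

```python
# Hypothetical knowledge base: for each known issue, which script (if any)
# may be run automatically and how many automatic attempts are allowed.
KNOWLEDGE_BASE = {
    "disk_full": {"script": "rotate_logs", "safe_to_automate": True, "max_auto_runs": 2},
    "fan_failure": {"script": None, "safe_to_automate": False, "max_auto_runs": 0},
}


def decide_action(issue, prior_auto_runs, kb=KNOWLEDGE_BASE):
    """Return ("run", script_name) or ("escalate", reason) for an incident."""
    entry = kb.get(issue)
    if entry is None:
        # Nothing is known about this issue, so automation has no basis to act.
        return ("escalate", "unknown issue")
    if not entry["safe_to_automate"]:
        return ("escalate", "requires a person")
    if prior_auto_runs >= entry["max_auto_runs"]:
        # The script has already been tried; repeating it may do more harm than good.
        return ("escalate", "automatic fix already tried")
    return ("run", entry["script"])


if __name__ == "__main__":
    print(decide_action("disk_full", prior_auto_runs=0))
    print(decide_action("fan_failure", prior_auto_runs=0))
```

The point of the design is that the script itself stays simple, while the judgment about when it is safe to run lives in data the whole team can review and update.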
All told, identifying the root cause of any IT problem is only the first step toward remediation. The key to data center management success is making sure the knowledge of how to identify and resolve that problem becomes part of the institutional memory of the entire IT organization.
For more information, sign up for Data Center World National Harbor, which will convene in National Harbor, Maryland, on September 20-23, 2015, and attend Greg Ramsey’s session titled “Automated Incident Remediation—React to Issues Faster Than a Human!”