Over the last 30 years, information technology has evolved from supporting the business, to aligning with the business, to being the business. When your applications, email or websites stop working, your business stops, too.
Those outages quickly get expensive, whether you’re talking about “just” lost sales, regulatory fines, or damage to your brand and reputation. A recent survey of IT professionals pegged the average cost of unplanned downtime at $8,662 per minute, and the IT Process Institute reports an average incident length of 90 minutes.
Of that total mean time to repair (MTTR), Forrester Research says as much as 70 percent is made up of mean time to know (MTTK). That’s how long it takes to engage all responders and ensure that the proper stakeholders have been informed and are able to start collaborating. In fact, we’ve talked to organizations that can take 45 minutes just to get the right people on a conference call.
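To put those figures side by side, here is a rough back-of-the-envelope sketch. It simply multiplies the numbers quoted above, which come from different sources, so treat the results as an illustration of scale rather than a measured cost. The function and constant names are our own.

```python
# Figures quoted in the text above; combining them is illustrative only.
COST_PER_MINUTE_USD = 8_662   # average cost of unplanned downtime
INCIDENT_MINUTES = 90         # average incident length (MTTR)
MTTK_SHARE_PERCENT = 70       # upper bound on MTTK's share of MTTR


def downtime_cost(minutes: int, cost_per_minute: int = COST_PER_MINUTE_USD) -> int:
    """Cost in USD of an outage lasting `minutes`."""
    return minutes * cost_per_minute


# Integer arithmetic avoids floating-point rounding on the 70 percent share.
mttk_minutes = INCIDENT_MINUTES * MTTK_SHARE_PERCENT // 100

print(f"Whole incident ({INCIDENT_MINUTES} min): ${downtime_cost(INCIDENT_MINUTES):,}")
print(f"MTTK portion ({mttk_minutes} min): ${downtime_cost(mttk_minutes):,}")
print(f"45-minute wait for a conference call: ${downtime_cost(45):,}")
# Whole incident (90 min): $779,580
# MTTK portion (63 min): $545,706
# 45-minute wait for a conference call: $389,790
```

On these assumptions, up to 63 of the 90 minutes go to getting people informed and engaged rather than to fixing the problem.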
Here are a few of the roadblocks that are relatively easy to tackle. Clearing them will put you on the road to faster, more efficient and more effective incident response.
When every team member is constantly barraged with the same alerts that go to everyone else, it’s all too easy for them to tune out the important ones.
- Solution: Hyper-target alerts based on who is responsible for the affected systems or applications, which skills are needed, and who’s available (on call or in the office). Use an IT alerting system that integrates with your IT stack or ITSM system and your on-call schedules so you can automate who gets which alerts, using rules-based criteria and customizable templates.
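The targeting rules above can be sketched in a few lines. This is a minimal illustration, not any particular alerting product's API: the `Responder` and `Alert` classes, the field names, and the sample people are all hypothetical, and a real system would pull this data from your ITSM tool and on-call schedule.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Responder:
    name: str
    systems: frozenset[str]  # systems this person is responsible for
    skills: frozenset[str]
    on_call: bool            # availability, e.g. from the on-call schedule


@dataclass(frozen=True)
class Alert:
    system: str
    required_skills: frozenset[str]


def target_recipients(alert: Alert, responders: list[Responder]) -> list[Responder]:
    """Keep only available responders who own the system and have every required skill."""
    return [
        r
        for r in responders
        if r.on_call
        and alert.system in r.systems
        and alert.required_skills <= r.skills  # subset test
    ]


# Hypothetical team data.
team = [
    Responder("Asha", frozenset({"payments-db"}), frozenset({"postgres"}), on_call=True),
    Responder("Ben", frozenset({"payments-db"}), frozenset({"postgres"}), on_call=False),
    Responder("Chen", frozenset({"web-frontend"}), frozenset({"nginx"}), on_call=True),
]
alert = Alert("payments-db", frozenset({"postgres"}))

print([r.name for r in target_recipients(alert, team)])
# ['Asha']
```

Only one of the three people is paged, so the other two never learn to ignore an alert that was not meant for them.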
No Process, No Results
Do you have a formal incident response plan that lays out who needs to do what for any type of incident? If not, you’re guaranteeing confusion and wasted time and effort in the middle of future emergencies.
- Solution: Invest the time to create and communicate a response plan that includes (among other things):
- Which individuals or teams need to be contacted for various types of incidents. For example, if a certain pattern of issues arises, do you need your middleware team or your database administrators on the case?
- Which individuals or teams should be contacted during the day, overnight or on weekends?
- The right channels to ensure the right people are contacted at different times of the day. For example, if it’s the middle of the night in India or the Philippines, you’ll need a phone call rather than an email to be sure the message gets through.
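One way to keep such a plan usable in an emergency is to write it down as data that a script can read. The sketch below is a hypothetical example: the incident type names and the 08:00 to 20:00 "email is fine" window are assumptions made for illustration, and your own plan will have more rules.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical plan: incident type -> teams to contact.
TEAMS_BY_INCIDENT = {
    "message-queue-backlog": ["middleware"],
    "slow-queries": ["database-administrators"],
}

# Fixed UTC offsets; neither country observes daylight saving time.
UTC_OFFSETS = {
    "India": timezone(timedelta(hours=5, minutes=30)),
    "Philippines": timezone(timedelta(hours=8)),
}


def choose_channel(now_utc: datetime, location: str) -> str:
    """Email during assumed local working hours (08:00-20:00), phone call otherwise."""
    local_hour = now_utc.astimezone(UTC_OFFSETS[location]).hour
    return "email" if 8 <= local_hour < 20 else "phone"


incident_time = datetime(2024, 1, 15, 20, 0, tzinfo=timezone.utc)  # 20:00 UTC

print(TEAMS_BY_INCIDENT["slow-queries"])
print(choose_channel(incident_time, "India"))        # 01:30 local time
print(choose_channel(incident_time, "Philippines"))  # 04:00 local time
# ['database-administrators']
# phone
# phone
```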
Global Communications Woes
When your support teams “follow the sun” across time zones and geography, your emergency communications must be truly global as well. But in some countries, text messages are exorbitantly expensive or may show up as spam, or local weather or power outages may block emergency messages.
- Solution: Understand the power, telecommunications, and other infrastructure on which your global teams rely and eliminate known barriers to effective communications. Ensure your international communications support global delivery with local call routing, support for local caller ID, dedicated long and short codes for improved international SMS delivery and prompts in multiple languages. Optimize mobile support apps for users with limited or expensive cellular connectivity, and develop service relationships with local telecom providers to ensure effective delivery of emergency messages. A multi-modal approach that uses several communication channels also improves the odds that a message gets through.
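The multi-channel idea can be pictured as a simple fallback loop: try each channel in order and stop at the first one that confirms delivery. The sketch below uses simulated senders, and the channel names and the `deliver_with_fallback` function are hypothetical. A real platform would call a telecom or push provider here.

```python
from __future__ import annotations

from typing import Callable

# A sender takes (recipient, message) and reports whether delivery was confirmed.
Sender = Callable[[str, str], bool]


def deliver_with_fallback(
    recipient: str, message: str, channels: list[tuple[str, Sender]]
) -> str | None:
    """Try each channel in order; return the first that confirms delivery, else None."""
    for name, send in channels:
        try:
            if send(recipient, message):
                return name
        except ConnectionError:
            continue  # channel unreachable, move on to the next one
    return None


# Simulated senders standing in for real provider integrations.
def sms_blocked(recipient: str, message: str) -> bool:
    raise ConnectionError("carrier filtered the message as spam")


def push_unconfirmed(recipient: str, message: str) -> bool:
    return False  # sent, but the device never acknowledged it


def voice_call(recipient: str, message: str) -> bool:
    return True  # responder picked up


channels = [("sms", sms_blocked), ("push", push_unconfirmed), ("voice", voice_call)]
print(deliver_with_fallback("on-call-dba", "Database latency alert", channels))
# voice
```

The order of the list is itself a policy decision: where text messages are expensive or unreliable, you might put the voice call first.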
Conference Bridge Chaos
We’ve all been there: Invites that don’t include the right log-in information or scrambling to find a pen and paper to scribble an access code in the middle of an emergency. All this hassle wastes too much time when every second counts.
- Solution: Make one-click conference bridge functionality a “must have” in your messaging platform to ensure that all the right team members can join a call directly from an alert.
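At its simplest, "one-click" means the alert itself carries a link with the meeting details already filled in, so nobody has to copy an access code by hand. The sketch below shows the idea. The bridge URL and the parameter names are placeholders, not a real conferencing service's format.

```python
from urllib.parse import urlencode

BRIDGE_URL = "https://bridge.example.com/join"  # hypothetical bridge endpoint


def build_alert(summary: str, meeting_id: str, access_code: str) -> str:
    """Return alert text whose link embeds the meeting ID and access code."""
    query = urlencode({"meeting": meeting_id, "code": access_code})
    return f"{summary}\nJoin the bridge: {BRIDGE_URL}?{query}"


print(build_alert("Payments API is down", "INC-1042", "739201"))
# Payments API is down
# Join the bridge: https://bridge.example.com/join?meeting=INC-1042&code=739201
```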
These are only a few of the preventable problems that can lengthen the mean time to know, and thus the mean time to repair, when security and operational problems threaten your business. Download this white paper to learn more ways to eliminate the communication bottlenecks that can cripple your IT incident response.