Major Incident Handling Process Design – Part Two

This post is part two (and the concluding part) of a series where we discuss the Major Incident Handling process and how to put one together. Previously we discussed the elements and considerations that should go into the process design. In this post, I elaborate on some of those considerations with a sample process flow and a corresponding process design.

Sample Major Incident Handling Process Flow

Sample Major Incident Handling Process Design

A major incident generally has a higher impact and requires special attention to resolve. To summarize, I think an effective Major Incident Handling process design should clearly define at least the following who-does-what-by-when-and-how elements:

  • What constitutes a major incident in your organization? What criteria do you use to quickly and effectively determine and declare a major incident?
  • Who is accountable for coordinating and controlling the activities during a major incident exercise? The Major Incident Manager role can be fulfilled by a person or by a team, and whoever fills it needs the proper authority to direct the activities and the people involved.
  • How will the resolution efforts be coordinated and conducted? The exact details may vary from one organization to another, or even from one incident to another. The general approach should be worked out beforehand, and the Major Incident Manager should be trained to apply it as consistently as possible.
  • What escalation or communication approach will be used during and after the Major Incident?
  • What metrics will be used to measure the effectiveness of the process? Keep them simple, easily understood, and reasonably painless to collect.
  • What format of communication and reporting will be used for the major incident? Who will get what type of information? Try to keep the contents appropriate for the intended audience.
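
To make the first question concrete, here is a minimal sketch of how declaration criteria could be written down so they can be applied quickly and consistently. It assumes the impact-and-urgency approach discussed in part one of this series; the `Level` scale, the `Incident` fields, and the `ALWAYS_MAJOR_SERVICES` list are hypothetical names chosen for illustration, not something taken from a particular tool.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Illustrative three-point scale; your organization may use a different one."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Incident:
    summary: str
    impact: Level
    urgency: Level
    service: str


# Hypothetical services whose outage is treated as major automatically.
ALWAYS_MAJOR_SERVICES = frozenset({"payroll", "order-entry"})


def is_major(incident: Incident) -> bool:
    """Return True when the incident meets the illustrative major-incident criteria."""
    # An outage of a designated service triggers a major incident on its own.
    if incident.service in ALWAYS_MAJOR_SERVICES:
        return True
    # Otherwise require both high impact and high urgency.
    return incident.impact == Level.HIGH and incident.urgency == Level.HIGH
```

The point of writing the criteria down this plainly is that whoever takes the call can apply them without debate; the actual thresholds and service list are for your organization to agree on.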

I hope the information presented so far has been helpful. Please feel free to suggest options or other approaches that have worked for your organization.


Fresh Links Sundae

Fresh Links Sundae encapsulates some pieces of information I have come across during the past week. They may be ITSM related or not entirely. Often they are from the people whose work I admire, and I hope you will find something of value.

While ITIL is still the framework of choice for many infrastructure and operations organizations, Stephen Mann suggested that ITIL plus another framework or strategy is needed. USMBOK may be the second framework used in conjunction with ITIL. It’s Time To Realize That “ITIL Is Not The Only Fruit” (Forrester Blogs » Stephen Mann)

The CEO of a large retail chain visited Bob Sutton’s Stanford class and talked about customer service warning signs that can be hard to detect from operational statistics alone. Perhaps we in IT also need to develop similar metrics for our customer service efforts? Greetings and Bathrooms: One CEO’s Metrics for Retail Stores (Bob Sutton)

Daniel Burrus discussed some approaches to transform and save the U.S. manufacturing sector. I believe all of the suggested approaches can also be used for the continual improvement effort in IT. How to Save the Manufacturing Sector (Strategic Insights Blog)

Jeff Wayman suggested five common areas where a Service Desk could automate, in turn increasing productivity for both the customers and the staff. Free Weekends for the 24-Hour Help Desk (ITSM Lens)

Marshall Goldsmith talked about how acting on a commitment to quality speaks much louder than common buzzwords such as “empowerment” or “customer delight.” Putting Quality on the Line (Marshall Goldsmith)

Gina Smith discussed how, amid all the breathless coverage of the consumerization trend, one recent study may be debunking some of the myths. Five consumerization of IT myths debunked, maybe (TechRepublic)

Something Seth Godin said in this blog post got me thinking: the same can be said for IT. We have been operating IT a certain way for a long time. Maybe it is time to try some fresh thinking or different approaches. The map has been replaced by the compass (Seth Godin’s Blog)

Robert “Transformed” Stroud discussed how the business’s increasing involvement in selecting and using information technologies will force transformation upon IT. IT will transform or be transformed (CA on Service Management)

Charles Betz discussed how using enterprise architecture techniques can help to develop an understanding of an IT management system and how it needs to evolve. Too Many Tools: Integrating a New IT Management System (Charles Betz)

Because this is the Academy Awards weekend, Wired Magazine gave some instructions on how to throw a geek Oscars party. Throw a Geek Oscars Party (Wired)

Credit: Image Courtesy of Wikipedia

Major Incident Handling Process Design – Part One

In IT, incidents resulting from technology failure or human error can strike at any moment. Occasionally, we have an incident that has a wide impact and poses serious risks to business operations. Those major incidents need to be handled swiftly, so that the IT service can be restored quickly and useful information is captured for the root cause analysis afterward. If you have business-critical services or applications under your management, having an organized approach to handling major incidents can save a lot of time and improve productivity. If you need to put a process together for your organization, here are some elements to take into consideration.

  1. Scope and Criteria: What characteristics would qualify an incident as a “Major” Incident? This is very organization-specific, but generally there are two basic elements to consider: impact and urgency. Many organizations use the combination of those two elements to determine the priority level assigned to an incident, and that is a good starting point. Any incident with a high degree of impact and a high degree of urgency should probably be considered “major” and get the utmost attention. You may have other characteristics you want to define. For example, the outage of a particular application or of a particular line of business may trigger a “major” incident automatically. Since mobilizing the people and logistics necessary to handle a major incident is never a trivial exercise, clearly defined and agreed-upon scope and criteria are mandatory.
  2. Roles and Responsibilities: Who will declare that a major incident is in motion and own the process execution end-to-end? Since we are talking about major incidents, the Incident Management process owner in your organization will likely own this process as well. Will you have a person or a team designated as the “Major Incident Manager”? Will you rotate the role from individual to individual or from team to team? Depending on the nature of the technology failure or breakdown, how will the major incident manager find the appropriate technical resources to get involved? Will the major incident manager be someone who is on standby waiting for the occasion to spring into action, or will she have another “day job” and wear the major incident manager hat when necessary? This will again depend on how your organization feels about this role. One thing I am certain of is that this role requires someone with the appropriate skills, environment know-how, and leadership experience to pull people together and execute the agreed-upon process. In other words, I do not believe this is a simple service desk phone dispatch type of role.
  3. Logistics and Facilities: Everyone needs to know exactly what to do when the major incident process gets initiated. Will you have a dedicated meeting space or a war room type of setup? Will people know what teleconference number to call in order to provide or receive updates? Will you have a separate teleconference number for working through the technology aspects of incident recovery without cluttering the call with non-technical discussions? Who will manage the conference call? What criteria determine when the conference calls start and end? In addition to the conference call, will you hold some kind of web meeting or online collaboration setup where people can share things on screen? Will you have some type of continual update via web or email, so people can stay informed? All these finer details should be planned up front.
  4. Escalation and Communication: How will you define the communication interval, and who will receive what communication at what point in time? How will the incident be escalated up the chain of command as long as the incident remains open? For example, you might define something as simple as the following:
    1. At Hour 0: Major incident declared and the technical team contacted by phone. Director of the technical team and VP of IT notified via email.
    2. At Minute 30: Director of the technical team notified again via email with updates.
    3. At Hour 1: Major Incident Manager asks the Director of the technical team to join the conference in person. Another email update goes to the VP of IT.
    4. At Hour 2: Major Incident Manager asks the VP of IT to join the conference call for updates.
    5. At Hour 4: Major Incident Manager asks the business customer to join the conference call for updates and to discuss other recovery options.
  5. Other Considerations: How will this process connect with a downstream process such as Problem Management? Will you have the problem manager on the call as the incident progresses? What documentation or deliverables will the major incident process produce? At a minimum, you should probably document a simple log of the incident chronology, who participated in the call and when, important details shared at various points of the incident, official updates communicated, reasons for the incident closure, and other pertinent information about the incident.
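
The sample escalation timeline in item 4 lends itself to being captured as data rather than tribal knowledge. Below is a minimal sketch that assumes the five sample checkpoints listed above; `EscalationStep`, `ESCALATION_PLAN`, `steps_due`, and `next_step` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass(frozen=True)
class EscalationStep:
    after: timedelta  # time elapsed since the major incident was declared
    action: str       # who gets contacted and how


# The sample schedule from item 4, expressed as data (ordered by time).
ESCALATION_PLAN: List[EscalationStep] = [
    EscalationStep(timedelta(0), "Declare major incident; phone technical team; email Director and VP of IT"),
    EscalationStep(timedelta(minutes=30), "Email update to Director of the technical team"),
    EscalationStep(timedelta(hours=1), "Ask Director to join the conference; email update to VP of IT"),
    EscalationStep(timedelta(hours=2), "Ask VP of IT to join the conference call"),
    EscalationStep(timedelta(hours=4), "Ask business customer to join the call and discuss recovery options"),
]


def steps_due(elapsed: timedelta, plan: List[EscalationStep] = ESCALATION_PLAN) -> List[EscalationStep]:
    """Return every step whose trigger time has been reached for an open incident."""
    return [step for step in plan if step.after <= elapsed]


def next_step(elapsed: timedelta, plan: List[EscalationStep] = ESCALATION_PLAN) -> Optional[EscalationStep]:
    """Return the next upcoming step, or None once the plan is exhausted."""
    for step in plan:
        if step.after > elapsed:
            return step
    return None
```

For example, 45 minutes into an incident, `steps_due(timedelta(minutes=45))` returns the first two steps and `next_step(timedelta(minutes=45))` returns the Hour 1 step, so the Major Incident Manager always knows what has been done and what comes next.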

One thing is for sure: all these considerations are too important not to be agreed upon beforehand. When the agreed-upon details are not in place, it is simply not productive for everyone involved to try to figure out the process details in the heat of the battle. When that happens, most people have a tendency to go into “headless chicken” mode, and responsibility-dodging and finger-pointing start shortly afterward. In the next post, I will provide a sample process flow for further discussion.


Elevating the Service Desk to Handle Business IT Initiatives

Last week, I wrote a post discussing a potential scenario where the service desk acts as the focal point of IT services provisioning, communication, and support. I went through a scenario where there were many opportunities to provide a good customer experience that makes interacting with IT a solid experience all around. It may sound reasonable and easy to do, but I know that such interaction can be hard to accomplish for many IT shops for a variety of reasons. It may be hard, but I maintain there is really no excuse for a Service Desk not to do its best to deliver the best customer experience possible in its organization.

Well, those interactions with the end-user are the relatively easy stuff. There are many IT-provided services that are much more complex, and they are also some of the most frustrating interactions to deal with. Take an in-house application implementation effort, for example. A business team wants to introduce an application into the corporation and is looking to IT for help. IT first asks for a requirements document to be submitted, so it can get a better understanding of what the business customer is looking for. Due to the complexity of the application, a number of considerations have to be taken into account, such as hosting arrangements, network capacity, information security, audit requirements, and integration with other systems, just to name a few. After many rounds of exchanging information, questions and answers, meetings, and working with seemingly dozens of different teams in IT, the business customer cannot help but feel confused and dazed by the myriad of processes and paperwork to work through.

We all know corporate policies and processes often don’t make working with IT streamlined and easy. That does not mean we in IT cannot do what we can to make the interactions more productive for all involved. While many Service Desks deal with only the end-user-to-IT activities, I believe the business-to-IT interaction is something where the Service Desk can also contribute to delivering a great customer experience. Using the same application implementation example, I put together a potential scenario where the business team and Service Desk work together to make things go much more smoothly.

Scenario: Business to IT – Application Implementation

Many Service Desks today are set up to provide assistance to individual end-users only and will need a different set of staff and skill sets to work on business initiatives. It seems logical to have the Service Desk team lead as many of these customer interactions as they can, because business initiatives in IT often have a large end-user impact. For example, rolling out an HR self-service portal will have many end-user related considerations. Even for department-specific initiatives in sales, marketing, and so on, the end-users more than likely become the final recipients of these initiatives. By elevating the Service Desk to also handle business initiatives on behalf of IT, the Service Desk matures and the business gets a consistent touch point. It makes very little sense not to consider this service option.

Manager Tools recently released a podcast on Internal Support Roles and Responsibilities, and I think what they discussed is something the Service Desk can take the lead on. When the Service Desk can make people feel productive when utilizing corporate-provided information technologies, and the business teams feel the SD team represents the gateway to the best of what IT can do for its constituents, it is a win-win for everyone.


Fresh Links Sundae

Fresh Links Sundae encapsulates some pieces of information I have come across during the week. They may be ITSM related or not entirely. Often they are from the people whose work I admire, and I hope you will find something of value.




Mark Horstman and Mike Auzenne gave guidance on how to obtain the requirements you need when working as an internal support provider. Quite applicable to IT service management in my opinion. Internal Support Roles And Responsibilities – Part 1 (Manager Tools)

Liz Ryan takes an HR situation and discusses why intangibles like “culture” and “talent” are more important. I think we struggle at times with a similar situation in IT as well. Managing the intangibles (Liz Ryan)

Frances Frei and Anne Morriss talked about strategies to improve service, and I believe IT in many organizations can do the same. Win on Service in a Tough Economy (HBR Blog Network)

Marshall Goldsmith discussed why knowledge workers’ wealth of knowledge may be worth more to their companies than the paychecks are to them. Show Your Employees You Care (Marshall Goldsmith)

Seth Godin discussed how social media has made it easier for people to talk about what they are up to and to find out what others are talking about. Spout and Scout (Seth Godin’s Blog)

The good folks at ITSM Lens compiled a list of essential ITIL terms and put some easy to understand definitions to them. Getting Started with ITIL: 35 Terms and Definitions Everyone Must Know (ITSM Lens)

Bret Simmons discussed how a simple miscommunication can lead to a missed business opportunity, and the importance of paying attention to small details. The Cycle Of Service Starts At Your Website (Bret L. Simmons)

Tammy Erickson provided insights into the challenges of using collaborative or social software inside business organizations. Why We Use Social Media in Our Personal Lives — But Not for Work (HBR Blog Network)

Aprill Allen provided a list of good tips if you need to get organized and get going with your change management effort in IT. Change Management in 7 Easy Steps (Knowledge Bird)


A Potential Service Desk 1.5 Scenario

A couple of blog posts discussing the future of the Service Desk (or Service Desk 2.0) caught my eye last week. I mentioned them in my weekly Fresh Links Sundae post as well. I liked the various points of view presented, and they got me thinking. Well, I guess I probably will not know what a Service Desk (SD) 2.0 looks like until it is right in front of me. The notion still sounds a bit far down the road to me. Frankly, I am more interested in what we in IT can do to improve the Service Desk function today (5 Questions You Should Ask Your Service Desk Team). Many SD teams have endured an average reputation that, I think, was in part self-inflicted and in part just a matter of being a victim of circumstance. Since there is no way to turn back, why not keep moving forward and improving?

I think it would be very cool for a typical Service Desk to do a lot more than what it is doing today. I like to envision a state where the Service Desk is the focal point of IT services provisioning, communication, and support.  The Service Desk is the team that makes interacting with IT a solid experience all around. They are also the team that makes people feel productive when utilizing corporate-provided information technologies. Moreover, they can be the team that represents the gateway to the best of what IT can do for its constituents. Call it Service Desk 1.5 or whatever. I have presented a possible scenario using a pretty common Service Desk interaction these days.

Scenario: End User to IT – Computer and Mobile Device Provisioning for a New Hire

If your organization is already doing that, congratulations. Moving forward, I think many Service Desk teams can and should do more to plot what they can leverage to improve the customer’s experience. A few ideas on the table are:

  • The team should make building a solid working relationship with its customer base a key success factor. It is the same for any service organization. The team also knows that it cannot be great at everything, so it should try to understand what its customers value the most and the least. The team will use the feedback to prioritize what it does: over-deliver on the stuff its customers care about the most and maintain, or even phase out, the stuff its customers care about the least.
  • In addition to better understanding its constituents, the Service Desk needs to empower its customers to be more productive with the IT resources they have access to. One example is the self-service feature. The more end-users can do for themselves, the better off everyone will be. Self-service does not mean the Service Desk will become a faceless entity. Self-service takes care of the simpler, more routine stuff and leaves the Service Desk team to tackle the stuff that is better handled with more interpersonal interaction.
  • To support self-service and to promote productivity, the SD team should know what information the end users will find most helpful and how to make that information available with the least fuss. It is also important for every interaction the SD team has with its customers to be as transparent and predictable as possible. The customers should know what boundaries everyone is working with and what to expect, and should stay sufficiently informed every step of the way. The SD team will also make the support information, and themselves, accessible and easy to find. A well-designed knowledge capturing and dissemination mechanism can only help.

To get better at what they do, the Service Desk cannot do it alone. They will need some help from IT and the organization. For example:

  • A number of these monitoring and follow-up activities can be labor intensive but can also be automated to a large degree. In addition, a self-service feature is only meaningful if it is supported by well-designed automation that actually improves the overall experience.
  • Web 2.0 and social media technologies have greatly enabled what Seth Godin called the “Spout and Scout” interaction. The members of the Service Desk team have likely experienced this interaction themselves and use it on a daily basis. Perhaps the team can leverage the same interaction model and get closer to what their customers are doing? Unlike Facebook or LinkedIn, which had to get people to sign up and cough up their personal information, the Service Desk already has a pretty good idea of who its customers are, where they are located, what they do, and what IT assets and resources they have access to. The customer census and the social media technologies are certainly two things the Service Desk can leverage and do more with.

Are those realistic and actionable scenarios? I think so. That said, given how many Service Desks are staffed, organized, and equipped these days, achieving those end states can be a formidable undertaking. Service Desks have always existed with the noble intention of helping the IT customers, yet many SD teams I have worked with really did not get the support they could have used to be successful. One reason may be that many SD teams have been stuck in a reactive question-and-answer mode of operation. We look at the end-user Q&A and password reset activities as a necessary evil, stuff we have to do. As a result, many Service Desks also have a tough time justifying their importance and budget priority when competing with other IT functions.

Good service costs money and time. In most organizations, the Service Desk budget rarely looks like the uptrend curve of a growth stock. Diverting effort and funds to do what would help the customers the most will take some difficult choices. Streamlining the Service Desk’s internal working mechanism and simplifying how the Service Desk interacts with its customers can also help free up the needed resources. I think it is time for many SD teams to get a crystal clear picture of what the organization and its customer base really care about. Do less of being all things to all people and do more of delighting the customers along the way.


Major Incident Review Process Design – Part Two

This post is part two (and the concluding part) of a series where we discuss the Major Incident Review process and how to put one together. Previously we discussed the elements and considerations that should go into the process design and elaborated on those considerations with a sample process flow. In this post, we describe the process activities further, along with a reporting template you can use to implement the process.

Sample Incident Report Template

Sample Process Design Document

The process design document provides a detailed description of the fields within the report template, so I do not plan to repeat it here. I think there are two factors to keep in mind when undertaking such a process. First, don’t do the process just for the sake of doing it. Do it because your organization genuinely wants to improve service by eliminating as many of these incidents over the long term as you can. If the organization chooses not to implement a certain solution for some reason (cost, technical complexity, longevity of the technology, regulatory or compliance concerns, or whatever), at least document the discussion. That way, it shows that the organization understood the risks and chose to accept them.

Second, perform meaningful measurements and, again, use the statistics to improve service. For example, if the majority of the incidents are reported by the end users, perhaps that is a clue that we should be more proactive and beef up the automated monitoring? If a particular technology area has been experiencing more major incidents than the other areas, perhaps we should figure out what ills are plaguing the area and fix what is broken? If a particular business unit or segment has been experiencing more major incidents than the other segments, perhaps we owe it to the business communities to figure out what we can do to make things better? The business impact information we capture will enhance our understanding of the incidents and help us formulate solutions that make sense for the business.
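
As a rough illustration of those measurements, here is a minimal sketch that tallies major incident review records by how they were detected, by technology area, and by business unit. The sample records and the field names (`detected_by`, `technology_area`, `business_unit`) are made up for this example; in practice they would come from whatever your report template captures.

```python
from collections import Counter

# Hypothetical review records; real ones would come from your incident reports.
incidents = [
    {"detected_by": "end user", "technology_area": "network", "business_unit": "sales"},
    {"detected_by": "monitoring", "technology_area": "storage", "business_unit": "finance"},
    {"detected_by": "end user", "technology_area": "network", "business_unit": "sales"},
    {"detected_by": "end user", "technology_area": "database", "business_unit": "hr"},
]


def breakdown(records, field):
    """Count incidents by the given field, most frequent first."""
    return Counter(record[field] for record in records).most_common()


def share(records, field, value):
    """Fraction of incidents whose field equals value (0.0 when there are no records)."""
    if not records:
        return 0.0
    matches = sum(1 for record in records if record[field] == value)
    return matches / len(records)


# Print a simple summary for each dimension discussed above.
for field in ("detected_by", "technology_area", "business_unit"):
    print(field, breakdown(incidents, field))
```

With the sample data, three of the four incidents were reported by end users, which is exactly the kind of signal that suggests beefing up automated monitoring.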

Most organizations I know practice some type of incident review process, so I hope the information presented so far has been helpful. Please feel free to suggest other approaches that have worked for your organization.


Fresh Links Sundae

Fresh Links Sundae encapsulates some pieces of information I have come across during the week. They may be ITSM related or not entirely. Often they are from the people whose work I admire, and I hope you will find something of value.




What would the future of the Service Desk look like? James Finister describes his view. Service Desk 2.0 (CORE ITSM)

Another viewpoint on the future of the Service Desk from Rob England. User self-help – a skeptical view (The IT Skeptic)

Charles Betz discusses a number of dynamics that impact the future of IT management. Next generation IT management (Integrated IT Management)

Robert Stroud describes a case of brilliant customer experience he came across when he recently traveled to Europe. Customer Service making a difference and changing the way I travel to Europe (CA Community)

Joshua Simon discusses how ITIL best practices can contribute to improved information security. How ITIL Addresses Security (The ITSM Lens)

Marshall Goldsmith talks about what coaching advice he might give to two coaches, Joe Torre, the former coach of the New York Yankees, and Joe Girardi, the new coach of the Yankees. Torre and Girardi: Coaching the Joes (Marshall Goldsmith)

Mark Horstman and Mike Auzenne talk about why it is important to make sure your priorities are reflected in your calendar. Life is what happens… (Manager Tools)

Seth Godin asks “Who is your customer?” (I think it is also a concept that is vitally important in ITSM and drives how services can and should be provided.) Who is your customer? (Seth Godin’s Blog)

Liz Ryan discusses ways to make your LinkedIn profile work harder for you. 25 Ways to Make LinkedIn Work for You (Bloomberg BusinessWeek)

Umair Haque suggests that it is time to get “lethally serious” about doing things that actually matter. Create a Meaningful Life Through Meaningful Work (HBR Blog Network)
