This post is the third and the final installment of the series for designing a problem management (PM) process for your organization. Previously we discussed the elements and considerations that should go into designing a problem management process. We elaborated those considerations further with a sample list of process requirements, a sample process flow, and a sample RCA form. We will assemble all the information together into one process design document that can be used to implement the process.
In addition to the process requirements and the process flow, I believe a process design document should call out the following information pertinent to the implementation of the process. For example…
- The Policy section outlines what policy statements (IT or corporate) governs the process and what expectations the organization wants to set for the process.
- The Scope section specifies which incidents or events will generate a problem record. Your organization may have a pre-defined set of criteria on how problems are triggered, and those criteria can go into this section. Some organizations may also choose to group a series of closely related incidents and trigger a problem record for those incidents.
- The Roles and Responsibilities section outlines the roles that will be involved in the process and their corresponding activities and responsibilities.
- The Artifacts and Communication section describes what documentation methods will be utilized by the PM process. It provides the procedural information necessary to carry out the PM process. The communication protocols section describes the recommended communication methods and their frequencies.
- The SLA and Metrics section describes the metrics that will be used to measure the process performance. The tutorial document has outlined some examples. Develop and measure the metrics that you can capture reliably and that your organization also cares about.
To reiterate, the primary goal of the Problem Management process is to identify the problems in the IT environment, so we can eliminate them by performing root cause analysis on the problems. As a capable IT organization, we should be able to correctly diagnose the root causes of just about everything that goes wrong within our IT environment and to implement solutions so similar problems or incidents will not reoccur. With proper documentation, the Problem Management database is a great learning tool. Also, another benefit of having a well-run problem management process is having the ability to review organizational decisions made about addressing a particular problem. Known errors do not need to be purely technical. They could also be the documented decisions about how we plan to address certain problems. The root causes, solutions (proposed or implemented) and the workarounds documented as part of the Problem Management process will benefit the Incident Management process immensely when similar incidents surface due to the recurrence of a problem.
I hope the information presented so far has been helpful. Please feel free to suggest options or other approaches that have worked for your organization.