In an IT environment, navigating through incidents and solving problems promptly is vital. Let’s break down the concepts of problem management in a way that everyone can understand, whether you are a techie or not.

Policies: Structuring Problem Management

Problem and Incident Tracking

In the IT world, “problems” and “incidents” have specific meanings. An incident is an unplanned event that disrupts a service or reduces its quality, while a problem is the underlying cause of one or more incidents. Problems should be tracked separately from incidents, so that restoring service and investigating causes can each be managed on their own terms.
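
One way to keep the two record types separate while still linking them is sketched below in Python. The class names, field names, and ID formats (`Incident`, `Problem`, `problem_id`, `PRB-1`, and so on) are illustrative assumptions, not something ITIL or any particular tool prescribes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Problem:
    """An underlying cause, kept in its own record (hypothetical fields)."""
    problem_id: str
    summary: str
    root_cause: Optional[str] = None  # unknown until diagnosis completes
    incident_ids: List[str] = field(default_factory=list)


@dataclass
class Incident:
    """A single service disruption (hypothetical fields)."""
    incident_id: str
    summary: str
    problem_id: Optional[str] = None  # set once linked to a problem


def link(incident: Incident, problem: Problem) -> None:
    """Associate an incident with the problem believed to cause it."""
    incident.problem_id = problem.problem_id
    if incident.incident_id not in problem.incident_ids:
        problem.incident_ids.append(incident.incident_id)


# Two separate incidents traced back to one problem (made-up data).
problem = Problem("PRB-1", "Mail server runs out of disk space")
for n in (1, 2):
    link(Incident(f"INC-{n}", "Users cannot send email"), problem)
```

Each incident can be closed as soon as service is restored, while the problem record stays open until its root cause is found.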

Single Management System

Think of a single diary in which you record every significant event and issue; a single management system plays that role in problem management. Because all problem records live in one place, it becomes a reliable reference for problem-related data and makes reporting and investigations smoother.

Standard Classification Schema

Just like a library classifies books into various genres, problems should also be classified into distinct categories based on a standardized schema, facilitating quicker access and aiding in diagnostic activities.

Principles and Basic Concepts

Reactive and Proactive Problem Management

We deal with problems in two ways: by reacting to them once they occur (reactive) or by taking steps to prevent them even before they occur (proactive). Both approaches aim to identify the root cause of issues to avoid future recurrences. It is somewhat like treating a sickness versus taking vitamins to prevent getting sick.

Problem Models

Developing problem models is like creating a guide or manual that helps in handling recurring issues more efficiently. It allows for quicker diagnosis and aids in identifying affordable solutions as opposed to opting for expensive fixes.

Incidents versus Problems

While incident management focuses on restoring normal operations as quickly as possible, problem management digs deeper to find out why incidents happen and to prevent them from recurring. It’s about finding the real enemy hiding in the shadows and causing all the chaos.

Problem Analysis Techniques

Analysis techniques in problem management are like the different strategies detectives use in solving mysteries. Let’s go through them:

  • Chronological Analysis: Mapping out the events in the order they happened, much like a timeline, to understand the sequence and find out what triggered what.
  • Pain Value Analysis: Understanding the overall impact of an incident or problem, considering various factors like the number of people affected, duration, and cost.
  • Kepner and Tregoe Method: A structured approach to finding the underlying problem through a series of stages that include defining the problem, identifying possible causes, and verifying the true cause.
  • Brainstorming: Gathering the relevant people to suggest potential causes, and possible ways to resolve them, through a free flow of ideas.
  • 5-Whys: A simple technique of repeatedly asking “why” until the root cause is identified.
  • Fault Isolation: Re-executing the events that led to the problem step by step, one component at a time, until the component at fault is found.
  • Affinity Mapping: A method of organizing large amounts of data into groups based on common traits to identify potential root causes.
  • Hypothesis Testing: Coming up with educated guesses for potential root causes and testing each hypothesis to find out if it is true.
  • Technical Observation Post: Real-time monitoring of events by a specialist team to catch the exact cause when problems occur intermittently.
  • Ishikawa Diagrams: A visual tool that helps in mapping out the potential causes of a problem, represented through a diagram resembling a fishbone.
  • Pareto Analysis: A technique for separating the few causes responsible for most failures from the many more trivial ones.
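
Of the techniques above, Pareto analysis is the easiest to show in code. The following is a minimal Python sketch, assuming each incident has already been tagged with a cause; the function name `pareto`, the 80% threshold, and the sample data are all illustrative assumptions.

```python
from collections import Counter


def pareto(causes, threshold=0.8):
    """Return the most frequent causes that together reach `threshold`
    of all recorded incidents, as (cause, count, cumulative_share) tuples."""
    counts = Counter(causes)
    total = sum(counts.values())
    vital, running = [], 0
    # most_common() yields causes from most to least frequent.
    for cause, count in counts.most_common():
        running += count
        vital.append((cause, count, running / total))
        if running / total >= threshold:
            break
    return vital


# Made-up incident causes, for illustration only.
causes = (
    ["disk full"] * 10
    + ["expired certificate"] * 6
    + ["bad config"] * 3
    + ["power loss"] * 1
)
vital_few = pareto(causes)
```

With this sample data, the first two causes account for 16 of the 20 incidents, so they are the ones worth investigating first.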

Navigating Errors in Development Environments

New applications and systems are rarely perfect at release. Deficiencies that are known but not fixed before release should be logged as known errors, together with detailed workarounds. Support staff then do not have to diagnose the same issues again, which saves time and reduces support costs.
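
As a rough sketch of what such a log might look like, the Python below stores each known deficiency with its workaround and lets support staff look one up by symptom. The class names, fields, and exact-match lookup are assumptions made for illustration; a real known error database would need a far more forgiving search.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class KnownError:
    """A deficiency accepted at release time (hypothetical fields)."""
    error_id: str
    symptom: str
    workaround: str


class KnownErrorLog:
    """In-memory stand-in for a known error database."""

    def __init__(self) -> None:
        self._by_symptom: Dict[str, KnownError] = {}

    def record(self, error: KnownError) -> None:
        # Symptoms are stored in lower case so lookups ignore case.
        self._by_symptom[error.symptom.lower()] = error

    def find_workaround(self, symptom: str) -> Optional[str]:
        """Return the documented workaround, or None if the symptom is new."""
        error = self._by_symptom.get(symptom.lower())
        return error.workaround if error else None


# Made-up example entry.
log = KnownErrorLog()
log.record(
    KnownError("KE-1", "Report export times out", "Export in batches of 500 rows")
)
```

A lookup that returns `None` signals that the symptom has not been seen before and needs fresh diagnosis.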

Conclusion

Understanding the policies, principles, and analysis techniques described here can help streamline problem management in IT environments and support a proactive approach to identifying and solving issues. The goal is an organization that can anticipate trouble, has strategies ready to deal with it, and keeps its IT services running more smoothly as a result.


References: ITIL Service Operation, 2011 edition, ISBN 9780113313075