Written by David Lareau, IT Service Management expert.

Most business units today rely on IT departments to operate normally, and the stability of IT is essential to keep the business running smoothly. The IT department must provide reliable, consistent, robust and adaptable IT services.

I have summarized here a short list of action points that need to be monitored to ensure continuity

Action points

Measure and share key performance indicators that show business continuity :

How to calculate the service availability key performance indicator (KPI)

We sometimes hear data centers claiming they offer 99,99% availability on their cloud hosting service. I will explain what exactly this means and how KPI is calculated for an IT service.

Importance of this KPI

The Service Availability Key Performance Indicator shows the basis of your client offering. You might be surprised when calculating your indicator for the first time and you may not want to share it externally too quickly. The only way to improve your availability rate is to start measuring and sharing it with the teams involved. Over time, you will notice improvements in the accuracy of the tickets recorded. The implicated persons will come up with suggestions to avoid further service outages.

Precautions

It isn’t easy to predict or determine which incidents affect the service availability. This is key, because an issue in an underlying item may not necessarily affect the final service. For example, when an incident happens on a server disk drive that has redundancy, it usually has no effect on the final service. This is easy to determine. A not-so-easy to determine example is when you try to measure the service availability of a service that cannot be monitored continuously from start to finish and involves multiple providers (ex: an e-commerce platform availability – the product search display, cart adding, payment processing and order fulfillment). For such a complex workflow, the service availability is calculated using the weakest link method. Each component of the workflow that has an effect on the whole service will be considered in the calculation and the lowest service availability rate will count as the total rate for the service.

Data needed to perform the calculation of the service availability are:

Total time over a month (in minutes)

Explanation:

We need to calculate the total time of minutes per month (Ex: 31 days x 24 hours x 60 min = 44 640 min per month)

Planned maintenance time (in minutes) (optional)

Most of the Service Level Agreements include either a reserved time slot or a minimal notification delay for planned maintenance with a maximum time duration. If your service agreements include such a condition, then you need to exclude the planned maintenance time from the service availability calculation, as I will explain below.

How to find this information

You can find the actual planned maintenance time in your monitoring systems if it keeps monitoring during the maintenance time slots. Although, I would recommend placing your monitoring system in a planned maintenance mode, to prevent it from calculating maintenance as downtime. Then, you may measure the time spent on maintenance within your change management system instead. It is a good practice to have the change plan on hand throughout the whole process and to update the planned timing with actual minutes.

Unplanned unavailability time (in minutes)

How to find this information

Produce a report of major incidents that affected the service (from ticketing system and monitoring systems)

Critical information to measure

Unavailable time, which is the incident duration (Incident resolution time minus incident start time) 

 

Many service desk suites (such as BMC Helix ITSM) automatically calculate the service availability rate by linking incidents to the selected service on the CMDB. You can achieve the same kind of automation with a simple system: think of a way to identify every situation where an incident affects a service availability (a simple checkbox within the ticket). This will allow you to create a monthly report of unavailability time.

 

Calculation details :

Ex:

 

Service Availability:

Note : as explained above, the planned maintenance time may be excluded from the calculation if your agreements with the client are built this way.
 

Now that we’ve established the key points that make a strong foundation, I will be preparing my next article about the profitability of an IT department.

 

By David Lareau

Leave a Reply

Your email address will not be published. Required fields are marked *