Predictive analytics is set to turn the world of IT service management, and in particular Incident Management, on its head. After all, it has already done this for IT Capacity Planning, where it is now possible to predict and avoid future incidents at a workload level.
Within IT capacity planning, forecasting (predicting, if you like) has always been a key feature of the discipline. It was used to ensure that large chunks of demand, arising from either growth or change, could be met, with the focus on the strategic horizon rather than day-to-day operation. If there are capacity issues, the Service Operation process of Incident Management informs the Service Design process of Capacity Management so that they can be dealt with as part of future Service Design activity.
Incident Management should inform IT Capacity Planning about incidents logged due to capacity or performance issues, and this intelligence should then be used to assist in the diagnosis and resolution of those incidents. The reverse idea, that Capacity Management informs Incident Management of future and avoidable incidents, or indeed of how to deal with them, is a relatively new concept.
Playing the tactical game
Technological advances have opened many new areas of innovation and opportunity in this space. Virtualization, automation, big data and predictive analytics have empowered IT capacity planning to extend into day-to-day management at a more granular and forensic level, rather than focusing solely on strategic activity. The following are the four major drivers that have spurred on this evolution:
Virtualization – or more importantly – the hypervisor
Whilst allowing multiple virtual workloads to operate on a single physical machine should make life more difficult, it actually simplifies things by reducing the number of information sources that need to be interrogated.
Next is automation. When dealing with different system management tools, vendors and formats, consider the number of data points generated. Take a 10,000-server estate over a single 24-hour period, capturing data at 5-minute intervals: even at one data point per server per interval, this would generate almost 3 million data points (10,000 servers × 288 samples each). For the information to be used for predictive analysis, we would recommend at least 30 days’ worth of monitoring data in order to gain worthwhile insight. Without automation, it would take an army to schedule the retrieval, aggregation, cleansing, loading and transformation of the data from a number of bespoke sources in a meaningful timeframe.
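To make the scale concrete, the sizing can be worked through in a few lines of Python. This is only a back-of-the-envelope sketch: the function name `data_points` and the `metrics_per_server` parameter are illustrative, and it assumes a single data point per server per sampling interval, which is what the 3 million figure implies.

```python
# Back-of-the-envelope sizing for the example estate.
# Assumption: one data point per server per sampling interval unless
# metrics_per_server says otherwise. All names here are illustrative.

MINUTES_PER_DAY = 24 * 60


def data_points(servers: int, interval_minutes: int,
                days: int = 1, metrics_per_server: int = 1) -> int:
    """Return the number of data points collected across the estate."""
    samples_per_day = MINUTES_PER_DAY // interval_minutes  # 288 at 5 minutes
    return servers * samples_per_day * days * metrics_per_server


per_day = data_points(servers=10_000, interval_minutes=5)
per_30_days = data_points(servers=10_000, interval_minutes=5, days=30)

print(f"{per_day:,}")      # 2,880,000
print(f"{per_30_days:,}")  # 86,400,000
```

At the recommended 30 days of history, the same estate yields over 86 million data points for one metric alone, and real monitoring typically tracks several metrics per server, so the total grows in proportion.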
Big Data delivers the ability to store massive amounts of data in a way that makes sense and allows for further manipulation. Associated hardware advances, namely cheaper storage, greater scalability and more powerful compute, have made Big Data a reality.
And finally, analytics provides the ability to churn data in a multitude of ways, using pattern matching and algorithms to analyse an organisation’s IT operation and provide insight that would otherwise go unnoticed, whether that is an over-utilisation of resources or an impending shortfall. The analytics available today are essential if IT managers want to keep on top of the complexity and scale of their IT estate. IT managers need to be confident in their knowledge of their infrastructure, and of the changing demands placed on it, in order to see what’s around the corner and avoid potential incidents.
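As a rough illustration of what spotting an impending shortfall can look like, the sketch below fits a straight-line trend to a workload's daily peak utilisation and estimates how many days remain before it crosses a threshold. It is deliberately simplistic and is not a description of any particular product: real analytics would also account for seasonality, noise and step changes. The function names, the 80% threshold and the sample data are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of values against their index (0, 1, 2, ...).

    Returns (slope, intercept). Hypothetical helper for this sketch.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_threshold(daily_peaks: Sequence[float],
                         threshold: float) -> Optional[float]:
    """Estimate days from the last sample until the trend hits threshold.

    Returns None when utilisation is flat or falling, and 0.0 when the
    fitted trend has already reached the threshold.
    """
    slope, intercept = fit_linear_trend(daily_peaks)
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    last_day = len(daily_peaks) - 1
    return max(0.0, crossing_day - last_day)


# Synthetic example: 30 days of peak CPU utilisation (%), rising by
# half a percentage point per day from 50%.
history = [50 + 0.5 * day for day in range(30)]
remaining = days_until_threshold(history, threshold=80.0)
print(f"{remaining:.1f} days until the 80% threshold")
```

Run per workload across the estate, an estimate like this is the kind of signal Capacity Management could pass to Incident Management before an incident is ever logged.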
For IT capacity planning, the unit of currency has reduced from the physical machine to the individual workload. The timeframe has shortened too, so that capacity planning can provide short-term tactical information while improving our ability to understand and model long-term strategic actions. Changing the relationship between Incident Management and IT Capacity Planning allows you to identify shortfalls in advance, sidestep the avoidable and turn your Incident Management process on its head.