Complex Adaptive Systems (ii): thinking about emergence and ITSM

(This is part two of a series exploring Complexity Science and the concept of complex adaptive systems in the context of ITSM. The first article, a primer for ITSM on these concepts, is here)

In my previous article, I explored the core characteristics of complex adaptive systems. Now, I am going to focus on one particular trait of such systems, emergence, particularly in the context of increasing our understanding of its impact on IT Service Management.

The new ITIL 4 material, notably the ITIL 4 book High Velocity IT (of which I was a co-author), introduces complex adaptive systems to the ITIL narrative as part of its broader exploration of DevOps, Agile, and other transformative ways of thinking and working. The book touches specifically on emergence:

Complex adaptive systems exhibit behaviour that cannot be predicted, but can often be explained retrospectively.

However, I feel this topic warrants deeper exploration than that brief mention allows. Complexity science is a relatively new and still-developing field, with a growing body of analysis and understanding, and I hope this article demonstrates the value of extending our exploration of emergence.

Emergence is a phenomenon that arises because a complex adaptive system is composed of many independent, interacting agents.

Over time, the development and interaction of those agents tends to create novel characteristics and behaviours. Emergence means that these novelties are not derived solely from the properties of any single part of the system. Instead, they develop unpredictably as a result of the wider interactions of some of those parts.

In a 2002 Harvard Business Review article entitled “Predicting the Unpredictable”, Eric Bonabeau discussed a common real-world example of an emergent phenomenon, the traffic jam:

Although they are everyday occurrences, traffic jams are actually very complicated and mysterious. On an individual level, each driver is trying to get somewhere and is following (or breaking) certain rules, some legal (the speed limit) and others societal or personal (slow down to let another driver change into your lane). But a traffic jam is a separate and distinct entity that emerges from those individual behaviors.
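To make this concrete, here is a minimal sketch in Python. It is not Bonabeau's model. It assumes the simplest traffic rule available, the elementary cellular automaton usually called Rule 184, in which each car on a circular one-lane road moves forward only if the cell ahead is empty. The function names and the two starting layouts are illustrative choices of mine. No driver is given any notion of a "jam", yet on a crowded road a cluster of blocked cars persists and drifts backwards against the flow.

```python
def step(road):
    """Advance a circular one-lane road by one tick.

    Each cell is 1 (car) or 0 (empty). Every car applies the same local
    rule: move forward one cell if, and only if, that cell is empty.
    """
    n = len(road)
    new_road = [0] * n
    for i, cell in enumerate(road):
        if cell == 1:
            ahead = (i + 1) % n
            if road[ahead] == 0:
                new_road[ahead] = 1  # the way is clear, so move
            else:
                new_road[i] = 1      # blocked, so stay put
    return new_road


def blocked_cars(road):
    """Count cars that cannot move because the next cell is occupied."""
    n = len(road)
    return sum(1 for i in range(n) if road[i] == 1 and road[(i + 1) % n] == 1)


def simulate(road, ticks):
    """Return the list of road states from tick 0 to tick `ticks`."""
    history = [list(road)]
    for _ in range(ticks):
        road = step(road)
        history.append(road)
    return history


if __name__ == "__main__":
    light = [1, 1, 1, 0, 0, 0, 0, 0]  # 3 cars on 8 cells (hypothetical)
    heavy = [1, 1, 1, 1, 1, 0, 0, 0]  # 5 cars on 8 cells (hypothetical)
    for name, road in (("light", light), ("heavy", heavy)):
        print(name)
        for state in simulate(road, 8):
            print("".join("#" if c else "." for c in state),
                  "blocked:", blocked_cars(state))
```

In the light layout the initial clump dissolves and every car flows freely. In the heavy layout some cars are always blocked, even though the individual rule is identical in both cases. The jam is a property of the interactions, not of any one driver.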

The traffic jam, however, is a negative emergent trait. It is important to note that emergent behaviours may also be positive. One of the most significant examples of this is biological evolution. Living organisms are complex adaptive systems which undergo emergent modifications, some of which persist through generations to increase the overall strength of the species.

A large, modern digital system is in reality a complex and evolving socio-technical structure, underpinned by an intricate and dynamic set of diverse technical subsystems. These systems are often subject to small and very frequent change, particularly when stakeholder organisations are embracing DevOps. However, when such systems exhibit the characteristics of a complex adaptive system, we must also expect them to exhibit emergent characteristics and behaviours beyond the intention of any human designers or controllers.

For instance, to build on the example outlined in the first article: a passenger’s airline seat selection, while visually straightforward to the user, invokes a set of interactions across a broad range of financial, customer, operational and other systems.

It is possible to find numerous emergent behaviours which have developed within such an extended system, independent of any specific decision to implement or instantiate those behaviours. For example, consider the following developments:

  • Some years ago, websites such as Seatguru began to compile and share detailed data about the seat layout of different aeroplane types flown by each operator, augmented by passenger reviews of each seat.
  • Passengers, now provided for the first time with consistent visibility of good and bad seats, increasingly sought the best seats. This additional demand drove some services such as TripIt to monitor airline seat allocations using new data interfaces made available by airline IT systems. This enabled them to alert premium users when their preferred seats became available.
  • Meanwhile, the increased public awareness of certain better seats enabled airlines to monetise the resulting increase in demand for those seats, boosting revenue or enhancing the benefits of premium status.

These shifts in behaviour and product are emergent due to the way they came about. Rather than having been conceived consciously, they are a result of a combination of changes to the features of the system and the behaviours of actors within it.

The ITSM perspective is interesting. In the example of the airline seat-change experience, a service level manager might reasonably set an expectation that customers should be able to submit their seat change and see a response within a few seconds. Architects, developers and operators might collaborate to build infrastructure and software which delivers this specification, in line with the user behaviours they understand at the time of inception. Service Level Agreements might be defined and monitored to underpin and enforce these standards.

However, such a system may fail to account for the emergent behaviours of its users if, for example, new utilities and data lead to a substantial increase in the proportion of users requesting seats in the first few minutes of check-in.
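A rough sketch in Python illustrates the effect. Every number here is hypothetical: a service sized for 5 seat-change requests per second, 3,600 requests in the first hour of check-in, and a target of 3 seconds standing in for "a few seconds". The only thing that changes between the two runs is when the requests arrive, not how many there are.

```python
def worst_wait_seconds(arrivals_per_second, capacity_per_second):
    """Estimate the longest wait seen by any request in a FIFO backlog.

    arrivals_per_second: requests arriving in each successive second.
    capacity_per_second: requests the service can complete per second.
    This is a deliberately crude model: a request joining the back of the
    queue waits for everything ahead of it to be processed.
    """
    backlog = 0.0
    worst = 0.0
    for arriving in arrivals_per_second:
        backlog += arriving
        worst = max(worst, backlog / capacity_per_second)
        backlog = max(0.0, backlog - capacity_per_second)
    return worst


WINDOW_SECONDS = 3600   # first hour of check-in (assumed)
TOTAL_REQUESTS = 3600   # same demand in both scenarios (assumed)
CAPACITY = 5            # requests per second the design supports (assumed)
TARGET_SECONDS = 3.0    # stand-in for "within a few seconds" (assumed)

# Behaviour understood at design time: requests spread evenly over the hour.
even = [TOTAL_REQUESTS / WINDOW_SECONDS] * WINDOW_SECONDS

# Emergent behaviour: seat alerts push half of all requests into the
# first 100 seconds, with the rest spread over the remaining time.
surge_seconds = 100
surge = ([TOTAL_REQUESTS / 2 / surge_seconds] * surge_seconds
         + [TOTAL_REQUESTS / 2 / (WINDOW_SECONDS - surge_seconds)]
         * (WINDOW_SECONDS - surge_seconds))

if __name__ == "__main__":
    for name, profile in (("even", even), ("surge", surge)):
        wait = worst_wait_seconds(profile, CAPACITY)
        status = "within target" if wait <= TARGET_SECONDS else "target breached"
        print(f"{name}: worst wait {wait:.1f}s ({status})")
```

Under these assumptions the evenly spread hour stays comfortably within the target, while the front-loaded hour breaches it by a wide margin, even though total demand and capacity are unchanged.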

This is important to understand. As managers of complex IT services, however we observe and represent a complex system today, we must be aware that the same system will not be identical in three months, six months, or a year. We cannot prevent this, because emergence is a defining characteristic of a complex system.

However, it is still possible to exert some influence over emergent behaviours. In Dave Snowden’s Cynefin framework, for example, analysis of complex systems includes consideration and manipulation of the constraints acting both at the boundary of the system, and in the interactions between its components. Dave has explored and developed this concept extensively through a series of articles on his blog, and they are well worth reading.

Constraints are an interesting basis for analysing the challenges facing an ITSM organisation tasked with management, support and operation of a complex system. Constraints may be useful in limiting negative emergent behaviours. However, they may also suppress the positive evolution of the system. This is a critical point for IT Service Management, a discipline already facing reputation challenges for its perceived tendency to create obstacles to fast, positive change. As Snowden himself stated in a recent blog post on a different (but related) topic:

If you want to change things then you need to let a thousand flowers bloom; Some will thrive, some will not; you can’t determine in advance what will work… in general, you want a wildflower meadow, not a formal garden.

So, our challenge in ITSM is this:

  • If we do not establish sufficient constraints in the system, negative emergent behaviours may impact the successful delivery of a service.
  • If we over-constrain the system, we may prevent beneficial traits from emerging.

In her 2011 paper Concepts in Complexity Engineering, published by Imperial College, Regina Frei described the challenge of balancing freedom and control in complex systems:

One of the challenges in engineering is the trade-off between the system specification by the designer and the creative freedom of the system. More freedom means less control over the system’s behaviour. Engineers need to find ways to delimit the system’s behaviour while still allowing it sufficient creative freedom to localise solutions in an adaptive way.

Complexity Engineering, the topic of Frei’s paper, is another relatively new area of study which is developing closely and in parallel with Complexity Science. Complexity Engineering specifically explores the intentional development of evolutionary complex systems, which inherently benefit on an ongoing basis from positive emergence.

Frei’s work gives us an interesting visual representation of this trade-off, which may help us to constrain complex systems in an appropriately balanced way. She proposes modelling different areas of system behaviour as either Desired, Allowed or Possible.

[Figure: three concentric boxes describing states of a complex adaptive system: “Desired” (inner), “Allowed” (middle), “Possible” (outer)]

As Frei notes:

  • When the system behaviours are in the desired box, no corrective actions are necessary.
  • If behaviours move into the allowed state, they have diverged from what is desired, but no drastic measures need to be taken as long as the system remains there or drifts back to desired.
  • If the system moves into the realm of what is possible but not allowed, immediate resolution action is required.
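As a minimal sketch of how these three regions might be applied to a single measurable behaviour, the Python below classifies a seat-change response time. This is my own illustration rather than anything from Frei's paper: the choice of response time as the measure, the thresholds of 3 and 10 seconds, and all of the names are assumptions.

```python
from enum import Enum


class Region(Enum):
    DESIRED = "desired"
    ALLOWED = "allowed"
    POSSIBLE = "possible but not allowed"


# Responses paraphrased from the three points above.
RESPONSES = {
    Region.DESIRED: "no corrective action necessary",
    Region.ALLOWED: "no drastic measures while it stays here or drifts back",
    Region.POSSIBLE: "immediate resolution action required",
}


def classify(response_seconds, desired_max=3.0, allowed_max=10.0):
    """Place an observed response time into one of the three regions.

    desired_max and allowed_max are hypothetical thresholds; in practice
    they would have to be agreed for each behaviour being observed.
    """
    if response_seconds < 0:
        raise ValueError("response time cannot be negative")
    if response_seconds <= desired_max:
        return Region.DESIRED
    if response_seconds <= allowed_max:
        return Region.ALLOWED
    return Region.POSSIBLE


if __name__ == "__main__":
    for observed in (1.2, 6.5, 42.0):  # illustrative observations
        region = classify(observed)
        print(f"{observed:>5.1f}s -> {region.value}: {RESPONSES[region]}")
```

The difference from a conventional pass/fail service level target is the middle region: divergence is recognised and watched, but it does not automatically trigger intervention.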

Adopting this model requires us to find coherent ways to describe and recognise these behavioural states, and to apply them both at a broad system level, and also to individual components of those systems. There is clearly more work to be done here. Nevertheless, perhaps this structure gives us a way to adapt our established practices (such as Service Level Management) for the increasingly complex systems which we are managing?

The intersection of digital transformation, DevOps, and ITSM. Articles by a senior Product Manager in the enterprise service management space. Personal views.