I believe this underlines the fact that, unfortunately, the traditional high-availability/redundancy approach to infrastructure services is not sufficient to deliver the 24/7 availability demanded by the modern business world. Even so, I believe very little will change at the facility level. In the short term, data centres will continue to be architected around the facility itself, complying with the Uptime Institute's guidelines for generators, power feeds, cooling and so on. The key change we will start to see in the near future is in the infrastructure and applications we integrate into those data centre facilities.
The applications and infrastructure typically deployed into data centres need to become much more tolerant of failure, and much better at "self-healing". By this, I mean that an application will be able to detect when it is not working properly and automatically adjust itself to restore normal operation, all without help from a human engineer. Not only would this reduce downtime, it would also reduce the time needed to manage the system, since resources can be reallocated quickly and upgrades can be implemented as soon as they are required.
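To make that detect-and-restore loop concrete, here is a minimal sketch. Everything in it is illustrative: FlakyService is a hypothetical stand-in for a real workload, and a real platform would probe and recover through its own orchestration hooks rather than an in-process loop like this.

```python
"""Illustrative sketch of a self-healing loop: probe, detect, restore."""
import logging
import random
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("self-healer")


class FlakyService:
    """Simulated workload that occasionally stops responding."""

    def __init__(self) -> None:
        self.healthy = True

    def health_check(self) -> bool:
        # Randomly degrade, standing in for a real fault.
        if self.healthy and random.random() < 0.2:
            self.healthy = False
        return self.healthy

    def restart(self) -> None:
        # A real platform might instead reschedule the workload elsewhere.
        self.healthy = True


def supervise(service: FlakyService, interval: float = 1.0, cycles: int = 10) -> None:
    """Probe the service periodically; restore it when a check fails."""
    for _ in range(cycles):
        if service.health_check():
            log.info("healthy")
        else:
            log.warning("unhealthy - restarting without operator involvement")
            service.restart()
        time.sleep(interval)


if __name__ == "__main__":
    supervise(FlakyService())
```

The point of the pattern is that detection and recovery sit next to the workload itself, so normal operation is restored in seconds rather than waiting for an engineer to notice an alert.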
This matters because the hyperscale cloud vendors have built their platforms on the premise that facilities and infrastructure will fail, pushing resilience up into the application layer. In some cases, though, adding all that resilience to an application also adds too much complexity, which makes the application more difficult to manage when implementing changes such as patching or upgrading the software.
Of course, most existing applications were not designed to be self-healing, and it will probably be many years before they are all refactored or replaced. This means that organisations need to understand the risk, mitigate it, and then prioritise the refactoring of their most critical applications first.
Although the concept of self-healing systems is attractive to IT directors, there are several difficulties that users may encounter when trying to implement self-healing applications. For example, the application needs to be closely linked to the systems-management software; but if it is linked too closely, the application ends up supporting proprietary management interfaces rather than open standards. That adds complexity to the system and incurs additional expense, neither of which is desirable for the organisation using the application.
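As an illustration of the open-standards alternative, the sketch below exposes an application's health over plain HTTP, so any management tool that speaks HTTP can poll it without a proprietary integration. The /health path and the JSON shape are conventions I have chosen for the example, not a formal standard.

```python
"""Sketch: publish application health via a standard HTTP endpoint."""
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/health":
            # Report status in a simple, tool-agnostic JSON document.
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()


if __name__ == "__main__":
    # Any monitoring product can now check the application over HTTP,
    # with no coupling to a specific vendor's management software.
    HTTPServer(("", 8080), HealthHandler).serve_forever()
```

The design choice here is loose coupling: the application advertises its state through a widely understood protocol, and the management layer decides what to do about it, so swapping out the management product does not require rewriting the application.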
There are managed service providers that have mastered handling these types of applications, though, and over the last five years I have seen an increasing market shift towards using them. With data centres unlikely to adopt self-healing applications in the near future, managed service providers will become more crucial to organisations that want to refactor their critical applications over the next few years. This will continue into 2017 and beyond, in parallel with the application shift to self-healing.
In summary, until all applications are developed to be self-healing, our current data centres will need to continue to provide an environment designed around redundancy and resiliency, because the levels of duplication and complexity involved in self-healing are expensive to procure, maintain and operate. The only solution in the short term is to enlist a managed service provider to help devise a solution that minimises the possibility of downtime and outages. Self-healing applications are most certainly the future of data centres, and the focus will shift to building application-aware data centres, rather than those focussed on infrastructure and facilities.