Enterprises are grappling with ever-growing data volumes, fuelling demand for advanced analytics and machine learning tools to help them make sense of it all.
This, in turn, has placed new requirements on IT infrastructure to cope with the computational demands of such techniques.
It is now a decade or so since the notion of “big data” became a hot topic among CIOs and business decision-makers in enterprise IT, but many companies are still struggling to implement a strategy that makes full use of the data they hold and lets them become more insight-driven.
The data springs from numerous sources, such as machine-generated or sensor data from the internet of things (IoT) and other embedded systems, transactional data from enterprise systems, or data from social media and websites.
As a result, enterprise workloads are evolving beyond traditional ones that revolve around structured datasets and transaction processing, and are starting to incorporate analytics and other techniques, such as artificial intelligence (AI).
According to IT market watcher IDC, AI will be a core component of enterprise workloads by 2024. It believes that for three-quarters of enterprises, 20% of their workloads will be AI-based or AI-enabled, and 15% of their IT infrastructure will be accelerated by AI.
However, organisations are finding that integrating advanced analytics and AI techniques – including machine learning – into workloads can put a strain on their IT infrastructure.
Drawing parallel lines
Traditional central processing unit (CPU) architectures, in particular, have proved less than optimal for some of these techniques, which often call for a high degree of parallelism.
It quickly became clear that this was the sort of problem that graphics processing units (GPUs) could handle. Designed to offload the burden of graphics processing from the CPU in games, GPUs have a lot of relatively simple processor cores, and can handle a large number of computations in parallel.
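The appeal of GPUs for these workloads comes down to data parallelism: the same simple operation applied independently across very large arrays. As a purely illustrative sketch (plain Python standing in for a real GPU kernel), the element-wise vector addition that is the canonical GPU workload looks like this:

```python
# Illustrative only: each output element depends solely on the matching
# input elements, so a GPU can assign one of its thousands of simple
# cores to each index and compute them all simultaneously. Here the
# per-index "kernel" is emulated serially in plain Python.
def vector_add_kernel(index, a, b, out):
    # On a GPU, every core would run this body for a different index at once.
    out[index] = a[index] + b[index]

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)

for i in range(len(a)):  # a GPU would launch these iterations in parallel
    vector_add_kernel(i, a, b, out)

print(out)  # [11.0, 22.0, 33.0, 44.0]
```

Because no iteration depends on any other, the work scales across however many cores are available, which is why GPUs excel at the matrix arithmetic underpinning deep learning.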
Other hardware accelerators have also been added to the mix, such as field programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs), all of which have their varying strengths when it comes to accelerating workloads.
FPGAs, for example, offer good performance for neural networks and can be reprogrammed, but they are much trickier to program than GPUs.
These accelerators are typically plugged into a standard server and work in conjunction with the existing processor cores. This is known as heterogeneous computing, and to work best, it requires careful integration of the different types of compute engines into one system to deliver optimal performance per watt.
A good example of such a heterogeneous system is Nvidia’s DGX line. These products combine Intel Xeon processors with a number of Nvidia’s Tesla V100 GPUs into a system aimed squarely at deep learning and other demanding AI and high-performance computing (HPC) workloads.
The first such system, the DGX-1, used Nvidia’s own NVLink interconnect to join eight Tesla GPUs, but its successor required Nvidia to develop an NVLink switch in order to provide enough bandwidth for inter-GPU traffic between 16 GPUs without taking a hit on performance.
Although the DGX line is intended as a dedicated system for deep learning and other demanding workloads, it does highlight some of the issues with deploying accelerators. The DGX-2 with its 16 Tesla GPUs cost $399,000 at launch, making it the kind of hardware purchase you only make when you absolutely need that level of performance.
Enter composable infrastructure
One answer to this conundrum may come in the shape of composable infrastructure, a system architecture that disaggregates some of the hardware of a traditional server into pools of resources. The thinking behind this concept is that the appropriate resources can be pulled together under software control to deliver a system that precisely matches the requirements of the workload it is intended to run.
With existing systems, the quantity of processors and other resources, such as memory and storage, is largely fixed and not easy to change as and when required. This typically leads to suboptimal utilisation, with systems carrying more resources than their workloads require.
The ability to compose a system on demand from a shared pool of resources should mean that expensive hardware such as GPUs or FPGAs can be shared across a bank of systems and allocated as and when needed, instead of being permanently installed in every system that may need to run a workload that needs an accelerator.
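There is no standard API for this today, as each supplier exposes its own, but conceptually the idea can be sketched as checking accelerators out of a shared pool and returning them when a job finishes. In the following hypothetical sketch, every class and method name is invented purely for illustration:

```python
# Hypothetical sketch only: the ResourcePool class and its methods are
# invented to illustrate the concept of composing a system from pooled
# resources. Real products (Liqid, HPE Synergy, etc.) each have their
# own, supplier-specific APIs.
class ResourcePool:
    def __init__(self, gpus, fpgas):
        self.free_gpus = list(gpus)
        self.free_fpgas = list(fpgas)

    def compose(self, gpus=0, fpgas=0):
        """Pull accelerators out of the shared pool for one workload."""
        if gpus > len(self.free_gpus) or fpgas > len(self.free_fpgas):
            raise RuntimeError("not enough free accelerators in the pool")
        return {
            "gpus": [self.free_gpus.pop() for _ in range(gpus)],
            "fpgas": [self.free_fpgas.pop() for _ in range(fpgas)],
        }

    def release(self, system):
        """Return a composed system's accelerators to the pool."""
        self.free_gpus.extend(system["gpus"])
        self.free_fpgas.extend(system["fpgas"])

pool = ResourcePool(gpus=["gpu0", "gpu1", "gpu2", "gpu3"], fpgas=["fpga0"])
training_rig = pool.compose(gpus=2)  # borrow two GPUs for a training job
print(len(pool.free_gpus))           # 2 remain for other systems
pool.release(training_rig)           # hand them back when the job ends
print(len(pool.free_gpus))           # all 4 available again
```

The point of the sketch is the economics: the same four GPUs can serve many servers over time, rather than sitting idle inside one.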
However, as analyst Gartner detailed in a recent report (Understand the hype, hope and reality of composable infrastructure), the word “composable” has been used to describe a wide range of equipment from various suppliers.
Composable infrastructure is also currently held back by the difficulty in disaggregating DRAM from processors, as well as a lack of cross-supplier application programming interfaces (APIs). The latter means that composable kit from one supplier is unlikely to work with that from a different supplier.
What does composable infrastructure look like? As mentioned above, there is currently no practical technology for disaggregating main memory from the processor without incurring a performance penalty, so early composable platforms have compromised by using fairly standard x86 servers as the compute resources, combined with a separate pool of configurable storage.
Examples of this are HPE’s Synergy platform and Dell EMC’s PowerEdge MX portfolio. Designed around proprietary enclosures that fit a mix of compute and storage sleds, these resemble blade servers, but with a switched SAS fabric connecting the compute and storage components. This enables the SAS drives to be linked to the compute sleds in various configurations.
Finding the right fabric
A key part of a composable system is the way the component parts interconnect so they can be used together. With Synergy and PowerEdge MX, this is provided by a switched SAS fabric linking the compute and storage sleds. DriveScale, another composable infrastructure firm, uses an Ethernet fabric.
However, if you want to compose more than just storage, you need a connectivity fabric with greater flexibility. One option is PCI-Express (PCIe), which was created as a high-speed interface for connecting devices directly to the processor, but has now been adapted as a means of connecting external hardware as well.
One company that is pushing this approach is Liqid, which bases its composable infrastructure platform around a PCIe switch. This enables hardware such as GPUs, FPGAs and storage to be fitted into separate enclosures in a datacentre rack and shared among several standard x86 servers. The PCIe switch serves as the control point to compose the required configurations.
Another company, GigaIO, has taken a similar approach, creating its own interconnect technology, FabreX, based on PCIe Gen 4. The bandwidth available through PCIe Gen 4 allows the firm to target HPC deployments with its platform.
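That bandwidth claim is easy to sanity-check: PCIe Gen 4 signals at 16 GT/s per lane and uses 128b/130b line encoding, so a full 16-lane link carries roughly 31.5 GB/s in each direction:

```python
# Back-of-the-envelope PCIe Gen 4 bandwidth, per direction.
gt_per_s = 16          # PCIe 4.0 raw signalling rate per lane, in GT/s
encoding = 128 / 130   # 128b/130b line encoding overhead
lanes = 16             # a full x16 slot, as used by GPUs

gb_per_s_per_lane = gt_per_s * encoding / 8  # bits -> bytes
total = gb_per_s_per_lane * lanes
print(round(total, 1))  # ~31.5 GB/s each way
```

That is double the throughput of PCIe Gen 3, which is what makes a Gen 4 fabric credible for HPC-class traffic between disaggregated devices.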
Such flexibility means composable infrastructure may be an attractive option for organisations seeking the best platform to support new and emerging workloads that incorporate approaches such as big data analytics and machine learning.
“I do believe that new workloads align way better with composable infrastructure, as it allows you to right-size hardware configurations as workload requirements change and evolve,” says Julia Palmer, research vice-president at Gartner.
“In addition, use of high-cost components is driving the requirements for better utilisation of, and continuous optimisation of, bare metal hardware resources, which is one of the unique benefits of composable infrastructure.”
Pros and cons
However, organisations interested in composable infrastructure should consider it carefully. According to Palmer, composable infrastructure runs the risk of creating yet another silo in the datacentre that has to be managed separately from existing systems.
This is because some composable infrastructure products do not come with built-in automation and will require additional integration with third-party automation tools to enable infrastructure-as-code and intelligent infrastructure capabilities.
“Composable infrastructure is enabling technology and can only be fully leveraged within organisations that are ready to embrace fully automated server build services, which currently are suffering from lack of standards as every vendor has its own APIs,” says Palmer. “There is a hope for the Redfish ecosystem, however it is too early to tell if all of the vendors are going to fully embrace it.”
Redfish is the name for specifications drawn up by the Distributed Management Task Force for a standard API for management of software-defined IT infrastructure, including composable systems. It is supported by the Ironic bare metal provisioning service in OpenStack and the Ansible automation tool, as well as the baseboard management controller inside some servers.
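In practice, Redfish is a RESTful API that serves JSON over HTTPS, with discovery starting at the standard /redfish/v1/ service root. The sketch below parses an abbreviated example of a service-root response (real payloads carry many more properties, and the version string here is just a sample value) to show how a client finds the managed systems by following advertised links rather than hard-coding supplier-specific paths:

```python
import json

# A trimmed, illustrative example of the JSON a Redfish service root
# (GET /redfish/v1/) returns; real responses contain many more fields.
service_root = json.loads("""
{
    "@odata.id": "/redfish/v1/",
    "Name": "Root Service",
    "RedfishVersion": "1.6.0",
    "Systems": { "@odata.id": "/redfish/v1/Systems" }
}
""")

# Following the advertised link, instead of assuming a path, is what
# lets tools such as Ansible or OpenStack Ironic work across suppliers
# that implement the standard.
systems_path = service_root["Systems"]["@odata.id"]
print(systems_path)  # /redfish/v1/Systems
```

A client would then issue a GET to that path to enumerate the servers the service manages.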
Composable infrastructure can therefore be regarded as a technology still very much in its infancy. This can be seen from the fact that current platforms are unable to disaggregate CPU and memory resources and compose them independently.
There are moves to rectify this, such as via the Gen-Z memory fabric, which is designed to allow memory semantic operations over connections ranging from a simple point-to-point link to a rack-scale, switch-based topology.
In summary, the need for costly accelerator hardware such as GPUs and FPGAs could drive demand for composable infrastructure. However, as Gartner advises, companies must carefully weigh up the cost and benefits of composable infrastructure compared with traditional infrastructure alternatives.