The computer industry has been driven by a simple formula: computing power doubles roughly every 18 months. This is the popular reading of Moore’s Law, and it has meant that software has been able to take advantage of each doubling in processor performance without customers having to spend twice as much on hardware.
Nevertheless, it has become increasingly difficult for chip-makers to continue innovating at the pace required to maintain Moore’s Law. This has become most apparent as organisations start to develop machine learning and artificial intelligence (AI) applications, where traditional processors – or CPUs – have failed to provide the level of performance needed.
Instead, organisations looking at where to take AI are starting to use alternative hardware such as graphics processing units (GPUs) and, more recently, field programmable gate arrays (FPGAs), which promise to deliver better levels of performance.
“Over the past 10 years, Moore’s Law has reached a threshold. All companies that rely on CPUs face this threshold and need to change the way they approach computing,” says Pierre-Etienne Melet, a senior manager in the applied research centre at Amadeus.
He says sophisticated techniques in software engineering have helped limit the impact of this performance plateau. But there is also growing awareness today of hardware innovation based on GPUs, FPGAs and ASICs (application-specific integrated circuits) that promise to accelerate computationally intensive applications such as AI.
“In our innovation group, we started to look at how FPGAs can be used to accelerate machine learning,” says Melet.
Amadeus worked with a team of hardware engineers at ETH in Zurich to investigate the use of FPGAs in inference applications that make decisions based on machine learning.
GPUs tend to be the first place people go when they need to overcome the latency issues associated with using traditional CPUs to run high-performance AI applications. However, while GPUs address the latency problem, Melet says they tend to be power hungry. So the challenge is not only running AI algorithms quickly, but also achieving high performance with efficient use of electrical power.
From a power efficiency perspective, Melet says FPGAs offer an order of magnitude lower power consumption compared with GPUs, which potentially makes them a good candidate for running AI algorithms efficiently.
Beyond the power efficiency argument, Melet says GPUs are not the best choice to run certain data-heavy AI applications. “GPUs process in parallel,” he says. “This means that only a single GPU instruction can be run across the dataset.”
However, he adds: “On an FPGA, you have unlimited granularity. You can process different instructions on different pieces of data in parallel.”
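The contrast Melet describes can be sketched in a toy Python analogy. This is purely illustrative, not hardware code: the "GPU-style" step mimics applying a single instruction across a whole dataset, while the "FPGA-style" step mimics applying different operations to different elements, which an FPGA could do concurrently in dedicated logic.

```python
data = list(range(8))

# GPU-style SIMD: the same instruction is applied to every element
simd_result = [x * 2 for x in data]

# FPGA-style fine granularity: different operations on different
# elements; on real hardware these would run in parallel pipelines,
# modelled here sequentially for illustration only
ops = [lambda x: x * 2, lambda x: x + 1] * 4  # alternating operations
fpga_result = [op(x) for op, x in zip(ops, data)]

print(simd_result)  # [0, 2, 4, 6, 8, 10, 12, 14]
print(fpga_result)  # [0, 2, 4, 4, 8, 6, 12, 8]
```

The point of the analogy is that a GPU extracts speed from uniformity, whereas an FPGA can lay out heterogeneous logic for each stage of a computation.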
That is the benefit, but there are clear drawbacks compared with the GPU approach. From a skills perspective, GPU programming is relatively mature: people understand how to do it, and software development tools are abundant. However, Melet notes, “an FPGA is less flexible; programming is more archaic”.
This is due to its heritage. In the past, FPGAs were used by hardware engineers to prototype integrated circuits. The engineers would use the FPGA to test their designs, before committing them to silicon. Melet says this means programming an FPGA is more complex, claiming it is a different mindset compared to software engineering.
However, in the study Amadeus conducted with ETH, the team found that when an algorithm is programmed to run on an FPGA, the end result can be a lot better than when the same program is run on the most advanced GPUs.
Making decisions quicker
The research involved running a decision tree algorithm on AWS using a CPU instance, a GPU instance and an FPGA instance. The team found that the FPGA instance delivered 130 times the throughput of the CPU instance and four times the throughput of the GPU instance. In effect, Melet says, it is possible to process four times more data on the FPGA than on the GPU for the same workload.
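To give a sense of the workload being benchmarked, here is a minimal sketch of decision-tree inference in Python. The tree layout, features and labels are illustrative assumptions, not the study's actual model; the point is that inference is just repeated compare-and-branch, which maps naturally onto FPGA logic.

```python
# Each internal node: (kind, feature_index, threshold, left_id, right_id);
# each leaf: (kind, class_label). All values here are made up.
TREE = {
    0: ("node", 0, 0.5, 1, 2),
    1: ("leaf", "A"),
    2: ("node", 1, 2.0, 3, 4),
    3: ("leaf", "B"),
    4: ("leaf", "C"),
}

def predict(features):
    """Walk the tree from the root until a leaf is reached."""
    node_id = 0
    while True:
        node = TREE[node_id]
        if node[0] == "leaf":
            return node[1]
        _, feat, thresh, left, right = node
        node_id = left if features[feat] <= thresh else right

print(predict([0.3, 1.0]))  # feature 0 <= 0.5, so leaf "A"
print(predict([0.9, 3.5]))  # right branch, feature 1 > 2.0, so leaf "C"
```

Because each comparison depends only on one feature and one threshold, many such trees (or many inputs) can be evaluated simultaneously in hardware, which is one reason this class of algorithm suits FPGA acceleration.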
And by running the same algorithm on AWS, the team at ETH and Amadeus were able to demonstrate considerable cost savings. They found that running their decision tree algorithm on an FPGA instance was significantly more cost effective than the equivalent workload running on a CPU or GPU instance. According to Melet, the FPGA instance worked out seven times cheaper than running the equivalent workload on an AWS GPU instance, and the CPU version was 28 times more expensive than the FPGA.
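Taking the reported ratios at face value, a quick calculation shows what they imply about the relative cost of the three instance types (the absolute prices are not given in the study, so the FPGA cost is normalised to one unit):

```python
# Relative costs derived from the reported ratios (FPGA = 1 unit)
fpga_cost = 1.0
gpu_cost = 7 * fpga_cost    # FPGA was 7x cheaper than the GPU instance
cpu_cost = 28 * fpga_cost   # CPU version was 28x more expensive than FPGA

# Implied: the CPU run cost four times the GPU run for the same workload
print(cpu_cost / gpu_cost)  # 4.0
```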
For Melet, the study demonstrated that it is feasible to use an FPGA to process AI data quicker, more power efficiently and at lower cost. “We have an idea of the kind of algorithms we could accelerate,” he adds.
FPGA reality check
The FPGA represents a new frontier for software engineering. The first challenge will be a shortage of people with the right skills.
At Amadeus, Melet found a number of individuals who had studied hardware engineering techniques at college and university.
“A lot of people studied hardware engineering at college, so they had a certain level of understanding,” he says. “They found they had no use for these skills until now. That skillset of programming an FPGA for prototyping an integrated circuit can now be used in production. This lets them renew their knowledge.”
It is early days, but Melet expects chip-makers and startups to develop and refine software stacks, tools and libraries to make it easier for software engineers to program FPGAs.