Artificial intelligence: powering the deep-learning machines of
tomorrow
1 Overview of artificial intelligence
Humans are smart, achieving intelligence through years of learning and data accumulation, and arguably growing “wiser” with age. Computers could be considered “smart” because of their capacity to retain data, but until recently they lacked the ability to learn autonomously from these large databases in order to execute tasks or make decisions. While a human brain consumes 20-30 W of power, the latest learning systems consume power at levels that would support a small town as they learn to become 'artificially intelligent'. While we can debate whether computers are getting 'smarter' than humans, there is no debating that the requirements for powering this new generation of supercomputers have changed dramatically.
In some ways, the approach taken to AI deep learning is quite similar to human development, in that computers continue to learn through exposure. In the example below, a neural network is fed thousands of training images that are processed through multiple layers in order to build experience and knowledge.
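The sketch below is a rough, minimal illustration of that layered learning process, written in plain Python/NumPy: a small two-layer network repeatedly adjusts its weights from labelled examples. The layer sizes, learning rate and the random arrays standing in for real training images are illustrative assumptions rather than details taken from the example above.

```python
# Minimal sketch of the layered learning process described above: a
# two-layer network repeatedly adjusts its weights from labelled
# examples. Random arrays stand in for real training images; the layer
# sizes and learning rate are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "images": 1,000 samples of 64x64 pixels, 3 classes
# (e.g. squirrel / chipmunk / fox in the article's example).
X = rng.standard_normal((1000, 64 * 64)).astype(np.float32)
y = rng.integers(0, 3, size=1000)

W1 = rng.standard_normal((64 * 64, 128)).astype(np.float32) * 0.01
W2 = rng.standard_normal((128, 3)).astype(np.float32) * 0.01
lr = 0.1

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for epoch in range(5):
    # Forward pass through the two layers.
    h = np.maximum(X @ W1, 0.0)          # hidden layer (ReLU)
    p = softmax(h @ W2)                  # class probabilities

    # Cross-entropy loss and its gradient with respect to the logits.
    loss = -np.log(p[np.arange(len(y)), y] + 1e-9).mean()
    grad_logits = p.copy()
    grad_logits[np.arange(len(y)), y] -= 1.0
    grad_logits /= len(y)

    # Backpropagate and take a gradient step on each layer's weights.
    gW2 = h.T @ grad_logits
    gW1 = X.T @ ((grad_logits @ W2.T) * (h > 0))
    W2 -= lr * gW2
    W1 -= lr * gW1
    print(f"epoch {epoch}: loss {loss:.3f}")
```

In practice the same loop runs over millions of real images and many more layers, which is where the compute and power demands discussed below come from.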
As a result of this compute-intensive and power-hungry learning process, the network is eventually able to distinguish a squirrel from a chipmunk or a fox. The goal is to achieve AI learning in the shortest amount of time, so parallel computing resources are maximized to reduce computation times roughly linearly.
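The parallelism mentioned above typically takes the form of data parallelism: each processor works on its own slice of the training data and the resulting gradients are averaged before the shared weights are updated. The sketch below imitates that pattern on a single machine with a simple regression problem; the worker count, problem size and learning rate are assumptions chosen only for illustration.

```python
# Sketch of the data-parallel pattern: each "worker" (a GPU in
# practice, a slice of the batch here) computes gradients on its own
# shard, and the averaged result is applied once per step. Doubling the
# number of workers roughly halves the work each one does per step.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4096, 256))
true_w = rng.standard_normal(256)
y = X @ true_w                       # simple regression stand-in

w = np.zeros(256)
num_workers = 8                      # e.g. the GPUs in one server
shards_X = np.array_split(X, num_workers)
shards_y = np.array_split(y, num_workers)

for step in range(50):
    # Each worker computes the squared-error gradient on its shard
    # (in a real cluster these run simultaneously on separate devices).
    grads = [
        2.0 * Xs.T @ (Xs @ w - ys) / len(ys)
        for Xs, ys in zip(shards_X, shards_y)
    ]
    # "All-reduce": average the per-worker gradients, then update once.
    w -= 0.1 * np.mean(grads, axis=0)

print("remaining error:", np.linalg.norm(w - true_w))
```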
The high power consumption of today's AI is driving changes in computing architecture towards designs that replicate neural networks and mimic the human brain, in an effort to reduce power needs. Traditional Central Processing Units (CPUs) are architected to be very flexible, supporting a wide variety of general-purpose programs, and are not optimized for very specific and repetitive tasks such as AI learning.
Many of the necessary functions for AI can be performed by Graphics Processing Units (GPUs). GPUs are designed to perform complex, repetitive mathematical functions more efficiently, can be conveniently connected in parallel to further increase computing power, and can be readily applied to learning applications. With slight modifications, the latest GPU devices process 3x to 10x faster while consuming the same power as a CPU. The early AI market has been dominated by NVIDIA; its DGX-1 GPU supercomputer contains eight Tesla P100 GPUs, each capable of 21.2 TeraFLOPs, and requires 3200 W of total system power. Multiple DGX-1 systems connected in parallel are required to form an effective neural network.
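Putting the quoted figures together gives a rough sense of the power/performance trade-off (peak numbers, not sustained training throughput):

```python
# Back-of-the-envelope check using the figures quoted above
# (peak ratings, not measured training throughput).
gpus_per_system = 8
tflops_per_gpu = 21.2          # Tesla P100, TeraFLOPs
system_power_w = 3200          # total system power

peak_tflops = gpus_per_system * tflops_per_gpu           # ~170 TFLOPs
gflops_per_watt = peak_tflops * 1000 / system_power_w    # ~53 GFLOPs/W

print(f"peak throughput: {peak_tflops:.1f} TFLOPs")
print(f"efficiency:      {gflops_per_watt:.0f} GFLOPs per watt")
```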
Honing the technology even further, Tensor Processing Units (TPUs) are ASICs that have been developed specifically for machine learning. Building on GPU platforms, they use reduced floating-point precision to deliver more compute capability per clock cycle. Rasterization and texture-mapping features are also removed to further improve computation efficiency. Google launched the first TPU in 2015, and Intel is expected to launch Lake Crest this year, targeting Deep Neural Networks (DNNs).
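The effect of reduced precision can be illustrated with a small NumPy experiment: halving the width of the number format (float32 to float16 here) halves the bytes that have to move per value, which is what lets a fixed silicon and power budget perform more operations per clock cycle, at the cost of some accuracy. The matrix sizes and formats below are illustrative assumptions, not the formats used by any particular TPU.

```python
# Illustration of the reduced-precision trade-off behind TPU-style
# designs: narrower number formats mean more values per byte of
# bandwidth and per clock cycle, at some cost in accuracy.
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal((256, 256)).astype(np.float32)
b = rng.standard_normal((256, 256)).astype(np.float32)

exact = a @ b                                   # 32-bit reference
reduced = (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float32)

rel_err = np.abs(exact - reduced).max() / np.abs(exact).max()
print("bytes per value: float32 =", np.float32(0).nbytes,
      ", float16 =", np.float16(0).nbytes)
print(f"worst-case relative error from halving precision: {rel_err:.4f}")
```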
To learn, networks need to be able to sense. Local 'edge devices' include sensors, cameras, data collectors and
local actuators. Connected to the central AI servers via high-speed wireless connections, these low-power devices are the eyes, ears and hands of the neural network. Estimates predict that there will be over 50 billion
edge devices connected to the network by 2020.
It should come as no surprise that, despite the power challenges, the market for AI is growing rapidly, as demonstrated by the approximately 40-fold growth at Google over the past two years.