The human brain is much more efficient than the world’s most powerful computers. A human brain with an average volume of about 1,260cm³ consumes about 12W (watts) of power.
Using this biological marvel, the average person learns a very large number of faces in very little time and can then recognise any of them right away, regardless of expression. People can also glance at a picture and recognise objects from a seemingly infinite number of categories.
Compare that to the most powerful supercomputer in the world, Frontier, which runs at Oak Ridge National Laboratory, spanning 372m² and consuming 40 million watts of power at peak. Frontier processes massive amounts of data to train artificial intelligence (AI) models to recognise a large number of human faces, as long as the faces aren’t showing unusual expressions.
But the training process consumes a lot of energy – and while the resulting models run on smaller computers, they still use a lot of energy. Moreover, the models generated by Frontier can only recognise objects from a few hundred categories – for example, person, dog, car, and so on.
Scientists know some things about how the brain works. They know, for example, that neurons communicate with each other using spikes (brief electrical pulses emitted when accumulated potential crosses a threshold). Scientists have used brain probes to look deeply into the human cortex and register neuronal activity. Those measurements show that a typical neuron spikes only a few times per second, which is very sparse activation. On a very high level, this and other basic principles are clear. But the way neurons compute, the way they participate in learning, and the way connections are made and remade to form memories are still a mystery.
Nevertheless, many of the principles researchers are working on today are likely to be part of a new generation of chips that replace central processing units (CPUs) and graphics processing units (GPUs) 10 or more years from now. Computer designs are also likely to change, moving away from what is called the von Neumann architecture, where processing and data are in different locations and share a bus to transfer information.
New architectures will, for example, collocate processing and storage, as in the brain. Researchers are borrowing this concept and other features of the human brain to make computers faster and more power efficient. This field of study is known as neuromorphic computing, and a lot of the work is being done at the Interuniversity Microelectronics Centre (Imec) in Belgium.
“We tend to think that spiking behaviour is the fundamental level of computation within biological neurons. There are much deeper lying computations going on that we don’t understand – probably down to the quantum level,” says Ilja Ocket, programme manager for Neuromorphic Computing at Imec.
“Even between quantum effects and the high-level behavioural model of a neuron, there are other intermediate functions, such as ion channels and dendritic calculations. The brain is much more complicated than we know. But we’ve already found some aspects we can mimic with today’s technology – and we are already getting a very big payback.”
There is a spectrum of techniques and optimisations that are partially neuromorphic and have already been industrialised. For example, GPU designers are already implementing some of what has been learned from the human brain; and computer designers are already reducing bottlenecks by using multilayer memory stacks. Massive parallelism is another bio-inspired principle used in computers – for example, in deep learning.
Nevertheless, it is very hard for neuromorphic computing researchers to make inroads in mainstream computing because there is already too much momentum around traditional architectures. So rather than try to cause disruption in the computer world, Imec has turned its attention to sensors. Researchers at Imec are looking for ways to “sparsify” data and to exploit that sparsity to accelerate processing in sensors and reduce energy consumption at the same time.
“We focus on sensors that are temporal in nature,” says Ocket. “This includes audio, radar and lidar. It also includes event-based vision, which is a new type of vision sensor that isn’t based on frames but works instead on the principle of your retina. Every pixel independently sends a signal if it senses a significant change in the amount of light it receives.
“We borrowed these ideas and developed new algorithms and new hardware to support these spiking neural networks. Our work now is to demonstrate how low power and low latency this can be when integrated onto a sensor.”
Spiking neural networks on a chip
A neuron accumulates input from all the other neurons it is connected to. When the membrane potential reaches a certain threshold, the axon – the connection coming out of the neuron – emits a spike. This is one of the ways your brain performs computation. And this is what Imec now does on a chip, using spiking neural networks.
“We use digital circuits to emulate the leaky, integrate and fire behaviour of biological spiking neurons,” says Ocket. “They are leaky in the sense that while they integrate, they also lose a bit of voltage on their membrane; they are integrating because they accumulate spikes coming in; and they are firing because the output fires when the membrane potential reaches a certain threshold. We mimic that behaviour.”
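The behaviour Ocket describes can be sketched in a few lines of software. The following is a minimal illustration only, not Imec's circuit: the class name `LIFNeuron`, the `leak` and `threshold` values, and the reset-to-zero rule after firing are all assumptions chosen for clarity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LIFNeuron:
    """Toy leaky integrate-and-fire neuron (illustrative parameters)."""
    leak: float = 0.5        # fraction of potential kept each step (assumed)
    threshold: float = 1.0   # potential at which the neuron fires (assumed)
    potential: float = 0.0   # current membrane potential

    def step(self, input_current: float) -> bool:
        # Leaky: part of the stored potential drains away each step.
        # Integrate: incoming input is added to what remains.
        self.potential = self.potential * self.leak + input_current
        # Fire: emit a spike once the threshold is reached, then reset
        # (resetting to zero is an assumption of this sketch).
        if self.potential >= self.threshold:
            self.potential = 0.0
            return True
        return False


def run(neuron: LIFNeuron, inputs: List[float]) -> List[bool]:
    """Feed a sequence of inputs and record whether each step spiked."""
    return [neuron.step(x) for x in inputs]


if __name__ == "__main__":
    print(run(LIFNeuron(), [0.6, 0.6, 0.6, 0.0]))
```

With steady input the potential builds up until the neuron fires; with no input it never fires at all, which is the property the sparse, event-driven approach relies on.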
The benefit of that mode of operation is that until data changes, no events are generated, and no computations are done in the neural network. Consequently, no energy is used. The sparsity of the spikes within the neural network intrinsically offers low power consumption because computing does not occur constantly.
A spiking neural network is said to be recurrent when it has memory. A spike is not just computed once. Instead, it reverberates into the network, creating a form of memory, which allows the network to recognise temporal patterns, similarly to what the brain does.
Using spiking neural network technology, a sensor transmits tuples that include the X coordinate and the Y coordinate of the pixel that’s spiking, the polarity (whether it’s spiking upward or downward) and the time it spikes. When nothing happens, nothing is transmitted. On the other hand, if things change in a lot of places at once, the sensor creates a lot of events, which becomes a problem because of the size of the tuples.
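To make the tuple format concrete, here is a rough sketch that derives events from two brightness snapshots. Real event-based sensors work asynchronously per pixel rather than by comparing frames, so this is a simulation under stated assumptions: the `Event` type, the `frames_to_events` function and the `threshold` value are hypothetical names and numbers, not part of any sensor's interface.

```python
from typing import List, NamedTuple, Sequence


class Event(NamedTuple):
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 if brightness rose, -1 if it fell
    t: float       # time of the change


def frames_to_events(
    prev: Sequence[Sequence[float]],
    curr: Sequence[Sequence[float]],
    t: float,
    threshold: float = 0.2,  # assumed minimum change that counts as significant
) -> List[Event]:
    """Emit one event for every pixel whose brightness changed enough."""
    events = []
    for y, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for x, (before, after) in enumerate(zip(prev_row, curr_row)):
            delta = after - before
            if abs(delta) >= threshold:
                events.append(Event(x, y, 1 if delta > 0 else -1, t))
    return events


if __name__ == "__main__":
    still = [[0.5, 0.5], [0.5, 0.5]]
    moved = [[0.5, 0.9], [0.1, 0.5]]
    print(frames_to_events(still, still, t=0.0))  # unchanged scene: no events
    print(frames_to_events(still, moved, t=0.1))  # two pixels changed
```

An unchanged scene yields an empty list, while a scene in which every pixel changes yields one tuple per pixel, which is the surge described above.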
To minimise this surge in transmission, the sensor does some filtering by deciding on the bandwidth it should output based on the dynamics of the scene. For example, in the case of an event-based camera, if everything in a frame changes, the camera sends too much data. A frame-based system would handle that much better because it has a constant data rate. To overcome this problem, designers put a lot of intelligence on sensors to filter data – one more way of mimicking human biology.
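The article does not say how the on-sensor filtering is implemented. As one deliberately naive illustration of capping output bandwidth, the sketch below keeps at most a fixed number of events per time window and drops the rest; the function name `limit_event_rate` and its parameters are assumptions, and a real sensor would likely use a smarter, scene-dependent policy.

```python
from typing import List, Optional, Tuple

# (x, y, polarity, timestamp), matching the tuple layout described above
EventTuple = Tuple[int, int, int, float]


def limit_event_rate(
    events: List[EventTuple],
    window: float,
    max_events_per_window: int,
) -> List[EventTuple]:
    """Keep at most `max_events_per_window` events in each time window.

    Assumes `events` is sorted by timestamp. Excess events are simply
    dropped, which is the crudest possible policy.
    """
    kept: List[EventTuple] = []
    current_window: Optional[int] = None
    count = 0
    for ev in events:
        window_index = int(ev[3] // window)
        if window_index != current_window:
            # A new time window starts: reset the budget.
            current_window = window_index
            count = 0
        if count < max_events_per_window:
            kept.append(ev)
            count += 1
    return kept


if __name__ == "__main__":
    burst = [(0, 0, 1, 0.0), (1, 0, 1, 0.1), (2, 0, 1, 0.2), (3, 0, -1, 1.5)]
    print(limit_event_rate(burst, window=1.0, max_events_per_window=2))
```

The output rate is now bounded regardless of how busy the scene is, at the cost of discarding information during bursts.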
“The retina has 100 million receptors, which is like having 100 million pixels in your eye,” says Ocket. “But the optical fibre that goes through your brain only carries a million channels. So, this means the retina carries out a 100 times compression – and this is real computation. Certain features are detected, like motion from left to right, from top to bottom, or little circles. We are trying to mimic the filtering algorithm that goes on in the retina in these event-based sensors, which operate on the edge and feed data back to a central computer. You might think of the computation going on in the retina as a form of edge AI.”
People have been mimicking spiking neurons in silicon since the 1980s. But the main obstacle preventing this technology from reaching a market or any kind of real application was that spiking neural networks could not be trained as efficiently and conveniently as deep neural networks. “Once you establish good mathematical understanding and good techniques to train spiking neural networks, the hardware implementation is almost trivial,” says Ocket.
In the past, people would build spiking into their network chips and then do a lot of fine-tuning to get the neural networks to do something useful. Imec took another approach, developing algorithms in software that showed that a given configuration of spiking neurons with a given set of connections would perform to a certain level. Then they built the hardware.
This kind of breakthrough in software and algorithms is unconventional for Imec, where progress is usually in the form of hardware innovation. Something else that was unconventional for Imec was that they did all this work in standard CMOS, which means their technology can be quickly industrialised.
The future impact of neuromorphic computing
“The next direction we’re taking is towards sensor fusion, which is a hot topic in automotive, robotics, drones and other domains,” says Ocket. “A good way of achieving very high-fidelity 3D perception is to combine multiple sensory modalities. Spiking neural networks will allow us to do that with low power and low latency. Our new target is to develop a new chip specifically for sensor fusion in 2023.
“We aim to fuse multiple sensor streams into a coherent and complete 3D representation of the world. Like the brain, we don’t want to have to think about what comes from the camera versus what comes from the radar. We are going for an intrinsically fused representation.
“We’re hoping to show some very relevant demos for the automotive industry – and for robotics and drones across industries – where the performance and the low latency of our technology really shine,” says Ocket. “First we’re looking for breakthroughs in solving certain corner cases in automotive perception or robotics perception that aren’t possible today because the latency is too high, or the power consumption is too high.”
Two other things Imec expects to happen in the market are the use of event-based cameras and sensor fusion. Event-based cameras have a very high dynamic range and a very high temporal resolution. Sensor fusion might take the form of a single module with cameras in the middle, some radar antennas around it and maybe a lidar, with data fused on the sensor itself using spiking neural networks.
But even when the market takes up spiking neural networks in sensors, the larger public may not be aware of the underlying technology. That will probably change when the first event-based camera gets integrated into a smartphone.
“Let’s say you want to use a camera to recognise your hand gestures as a form of human-machine interface,” explains Ocket. “If that were done with a regular camera, it would constantly look at each pixel in each frame. It would snap a frame, and then decide what’s happening in the frame. But with an event-based camera, if nothing is happening in its field of view, no processing is carried out. It has an intrinsic wake-up mechanism that you can exploit to only start computing when there’s sufficient activity coming off your sensor.”
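The wake-up idea can be illustrated with a simple activity gate: downstream processing starts only when enough events have arrived recently. This is a sketch under assumptions, not a description of any shipping camera; the function name `should_wake` and the `window` and `min_events` parameters are hypothetical.

```python
from typing import Sequence


def should_wake(
    event_timestamps: Sequence[float],
    now: float,
    window: float,
    min_events: int,
) -> bool:
    """Return True if at least `min_events` events fell in the last `window` seconds."""
    recent = [t for t in event_timestamps if now - window <= t <= now]
    return len(recent) >= min_events


if __name__ == "__main__":
    # Nothing in view: the gesture recogniser stays asleep.
    print(should_wake([], now=1.0, window=0.5, min_events=3))
    # A burst of recent events, such as a moving hand, wakes it up.
    print(should_wake([0.6, 0.7, 0.9], now=1.0, window=0.5, min_events=3))
```

A frame-based pipeline has no equivalent shortcut, because it must capture and inspect each frame before it can tell that nothing has happened.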
Human-machine interfaces could suddenly become a lot more natural, all thanks to neuromorphic sensing.