While you likely have at least a vague sense that artificial intelligence is the future of the devices and services around you, you may not know much about the specific technologies that enable machines to process data and react intelligently, such as by recognizing objects or translating speech in real time. Deep learning, a technology that MIT Technology Review reports “attempts to mimic the activity in layers of neurons in the neocortex, the wrinkly 80 percent of the brain where thinking occurs,” can be especially difficult to wrap your head around. Curious about deep learning, and what you need to know about it? Here’s exactly how the stuff of science fiction films is coming to life.
Deep learning software learns to recognize patterns in digital representations of sounds, images, and other data. Robert D. Hof reports for Technology Review that the basic idea that software can simulate the neocortex’s large array of neurons in an artificial neural network “is decades old, and it has led to as many disappointments as breakthroughs.” The term “neural network,” for instance, originated in early efforts to create a system that could emulate the way that the human brain’s individual neurons work together to solve a problem, and while most computer scientists have moved away from the comparisons to the human brain, they’ve stuck with the idea of connecting simple elements that can work together to solve complex problems.
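That idea of wiring simple elements together so they solve problems no single element could is easy to sketch. Below is a toy, untrained neural network in Python with NumPy. The weights are random and purely illustrative; this is only the shape of the idea, not anything like the systems the article describes:

```python
import numpy as np

def sigmoid(x):
    """Squash a neuron's weighted input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    """Pass an input through a stack of simple 'neuron' layers.

    Each layer is just a (weights, bias) pair; the output of one layer
    becomes the input of the next -- simple elements connected so that,
    together, they compute something none of them could alone.
    """
    for w, b in layers:
        x = sigmoid(w @ x + b)
    return x

# A tiny two-layer network with fixed (untrained) random weights.
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((4, 3)), rng.standard_normal(4)),  # hidden layer
    (rng.standard_normal((2, 4)), rng.standard_normal(2)),  # output layer
]
out = forward(np.array([0.5, -0.1, 0.2]), layers)
print(out.shape)  # (2,)
```

Each “neuron” here only computes a weighted sum and a squashing function; the interesting behavior comes from training many stacked layers of them, which this sketch deliberately omits.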
Because of recent improvements in mathematical techniques, and the increasing power of computers, researchers can now model many more layers of virtual neurons than ever before. That greater depth enables more advanced image and speech recognition. For instance, a Google deep learning system that had been shown 10 million images from YouTube videos proved almost twice as good as any previous image recognition software at identifying objects (such as cats). Google used the same technology to reduce the error rate of the speech recognition features in its latest version of Android.
Bob O’Donnell reports for Recode that at the simplest level, many of the current efforts around deep learning “involve very rapid recognition and classification of objects — whether visual, audible or some other form of digital data.” He explains, “Using cameras, microphones and other types of sensors, data is input into a system that contains a multi-level set of filters that provide increasingly detailed levels of differentiation. Think of it like the animal or plant classification charts from your grammar school days: Kingdom, Phylum, Class, Order, Family, Genus, Species.”
The trick is to get the machine to learn the characteristics of these different classification levels, and then use that learning to accurately classify a new object. The term “deep learning” itself refers to the depth of filtering or classification levels — usually 10 or more — used to recognize an object. O’Donnell explains:
In other words, while computers have been able to identify things they’ve seen before, learning to recognize that a new image is not just a dog, but a long-haired miniature dachshund after they’ve “seen” enough pictures of dogs is a critical capability. Actually, what’s really important — and really new — is the ability to do this extremely rapidly and accurately.
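The coarse-to-fine levels O’Donnell compares to the Kingdom-through-Species chart can be caricatured with hand-written rules. The features and checks below are entirely made up for illustration; the point of deep learning is that a real system learns its filters from millions of examples rather than having them spelled out like this:

```python
# Each level refines the previous one's answer, like moving down the
# grammar-school classification chart. These "filters" are stand-in
# rules, not learned ones.

def classify(features):
    levels = []
    if features.get("animate"):
        levels.append("animal")
        if features.get("four_legs"):
            levels.append("dog")
            if features.get("long_body") and features.get("long_hair"):
                levels.append("long-haired miniature dachshund")
    return levels

print(classify({"animate": True, "four_legs": True,
                "long_body": True, "long_hair": True}))
# ['animal', 'dog', 'long-haired miniature dachshund']
```

A deep network does something loosely analogous across ten or more layers, with each layer’s output feeding a finer level of differentiation.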
Further, there are two critical steps involved in the process of deep learning. The first is to complete extensive analysis of giant data sets and to automatically generate rules or algorithms that can accurately describe the characteristics of different objects. (That happens offline, in large data centers that use a variety of different computing architectures.) The second step is to use those rules to identify objects or situations based on real-time data, a process that happens on devices that accept live data input.
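Those two steps can be sketched with a deliberately simple stand-in for the real machinery: a nearest-centroid classifier whose “rules” are distilled offline, then applied to live observations with no further learning. This is nothing like a production deep learning system, only the shape of the train-then-deploy split:

```python
import numpy as np

# Step 1 (offline, in the data center): analyze a labeled data set and
# distill it into compact rules -- here, one centroid per class.
def train(examples):
    return {label: np.mean(points, axis=0) for label, points in examples.items()}

# Step 2 (on the device, in real time): apply the precomputed rules to
# a new observation. No new learning happens at this step.
def recognize(rules, observation):
    return min(rules, key=lambda label: np.linalg.norm(rules[label] - observation))

examples = {
    "cat": np.array([[1.0, 1.0], [1.2, 0.9]]),
    "car": np.array([[8.0, 8.0], [7.8, 8.2]]),
}
rules = train(examples)                      # happens once, offline
print(recognize(rules, np.array([1.1, 1.0])))  # cat
```

The asymmetry is the point: the expensive analysis runs once on big hardware, while the cheap lookup runs continuously on the device.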
O’Donnell notes that while companies are just beginning to talk about bringing deep learning and artificial intelligence to a variety of devices, there’s actually little to no new “learning” happening on those devices. The software, in those cases, is simply focused on recognizing the objects, situations, and data points it’s preprogrammed to look for (based on the rules or algorithms it’s equipped with for a particular application).
Nonetheless, O’Donnell notes that “this is an enormously difficult task because of the need to run the multiple layers of a convolutional neural network in real time.” He posits that iterations of artificial intelligence and deep learning accelerators “will likely be able to bring some of the offline ‘rule creating’ mechanisms onboard so that objects equipped with these components will be able to get smarter over time. Of course, it’s also possible to update the algorithms on existing devices in order to achieve a similar result.”
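To get a feel for why running “multiple layers of a convolutional neural network in real time” is hard, here is a single convolutional filter applied by brute force in NumPy. The filter and image are toy values; a real network slides hundreds of such filters, across many layers, over every frame of live input:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a small filter over the image -- one layer's worth of work.

    Every output pixel costs a full multiply-and-sum over the kernel,
    and a deep network repeats this for many filters at many layers.
    """
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

edge = np.array([[1.0, -1.0]])           # a toy edge-detecting filter
img = np.array([[0.0, 0.0, 1.0, 1.0],
                [0.0, 0.0, 1.0, 1.0]])
result = conv2d(img, edge)
print(result)  # the nonzero entries mark where the brightness changes
```

Dedicated accelerators exist precisely because this multiply-and-sum pattern, repeated billions of times per second, overwhelms a general-purpose processor.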
Hof reports for Technology Review that extending deep learning to applications beyond speech and image recognition will require “more conceptual and software breakthroughs, not to mention many more advances in processing power. And we probably won’t see machines we all agree can think for themselves for years, perhaps decades—if ever.” And for all of the recent advances, some critics doubt that deep learning will be what moves artificial intelligence toward something that rivals human intelligence. They argue that deep learning (and artificial intelligence in general) ignores too much of the brain’s biology in favor of brute-force computing.
But the amount of computing resources that companies like Google are throwing at the development of deep learning can’t be dismissed. The more sophisticated image recognition that deep learning would enable could improve YouTube or Google’s self-driving cars. Deep learning is also likely to be applied to areas like drug discovery, machine vision, or even the prediction of medical problems or traffic jams. Deep learning won’t be able to solve all of the challenges facing artificial intelligence, but it’s going to be an important piece of the devices and services we’ll all be using in the future.