In March, Yoshua Bengio received a share of the Turing Award, the highest accolade in computer science, for contributions to the development of deep learning, the technique that triggered a renaissance in artificial intelligence, leading to advances in self-driving cars, real-time speech translation, and facial recognition.
Now, Bengio says deep learning needs to be fixed. He believes it won’t realize its full potential, and won’t deliver a true AI revolution, until it can go beyond pattern recognition and learn more about cause and effect. In other words, he says, deep learning needs to start asking why things happen.
The 55-year-old professor at the University of Montreal, who sports bushy gray hair and eyebrows, says deep learning works well in idealized situations but won’t come close to replicating human intelligence without being able to reason about causal relationships. “It’s a big thing to integrate [causality] into AI,” Bengio says. “Current approaches to machine learning assume that the trained AI system will be applied on the same kind of data as the training data. In real life it is often not the case.”
Machine learning systems, including deep learning, are highly specific: each is trained for a particular task, such as recognizing cats in images or spoken commands in audio. Since bursting onto the scene around 2012, deep learning has demonstrated a particularly impressive ability to recognize patterns in data; it’s been put to many practical uses, from spotting signs of cancer in medical scans to uncovering fraud in financial data.
But deep learning is fundamentally blind to cause and effect. Unlike a real doctor, a deep learning algorithm cannot explain why a particular image may suggest disease. This means deep learning must be used cautiously in critical situations.
Understanding cause and effect would make existing AI systems smarter and more efficient. A robot that understands that dropping things causes them to break would not need to toss dozens of vases onto the floor to see what happens to them.
Bengio says the analogy extends to self-driving cars. “Humans don’t need to live through many examples of accidents to drive prudently,” he says. They can just imagine accidents, “in order to prepare mentally if it did actually happen.”
The question is how to give AI systems this ability.
At his research lab, Bengio is working on a version of deep learning capable of recognizing simple cause-and-effect relationships. He and colleagues recently posted a research paper outlining the approach. They used a dataset that maps causal relationships between real-world phenomena, such as smoking and lung cancer, in terms of probabilities. They also generated synthetic datasets of causal relationships.
The algorithm in the paper essentially forms a hypothesis about which variables are causally related, and then tests how changes to different variables fit the theory. The fact that smoking is not only related to cancer but actually causes it, for instance, should still be apparent even if cancer is correlated with other factors, such as hospital visits.
A robot might eventually use this approach to form a hypothesis about what happens when it drops something, and then confirm its hunch when it sees several things smash to the floor.
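The difference between "related to" and "causes" in the smoking example can be made concrete with a toy calculation. The sketch below is not the algorithm from Bengio's paper; it is a minimal illustration that assumes a hypothetical three-variable chain (smoking affects cancer, cancer affects hospital visits) with made-up probabilities. It compares what you see by merely observing hospital visits with what happens when a variable is forced to a value, which is the kind of change a causal hypothesis can be tested against.

```python
from itertools import product

# Hypothetical probabilities, chosen for illustration only (not from the paper).
P_SMOKE = 0.3                               # P(smoking = 1)
P_CANCER_GIVEN_SMOKE = {1: 0.20, 0: 0.02}   # P(cancer = 1 | smoking)
P_VISIT_GIVEN_CANCER = {1: 0.90, 0: 0.10}   # P(hospital visit = 1 | cancer)


def bernoulli(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value == 1 else 1.0 - p


def joint(do_smoke=None, do_visit=None):
    """Joint table {(smoke, cancer, visit): probability}.

    Passing do_smoke or do_visit forces that variable to a value,
    replacing its usual mechanism (an intervention).
    """
    table = {}
    for s, c, v in product((0, 1), repeat=3):
        p_s = bernoulli(P_SMOKE, s) if do_smoke is None else float(s == do_smoke)
        p_c = bernoulli(P_CANCER_GIVEN_SMOKE[s], c)
        if do_visit is None:
            p_v = bernoulli(P_VISIT_GIVEN_CANCER[c], v)
        else:
            p_v = float(v == do_visit)
        table[(s, c, v)] = p_s * p_c * p_v
    return table


def p_cancer(table, visit=None):
    """P(cancer = 1), optionally conditioned on an observed visit value."""
    numerator = denominator = 0.0
    for (s, c, v), p in table.items():
        if visit is not None and v != visit:
            continue
        denominator += p
        if c == 1:
            numerator += p
    return numerator / denominator


observed = joint()
# Observation: how much more common is cancer among people seen visiting hospital?
obs_gap_visit = p_cancer(observed, visit=1) - p_cancer(observed, visit=0)
# Intervention: force everyone to visit (or not) and see whether cancer changes.
do_gap_visit = p_cancer(joint(do_visit=1)) - p_cancer(joint(do_visit=0))
# Intervention: force everyone to smoke (or not) and see whether cancer changes.
do_gap_smoke = p_cancer(joint(do_smoke=1)) - p_cancer(joint(do_smoke=0))

print(f"observed visit gap:    {obs_gap_visit:.3f}")
print(f"intervened visit gap:  {do_gap_visit:.3f}")
print(f"intervened smoke gap:  {do_gap_smoke:.3f}")
```

With these assumed numbers, hospital visits look strongly tied to cancer when merely observed (a gap of about 0.41), yet forcing visits changes nothing, while forcing smoking shifts the cancer rate by 0.18. A purely pattern-based learner sees only the first number.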
Bengio has already transformed AI once. Over the past several decades, he helped develop the ideas and engineering techniques that unleashed the potential of deep learning, together with this year’s other Turing Award recipients: Geoffrey Hinton, of the University of Toronto and Google, and Yann LeCun, who works at NYU and Facebook.
Deep learning uses artificial neural networks to mathematically approximate the way human neurons and synapses learn by forming and strengthening connections. Training data, such as images or audio, are fed to a neural network, which is gradually adjusted until it responds in the correct way. A deep learning program can be trained to recognize objects in photographs with high accuracy, provided it sees lots of training images and is given plenty of computing power.
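What "gradually adjusted until it responds in the correct way" means can be sketched at the smallest possible scale. The example below is an assumption-laden toy, not a deep network: a single artificial neuron, with hypothetical names and a made-up learning rate, nudged by gradient descent until it reproduces the logical OR of two inputs. Real systems stack many such units and train on far larger datasets.

```python
import math

# Toy training data: the logical OR of two inputs (illustrative only).
DATA = [
    ((0.0, 0.0), 0.0),
    ((0.0, 1.0), 1.0),
    ((1.0, 0.0), 1.0),
    ((1.0, 1.0), 1.0),
]


def predict(weights, bias, inputs):
    """Output of one sigmoid neuron: a value between 0 and 1."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(epochs=2000, learning_rate=0.5):
    """Repeatedly show the examples and nudge the connection strengths."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in DATA:
            # For a sigmoid neuron with log loss, the gradient with respect
            # to the pre-activation is simply (prediction - target).
            error = predict(weights, bias, inputs) - target
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train()
# Round each output to 0 or 1 to read off the neuron's final answers.
outputs = [round(predict(weights, bias, inputs)) for inputs, _ in DATA]
print(outputs)
```

Each pass shrinks the gap between the neuron's answer and the correct one. Nothing in the procedure records why the answer is correct, only that the pattern holds in the training examples.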
But deep learning algorithms aren’t good at generalizing, or taking what they’ve learned from one context and applying it to another. They also capture phenomena that are correlated—like the rooster crowing and the sun coming up—without regard to which causes the other.
Causality has long been studied in other areas, and mathematical techniques have emerged in recent decades for exploring causal relationships, helping to revolutionize the study of fields including social science, economics, and epidemiology. A small group of researchers is working to combine causality and machine learning.
Judea Pearl, who won the Turing Award in 2011 for his work on causal reasoning, says he is impressed with Bengio’s ideas, although he has not studied them closely. A recent book co-authored by Pearl, The Book of Why: The New Science of Cause and Effect, makes the case that AI will be fundamentally limited without some sort of causal reasoning ability.