Jordana Cepelewicz, Quanta Magazine:
Last month, the artificial intelligence company DeepMind introduced new software that can take a single image of a few objects in a virtual room and, without human guidance, infer what the three-dimensional scene looks like from entirely new vantage points. Given just a handful of such pictures, the system, dubbed the Generative Query Network, or GQN, can successfully model the layout of a simple, video game-style maze.
From the presented image, GQN generates predictions about what a scene should look like — where objects should be located, how shadows should fall against surfaces, which areas should be visible or hidden based on certain perspectives — and uses the differences between those predictions and its actual observations to improve the accuracy of the predictions it will make in the future. “It was the difference between reality and the prediction that enabled the updating of the model,” said Ali Eslami, one of the project’s leaders.
According to Danilo Rezende, Eslami’s co-author and DeepMind colleague, “the algorithm changes the parameters of its [predictive] model in such a way that next time, when it encounters the same situation, it will be less surprised.”
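Rezende’s description amounts to error-driven learning. A toy sketch of that idea (illustrative Python only, not the actual GQN; the single scalar parameter, learning rate, and list of observations are all hypothetical stand-ins) shows a model predicting, measuring the gap to reality, and shifting its parameter so the same situation surprises it less next time:

```python
import random

# Toy illustration (not the actual GQN): the model predicts an observation,
# measures the difference between reality and the prediction, and nudges
# its parameter so the same situation surprises it less next time.

def train(observations, learning_rate=0.1, steps=100):
    theta = 0.0  # the model's single (hypothetical) parameter
    for _ in range(steps):
        obs = random.choice(observations)   # "the same situation" recurs
        prediction = theta                  # model's current best guess
        error = obs - prediction            # reality minus prediction
        theta += learning_rate * error      # update to reduce future surprise
    return theta

random.seed(0)
belief = train([2.0, 2.2, 1.8])  # noisy views of the "same scene"
# belief ends up near the observations' mean (about 2.0)
```

After training, feeding the model the same situation again produces a prediction close to what it will observe, which is what “less surprised” means in this caricature.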
According to the “predictive coding” theory of brain function, at each level of a cognitive process, the brain generates models, or beliefs, about what information it should be receiving from the level below it. These beliefs get translated into predictions about what should be experienced in a given situation, providing the best explanation of what’s out there so that the experience will make sense. The predictions then get sent down as feedback to lower-level sensory regions of the brain. The brain compares its predictions with the actual sensory input it receives, “explaining away” whatever differences, or prediction errors, it can by using its internal models to determine likely causes for the discrepancies.
The prediction errors that can’t be explained away get passed up through connections to higher levels (as “feedforward” signals, rather than feedback), where they’re considered newsworthy, something for the system to pay attention to and deal with accordingly. “The game is now about adjusting the internal models, the brain dynamics, so as to suppress prediction error,” said Karl Friston of University College London, a renowned neuroscientist and one of the pioneers of the predictive coding hypothesis.
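The loop just described, with top-down predictions sent as feedback and residual errors passed up as feedforward signals, can be caricatured in a few lines. This is a deliberately minimal sketch under strong assumptions (a single level, a scalar belief, a hypothetical gain parameter), not a real hierarchical model:

```python
# Minimal sketch of one predictive-coding cycle (illustrative only; real
# models are hierarchical and probabilistic, and the gain is hypothetical).

def predictive_coding_step(belief, sensory_input, gain=0.5):
    prediction = belief                  # top-down feedback: what to expect
    error = sensory_input - prediction   # residual computed at the lower level
    belief += gain * error               # feedforward error adjusts the model
    return belief, error

belief = 0.0
errors = []
for sensory_input in [1.0] * 10:         # a steady, repeated stimulus
    belief, error = predictive_coding_step(belief, sensory_input)
    errors.append(abs(error))
# the error signal shrinks as the internal model comes to predict the input
```

The shrinking error sequence is the point: once the internal model predicts the stimulus well, there is nothing newsworthy left to pass up, which is the “suppression of prediction error” Friston describes.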
While the idea that the brain is constantly making inferences (and comparing them to reality) is fairly well-established at this point, proponents of predictive coding have been seeking ways to prove that their particular version of the story is the right one — and that it extends to all of cognition.
That’s where the “Bayesian brain” comes into play, a general framework with roots dating back to the 1860s that flips the traditional model on its head. The theory proposes that the brain makes probabilistic inferences about the world based on an internal model, essentially calculating a “best guess” about how to interpret what it’s perceiving (in line with the rules of Bayesian statistics, which quantifies the probability of an event based on relevant information gleaned from prior experiences). Rather than waiting for sensory information to drive cognition, the brain is always actively constructing hypotheses about how the world works and using them to explain experiences and fill in missing data. That’s why, according to some experts, we might think of perception as “controlled hallucination.”
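For a concrete sense of that “best guess,” consider the textbook Gaussian case: when both the prior belief and the sensory evidence are Gaussian, Bayes’ rule yields a posterior whose mean is a precision-weighted average of the two. The sketch below is a generic illustration of that rule, not a model from the article; the positions and variances are invented:

```python
# Generic Gaussian Bayes update (illustrative numbers, not from the article):
# the posterior mean is a precision-weighted average of prior and observation.

def bayes_update(prior_mean, prior_var, obs, obs_var):
    prior_precision = 1.0 / prior_var        # confidence in the prior belief
    obs_precision = 1.0 / obs_var            # confidence in the sensory data
    posterior_var = 1.0 / (prior_precision + obs_precision)
    posterior_mean = posterior_var * (prior_precision * prior_mean
                                      + obs_precision * obs)
    return posterior_mean, posterior_var

# Prior: object expected at position 0.0; a noisy observation puts it at 2.0.
mean, var = bayes_update(prior_mean=0.0, prior_var=1.0, obs=2.0, obs_var=1.0)
# equal confidence on both sides, so the best guess lands midway (1.0),
# and the combined estimate is more certain than either source alone (0.5)
```

Note that the estimate is pulled toward the prior in proportion to how much the brain trusts it, which is one way to read “controlled hallucination”: strong priors can dominate weak sensory evidence.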
Part of what makes the predictive coding hypothesis so compelling is its incredible explanatory power. “What I find convincing is how so many things all get accounted for under this story,” said Andy Clark, a professor of logic and metaphysics at the University of Edinburgh and an expert on the theory.
First, it unifies perception and motor control under a single computational process. The two are essentially opposite sides of the same coin: In each case, the brain minimizes prediction errors, but in different ways. With perception, it’s the internal model that gets adjusted to fit the world; with motor control, it’s the actual environment that gets adjusted, through action, until it matches the prediction.
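That symmetry can be made concrete in a toy sketch (an assumption-laden illustration, not a real active-inference model; the gain parameter and scalar world state are hypothetical): the same error term shrinks either by updating the belief or by acting on the world.

```python
# Toy contrast (not a real active-inference model; the gain is hypothetical):
# the same prediction error can be suppressed by changing the belief
# (perception) or by acting to change the environment (motor control).

def perceive(belief, world, gain=0.5):
    error = world - belief
    return belief + gain * error, world      # adjust the internal model

def act(belief, world, gain=0.5):
    error = world - belief
    return belief, world - gain * error      # adjust the environment instead

belief, world = 0.0, 1.0
for _ in range(20):
    belief, world = perceive(belief, world)
# perception: the belief has moved to match the world (close to 1.0)

belief2, world2 = 0.0, 1.0
for _ in range(20):
    belief2, world2 = act(belief2, world2)
# motor control: the world has moved to match the belief (close to 0.0)
```

Both loops drive the same error toward zero; they differ only in which side of the prediction is allowed to move.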
In a study published last year in Neuron, Georg Keller, a neuroscientist at the Friedrich Miescher Institute for Biomedical Research, and his colleagues observed the emergence of neurons in the visual system of mice that became predictive over time. It began with an accident: they set out to train the mice on a video game, only to find that the virtual world had gotten its directions mixed up. Ordinarily — and up until the time of the experiment — the mice saw their field of vision move to the right whenever they turned to the left, and vice versa. But someone had unintentionally flipped the virtual world used in the study, inverting left and right so that turning leftward meant the mice also saw their visual field move leftward. The researchers realized they could capitalize on the accident. They monitored the brain signals representing this visual flow and found that the signals changed slowly as the mice learned the rules of the inverted environment. “The signals looked like predictions of visual flow to the left,” Keller said.
If the signals had simply been sensory representations of the mouse’s visual experience, they would have flipped as soon as the virtual world did. If they had been motor signals, they wouldn’t have flipped at all. Instead, “it is about identifying prediction,” Keller said. “The prediction of visual flow, given movement.”
Similar findings in the parts of the brain that macaques use to process faces were reported around the same time. Previous work had already shown that neurons at lower levels in the network code for orientation-based aspects of a face — by firing at, say, any face in profile. At higher levels, neurons represent the face more abstractly, by paying attention to its identity rather than its position. In the macaque study, the researchers trained monkeys on pairs of faces in which one face, appearing first, always predicted something about the second one. Later, the experimenters interfered with those expectations in specific ways, by showing the same face from a different angle, or an entirely different face. They found prediction errors in lower-level areas of the face processing network, but these errors were associated not with predictions about orientation but with predictions about identity. That is, the errors stemmed from what was going on at higher levels of the system — suggesting that lower levels construct the error signal by comparing incoming perceptions with predictions descending from higher levels.
Some scientists accept that the theory can explain certain aspects of cognition but reject the idea that it could explain everything. Others don’t concede even that much. To David Heeger, a professor of psychology at New York University, it’s important to make a distinction between “predictive coding,” which he says is about transmitting information efficiently, and “predictive processing,” which he defines as prediction-making over time. “There’s a lot of confusion in the literature because these things have been assumed to all be part of the same soup,” he said. “And that’s not necessarily the case, nor is it necessarily the best way to go forward in studying it.” Other types of Bayesian models, for instance, might provide a more accurate description of brain function under certain circumstances.
Predictive coding “is as important to neuroscience as evolution is to biology,” said Lars Muckli, a neurophysiologist at the University of Glasgow who has done extensive work on the theory. But for now, said Mark Sprevak, a philosopher of mind at the University of Edinburgh, “the jury is still out.”