DeepMind's Algorithm Shows the Reward Mechanisms Within Our Brains
Tue, April 20, 2021

DeepMind's algorithm suggests that our brain uses distributional reward predictions to strengthen its learning algorithm / Credits: vampy1 via 123RF


Previous studies have shown that dopamine produced in the brain is involved in reward processing. Dopamine is released when something good happens, which gives people pleasure. Some research also suggested that the brain's dopamine neurons all respond in the same way, a view that researchers from DeepMind, University College London and Harvard University have now challenged.

Recently, a study published in the journal Nature showed that lessons learned from applying learning techniques to AI systems may help explain how reward pathways work in the brain. The researchers proposed that the brain works in a way similar to distributional reinforcement learning, a type of machine learning that predicts the full range of possible rewards rather than a single expected value. This research could improve our understanding of mental health and motivation, and it lends support to the current direction of AI research.

According to MIT Technology Review, an online site that aims to bring about better-informed and more conscious decisions about technology through authoritative, influential, and trustworthy journalism, the researchers tested the theory by observing dopamine neuron behavior in mice. They found that individual neurons released different amounts of dopamine, meaning that each one predicted a different outcome. The findings suggest that the brain does use distributional reward predictions to strengthen its learning algorithm.
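
To make the idea of distributional reward prediction more concrete, here is a minimal Python sketch. It is not the code used in the study. It assumes a simplified setup in which each simulated unit learns from the same rewards but weighs good and bad surprises differently. The function name, the optimism levels and the reward values are all hypothetical and chosen only for illustration.

```python
import random


def learn_value_estimates(rewards, optimism_levels, base_rate=0.02,
                          steps=20000, seed=0):
    """Train one reward estimate per simulated unit.

    Each unit has an optimism level between 0 and 1 (hypothetical
    parameter). Optimistic units learn more from better-than-expected
    rewards, pessimistic units learn more from worse-than-expected ones.
    """
    rng = random.Random(seed)
    estimates = [0.0 for _ in optimism_levels]
    for _ in range(steps):
        reward = rng.choice(rewards)  # sample one reward at random
        for i, optimism in enumerate(optimism_levels):
            error = reward - estimates[i]  # reward prediction error
            # Asymmetric learning rate: positive surprises are scaled by
            # the optimism level, negative surprises by (1 - optimism).
            if error > 0:
                rate = base_rate * optimism
            else:
                rate = base_rate * (1.0 - optimism)
            estimates[i] += rate * error
    return estimates


# Hypothetical reward distribution: usually small, occasionally large.
rewards = [1.0, 1.0, 1.0, 10.0]

# A pessimistic, a balanced and an optimistic unit (illustrative values).
optimism_levels = [0.1, 0.5, 0.9]

estimates = learn_value_estimates(rewards, optimism_levels)
for optimism, estimate in zip(optimism_levels, estimates):
    print(f"optimism {optimism:.1f}: predicted reward {estimate:.2f}")
```

In this sketch the balanced unit settles near the average reward, which is what a classic single-value learner would predict, while the pessimistic and optimistic units settle lower and higher. Taken together, the spread of predictions carries information about the whole range of possible rewards, not just the average.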

The study also has implications for both AI and neuroscience. It validates distributional reinforcement learning as a promising path to more advanced AI capabilities, and it could offer an important update to one of the canonical theories in neuroscience about reward systems in the brain.

“This is a nice extension to the notion of dopamine coding of reward prediction error. It is amazing how this very simple dopamine response predictably follows intuitive patterns of basic biological learning processes that are now becoming a component of AI,” Wolfram Schultz, a pioneer in dopamine neuron behavior who wasn’t involved in the study, said.