Machine Unlearning: What Is It All About?

Written by Archna Oberoi | Apr 1, 2020 12:54:46 PM

The average adult human brain is estimated to have a storage capacity of around 2.5 petabytes. With such tremendous capacity, ‘forgetting things’ can be an awful experience for us humans: for example, being unable to recall a password that was updated the day before, or where the keys were last kept.

Retrieval failure, interference, and failure to store are some of the reasons why humans forget. And there are times when forgetting is more than just a failure to remember. This is called motivated forgetting, wherein people forget unwanted memories, consciously or unconsciously.

Basically, forgetting is an active process that helps the brain dump unnecessary information, take in newer information, and make more meaningful decisions. Inspired by the human brain, data scientists are applying this idea from neuroscience to improve machine learning, giving more power to AI technology.

The Neuroscience of Unlearning:

The human brain has the ability to filter information. It piles up the data and information it receives, filters the useful bits, and then clears out the irrelevant details to make a decision. The unused information is deleted to make space for newer information, just like running disk cleanup on a computer.

“If we define this in neurobiology terms, forgetting happens when the synaptic connections between neurons weaken or are eliminated over time. As the new neurons develop, they rewire the circuits of the hippocampus, overwriting the existing memories.”

Why do Machines even need to Unlearn?

We all know machine learning as one of the most beneficial use cases of artificial intelligence. We use tons of data to make a machine learn and then use this learning in decision making. Machine learning (ML) often makes use of deep learning, which uses artificial neural networks loosely modeled on how the human brain learns.

We see a number of ML examples in our day-to-day lives that illustrate the power of learning from data. If making machines learn can make so many things possible, then why must a machine unlearn, and why is it important?

Consider this. Once users share their data online, it is difficult for them to revoke access or ask for data deletion. Think of it: you accepted the ‘Privacy Policy’ while signing up for an app, and in doing so permitted the app to access your data and share it with third parties. Often, the only option you have is to delete the account in order to get out of the trap.

Machine learning exacerbates such situations. Once a model is trained on gigabytes of data, that data shapes its parameters, and the model can memorize parts of it and draw on them in its predictions and decision making.

Now think of a successful security attack. An attacker injects some data into the training data set and corrupts it. A model trained on this data set will learn from the injected data and may then perform actions that differ from the expected ones.
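As a toy illustration of such an attack, the sketch below uses a "model" whose only parameter is the mean of its training data. The data values, the `fit_mean` helper, and the retrain-from-scratch remedy are all assumptions made for illustration, not a method taken from a specific system.

```python
def fit_mean(records):
    """'Train' a trivial model: its single parameter is the mean of the data."""
    return sum(records) / len(records)

# Hypothetical clean training data and attacker-injected records.
clean_records = [10.0, 11.0, 9.0, 10.0]
injected_records = [100.0, 100.0]

clean_model = fit_mean(clean_records)
poisoned_model = fit_mean(clean_records + injected_records)

# Naive unlearning: drop the injected records and retrain from scratch.
injected_ids = set(range(len(clean_records), len(clean_records) + len(injected_records)))
all_records = clean_records + injected_records
kept = [r for i, r in enumerate(all_records) if i not in injected_ids]
unlearned_model = fit_mean(kept)

print(clean_model, poisoned_model, unlearned_model)
```

Retraining without the bad records removes their influence exactly, but for a model trained on gigabytes of data it can be very expensive, which is part of why unlearning is hard in practice.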

To deal with such scenarios, the only way is to make the model forget part of what it has learned. But making a model unlearn is not an easy task. Here are a few approaches that data scientists draw on to make a machine learning model unlearn.

Elastic Weight Consolidation (EWC)

When a neural network is trained on a particular task, its parameters are adapted to solve that task. When a new task is introduced, the new adaptations overwrite what the neural network learned previously. This phenomenon of overwriting is known as ‘catastrophic forgetting’ in cognitive science and is considered one of the significant limitations of neural networks.

Contrary to neural networks, the human brain learns incrementally. It acquires skills one at a time and then applies previously learned skills and knowledge when learning new tasks. To make this possible in neural networks, Google’s DeepMind researchers introduced the Elastic Weight Consolidation (EWC) algorithm in 2017, which mimics the neuroscience process called synaptic consolidation.

According to neuroscience, two kinds of consolidation occur in the human brain: ‘systems consolidation’ and ‘synaptic consolidation’. In systems consolidation, the memories acquired by quick-learning parts of the brain are imprinted on the slow-learning parts. This can happen consciously or unconsciously, for example, during a dream. In synaptic consolidation, on the other hand, the connections between neurons are less likely to be overwritten if they were highly significant in previously learned tasks.

In neural networks, multiple neurons are used to perform a task. The Elastic Weight Consolidation (EWC) algorithm marks some neural connections as critical, thereby protecting the information they hold from being overwritten or forgotten.

[Figure: An illustration of the learning process for two tasks using EWC]
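The idea of protecting critical connections can be sketched as a penalty added to the loss for the new task. In the sketch below, each weight is pulled back toward its old-task value with a strength proportional to an importance score (in the EWC paper this score is the diagonal of the Fisher information matrix). The function names, the parameter values, and the importance scores are made up for illustration.

```python
def ewc_penalty(params, old_params, importance, strength):
    """Quadratic cost for moving each weight away from its old-task value,
    scaled by how important that weight was for the old task."""
    return 0.5 * strength * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, importance)
    )

def ewc_loss(new_task_loss, params, old_params, importance, strength):
    """Loss on the new task plus the penalty that protects the old task."""
    return new_task_loss + ewc_penalty(params, old_params, importance, strength)

# Hypothetical two-weight network after learning the first task.
old_params = [1.0, 1.0]
importance = [10.0, 0.1]  # assumed: weight 0 is critical, weight 1 is not

# Moving the critical weight by 0.5 costs far more than moving the other one.
move_critical = ewc_penalty([1.5, 1.0], old_params, importance, strength=1.0)
move_free = ewc_penalty([1.0, 1.5], old_params, importance, strength=1.0)
print(move_critical, move_free)
```

Because the penalty is small for unimportant weights, training on the new task is free to overwrite them while leaving the critical ones close to where they were.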

Bottleneck Theory

In 2017, Naftali Tishby, a computer scientist and neuroscientist from the Hebrew University of Jerusalem, presented evidence that deep neural networks learn according to the ‘information bottleneck’ theory.

“It says that the network gets rid of the noisy input data by squeezing the input through a bottleneck, retaining only the features most relevant to general concepts.”

According to Tishby, there are two phases of learning for a neural network: fitting and compression. During the fitting phase, the network learns to label its training data, and during compression, it sheds information about the input data and keeps track of only the strongest features. So, compression is actually a strategic approach to forgetting.
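A rough way to see compression as forgetting is to measure mutual information on a tiny, made-up example. In the sketch below, the input `x` takes four equally likely values and the label depends only on `x // 2`. The toy data, the two hand-written "representations", and the `mutual_information` helper are assumptions for illustration; a real network learns its representation rather than having one chosen by hand.

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Mutual information in bits between the two sides of equally likely pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum(
        (count / n) * log2(count * n / (left[a] * right[b]))
        for (a, b), count in joint.items()
    )

inputs = [0, 1, 2, 3]
labels = [x // 2 for x in inputs]      # the label ignores the low bit of x

memorized = list(inputs)               # representation that keeps everything
compressed = [x // 2 for x in inputs]  # representation that drops the low bit

info_about_input_memorized = mutual_information(list(zip(inputs, memorized)))
info_about_input_compressed = mutual_information(list(zip(inputs, compressed)))
info_about_label_memorized = mutual_information(list(zip(memorized, labels)))
info_about_label_compressed = mutual_information(list(zip(compressed, labels)))

print(info_about_input_memorized, info_about_input_compressed)
print(info_about_label_memorized, info_about_label_compressed)
```

The compressed representation carries half as much information about the input as the memorized one, yet exactly as much about the label. In this toy case, what was forgotten was only noise.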

Long Short-Term Memory Networks (LSTM)

LSTMs are a type of recurrent neural network that uses a gating mechanism to decide which pieces of information need to be remembered, updated, or paid attention to.

Consider this situation. You have been watching a TV series for the last month. Your long-term memory says that the series showcases a wide range of animals in different locations around the world. There is also a short-term memory that represents recent scenes in the show. Then there is the current event: the image being shown right now, say an image of a dog, which could also be a wild relative such as a fox or a coyote.

If we combine these three things, it is possible to form a prediction. The long-term memory says that the TV series is about wildlife, which, when combined with the short-term memory and the current event, predicts that the animal in the show is a coyote and not a dog. Long short-term memory helps the neural network forget irrelevant information, save what is important to remember, and determine which part needs to be focused on at a given point in time.
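The forgetting part of this mechanism can be sketched with a single scalar LSTM cell written in plain Python. The `lstm_step` function follows the standard LSTM gate equations, but the parameter layout and all the numbers are made up for illustration; a real LSTM uses weight matrices learned from data.

```python
from math import exp, tanh

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.
    params maps a gate name to a (weight_x, weight_h, bias) tuple."""
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    forget = sigmoid(pre_activation("forget"))    # how much old memory to keep
    write = sigmoid(pre_activation("input"))      # how much new content to store
    candidate = tanh(pre_activation("candidate")) # the new content itself
    output = sigmoid(pre_activation("output"))    # how much memory to expose

    c = forget * c_prev + write * candidate       # updated long-term memory
    h = output * tanh(c)                          # updated short-term memory
    return h, c

def make_params(forget_bias):
    # Assumed setup: all weights are zero, so only the biases drive the gates.
    return {
        "forget": (0.0, 0.0, forget_bias),
        "input": (0.0, 0.0, -10.0),   # write gate nearly closed
        "candidate": (0.0, 0.0, 0.0),
        "output": (0.0, 0.0, 0.0),
    }

old_memory = 5.0
_, c_kept = lstm_step(1.0, 0.0, old_memory, make_params(forget_bias=10.0))
_, c_forgotten = lstm_step(1.0, 0.0, old_memory, make_params(forget_bias=-10.0))
print(c_kept, c_forgotten)
```

With the forget gate open, the cell carries its old memory forward almost unchanged; with it closed, the memory is wiped to nearly zero in a single step.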

Making Machine Learning Models Unlearn: How to Get Started?

For different ML models, unlearning can serve different purposes. If you think that it's high time for your ML model to unlearn, then connect with our AI experts to get started with it.

Our team, which is dedicated to researching and implementing AI trends, will help you select the right approach for your ML model to unlearn.
