19.7 Catastrophic Forgetting and Continual Learning
Right, let’s talk about the elephant in the neural network: catastrophic forgetting. It’s the infuriating phenomenon where you spend days carefully fine-tuning your model on a new, exciting task, only to discover it has the memory of a goldfish that just got hit on the head. It has completely forgotten how to do its original job. Poof. Gone.

Think of it this way: you painstakingly teach a neural network to be a world-class expert on identifying dog breeds. You then want it to also learn about cats, so you give it a dataset of cats. The network, being an obliging but terribly literal student, goes, “Ah, I see! We are optimizing for cats now! To make room for this new ‘cat’ knowledge, I shall simply overwrite these seemingly unimportant ‘dog’ weights.” And just like that, your world-class dog breed classifier is now merely a mediocre cat detector.

That’s catastrophic forgetting in a nutshell. It’s the model’s tendency to overwrite previously learned knowledge (the weights crucial for task A) when it’s trained on new data (for task B). Nothing malicious is going on: the loss for task B only measures performance on task B, so gradient descent has no reason to leave the task A weights alone.
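If you’d like to watch the overwriting happen, here is a minimal sketch in NumPy. Everything in it is an assumption made up for illustration: a tiny two-weight logistic regression stands in for the network, “task A” and “task B” are synthetic 2-D classification problems whose decision boundaries are at right angles, and the helper names (`make_task`, `train`, `accuracy`) are hypothetical. It is a toy, not a benchmark, but it shows the same weights being trained first on A and then on B with nothing protecting what was learned for A.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_task(direction, n=2000):
    """Sample 2-D Gaussian points; label 1 if they lie on the positive side of `direction`."""
    x = rng.standard_normal((n, 2))
    y = (x @ direction > 0).astype(float)
    return x, y


def train(w, x, y, lr=0.5, steps=300):
    """Full-batch gradient descent on the logistic loss (no bias term, no regularization)."""
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w)))   # predicted probability of class 1
        grad = x.T @ (p - y) / len(y)        # gradient of the mean logistic loss
        w = w - lr * grad
    return w


def accuracy(w, x, y):
    """Fraction of points whose predicted class matches the label."""
    return float(np.mean((x @ w > 0) == (y > 0.5)))


# Two toy tasks that share the same two weights but want different boundaries.
dir_a = np.array([1.0, 1.0])    # task A: is x0 + x1 > 0 ?
dir_b = np.array([1.0, -1.0])   # task B: is x0 - x1 > 0 ?

xa, ya = make_task(dir_a)
xb, yb = make_task(dir_b)
xa_test, ya_test = make_task(dir_a)
xb_test, yb_test = make_task(dir_b)

w = np.zeros(2)

# Phase 1: learn task A.
w = train(w, xa, ya)
acc_a_before = accuracy(w, xa_test, ya_test)

# Phase 2: keep training the SAME weights, but only on task B data.
w = train(w, xb, yb)
acc_a_after = accuracy(w, xa_test, ya_test)
acc_b_after = accuracy(w, xb_test, yb_test)

print(f"task A accuracy after training on A: {acc_a_before:.2f}")
print(f"task A accuracy after training on B: {acc_a_after:.2f}")
print(f"task B accuracy after training on B: {acc_b_after:.2f}")
```

In this setup, task A accuracy should start high and then fall to roughly coin-flip level once the weights have been pulled toward task B’s boundary. A real network has far more weights and the damage is usually less total, but the mechanism is the same one described above.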