While modern machine learning systems act with a semblance of artificial intelligence, the truth is they don’t “understand” any of the data they work with — which in turn means they tend to store even trivial items forever. Facebook researchers have proposed structured forgetfulness as a way for AI to clear the decks a bit, improving their performance and inching that much closer to how a human mind works.
The researchers describe the problem by explaining how humans and AI agents might approach a similar problem.
Say there are ten doors of various colors. You’re asked to go through the yellow one. You do so, and a few minutes later you have forgotten the colors of the other doors, because it was never important that two were red, one plaid, two walnut, etc., only that they weren’t yellow and that the one you chose was. Your brain discarded that information almost immediately.
But an AI might very well have kept the colors and locations of the other nine doors in its memory. That’s because it doesn’t understand the problem or the data intuitively — so it keeps all the information it used to make its decision.
This isn’t an issue when you’re talking about relatively small amounts of data, but machine learning algorithms, especially during training, now routinely handle millions of data points and ingest terabytes of imagery or language. And because they’re built to constantly compare new data with their accrued knowledge, failing to forget unimportant things means they’re bogged down by constant references to pointless or outdated data points.
The solution hit upon by Facebook researchers is essentially (and wouldn’t we all like to have this ability) for the model to tell itself how long it needs to remember a piece of data at the moment it first evaluates it.
“Each individual memory is associated with a predicted expiration date, and the scale of the memory depends on the task,” explained Angela Fan, a Facebook AI researcher who worked on the Expire-Span paper. “The amount of time memories are held depends on the needs of the task—it can be for a few steps or until the task is complete.”
So in the case of the doors, the colors of the non-yellow doors are plenty important until you find the yellow one. At that point it’s safe to forget the rest, though of course depending on how many other doors need to be checked, the memory could be held for various amounts of time. (A more realistic example might be forgetting faces that aren’t the one the system is looking for, once it finds it.)
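To make the idea of a per-memory expiration date concrete, here is a toy sketch in Python. It is not the Expire-Span implementation: the `Memory` and `ExpiringStore` names are hypothetical, and the spans are hand-picked numbers standing in for what a trained model would predict for each item.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    content: str
    created_at: int  # step at which the item was stored
    span: int        # hypothetical predicted number of steps to keep it

    def expired(self, step: int) -> bool:
        # An item expires once it has been held for `span` steps.
        return step - self.created_at >= self.span


@dataclass
class ExpiringStore:
    items: list = field(default_factory=list)

    def add(self, content: str, step: int, span: int) -> None:
        self.items.append(Memory(content, step, span))

    def prune(self, step: int) -> None:
        # Drop every memory whose expiration has passed.
        self.items = [m for m in self.items if not m.expired(step)]

    def contents(self) -> list:
        return [m.content for m in self.items]


store = ExpiringStore()
# Assumption: these spans are made up for illustration, not learned.
store.add("door 1 is red", step=0, span=3)
store.add("door 2 is plaid", step=1, span=2)
store.add("door 3 is yellow", step=2, span=100)

store.prune(step=3)  # the yellow door has been found; short-lived items lapse
remaining = store.contents()
print(remaining)
```

In this sketch only the yellow door survives pruning, because it was given a long span while the others were given short ones. In the system the researchers describe, that span would be predicted by the model itself rather than set by hand.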
Analyzing a long piece of text, the memory of certain words or phrases might matter until the end of a sentence, a paragraph, or longer — it depends on whether the agent is trying to determine who’s speaking, what chapter the sentence belongs to, or what genre the story is.
This improves performance because at the end, there’s simply less information for the model to sort through. Without expiration, the system doesn’t know whether the other doors might be important, so that information is kept ready at hand, increasing the size and decreasing the speed of the model.
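A rough back-of-the-envelope sketch shows why a smaller memory matters. It assumes, purely for illustration, that every new item is compared once against everything still in memory, and that expiring items all share one fixed span; the `total_comparisons` function and its parameters are hypothetical, not taken from the paper.

```python
def total_comparisons(num_steps, span=None):
    """Count comparisons when each new item is checked against every stored one.

    span=None keeps everything forever; otherwise an item is dropped
    `span` steps after it was stored (a fixed, made-up value).
    """
    memory = []  # steps at which the surviving items were stored
    comparisons = 0
    for step in range(num_steps):
        if span is not None:
            # Forget anything that has been held for `span` steps or more.
            memory = [t for t in memory if step - t < span]
        comparisons += len(memory)  # compare the new item to what remains
        memory.append(step)
    return comparisons


keep_all = total_comparisons(1000)
with_expiry = total_comparisons(1000, span=10)
print(keep_all, with_expiry)
```

Under these assumptions, remembering everything over 1,000 steps costs 499,500 comparisons, while a 10-step span costs 8,955. Real models differ in many ways, but the shape of the saving is the same: work grows with what is kept, not with what has been seen.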
Fan said the models trained using Expire-Span performed better and were more efficient, taking up less memory and compute time. That’s important during training and testing, which can take up thousands of hours of processing, meaning even a small improvement is considerable. It also matters at the end user level, where the same task takes less power and happens faster. Suddenly it makes sense to perform an operation on a photo live rather than after the fact.
Though being able to forget does in some ways bring AI processes closer to human cognition, it’s still nowhere near the intuitive and subtle ways our minds operate. Of course, being able to pick what to remember and for how long is a major advantage over those of us for whom those parameters are chosen seemingly randomly.