Differential geometric framework for constructing FIPs in weight space. a, Left: conventional training on a task finds a single trained network (w_t) solution. Right: the FIP strategy discovers a submanifold of isoperformance networks (w₁, w₂…w_N) for a task of interest, enabling the efficient search for networks endowed with adversarial robustness (w₂), sparse networks with high task performance (w₃) and for learning multiple tasks without forgetting (w₄). b, Top: a trained CNN with weight configuration (w_t), represented by lines connecting different layers of the network, accepts an input image x and produces a ten-element output vector, f(x, w_t). Bottom: perturbation of network weights by dw results in a new network with weight configuration w_t + dw with an altered output vector, f(x, w_t + dw), for the same input, x. c, The FIP algorithm identifies weight perturbations θ* that minimize the distance moved in output space and maximize alignment with the gradient of a secondary objective function (∇_wL). The light-blue arrow indicates an ϵ-norm weight perturbation that minimizes distance moved in output space and the dark-blue arrow indicates an ϵ-norm weight perturbation that maximizes alignment with the gradient of the objective function, L(x, w). The secondary objective function L(x, w) is varied to solve distinct machine learning challenges. d, Path sampling algorithm defines FIPs, γ(t), through the iterative identification of ϵ-norm perturbations (θ*(t)) in the weight space. Credit: Nature Machine Intelligence (2024). DOI: 10.1038/s42256-024-00902-x

Overcoming 'catastrophic forgetting': Algorithm inspired by brain allows neural networks to retain knowledge

09 Oct 2024, 19:14 by California Institute of Technology · Tech Xplore

Neural networks have a remarkable ability to learn specific tasks, such as identifying handwritten digits. However, these models often experience "catastrophic forgetting" when taught additional tasks: They can successfully learn the new assignments, but "forget" how to complete the original one. For many artificial neural networks, such as those that guide self-driving cars, learning additional tasks thus requires that they be fully reprogrammed.

Biological brains, on the other hand, are remarkably flexible. Humans and animals can easily learn how to play a new game, for instance, without having to re-learn how to walk and talk.

Inspired by the flexibility of human and animal brains, Caltech researchers have now developed a new type of algorithm that enables neural networks to be continuously updated with new data without having to start from scratch. The algorithm, called a functionally invariant path (FIP) algorithm, has wide-ranging applications, from improving recommendations on online stores to fine-tuning self-driving cars.

The algorithm was developed in the laboratory of Matt Thomson, assistant professor of computational biology and a Heritage Medical Research Institute (HMRI) Investigator. The research is described in a new study appearing in the journal Nature Machine Intelligence.

Thomson and former graduate student Guru Raghavan, Ph.D., were inspired by neuroscience research at Caltech, particularly in the laboratory of Carlos Lois, Research Professor of Biology. Lois studies how birds can rewire their brains to learn how to sing again after a brain injury. Humans can do this too; people who have experienced brain damage from a stroke, for instance, can often forge new neural connections to learn everyday functions again.

"This was a yearslong project that started with the basic science of how brains flexibly learn," says Thomson. "How do we give this capability to artificial neural networks?"

The team developed the FIP algorithm using differential geometry, a branch of mathematics. The framework allows a neural network to be modified without losing previously encoded information.
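
To give a rough feel for what "modified without losing previously encoded information" can mean, here is a minimal Python sketch. It is not the authors' implementation. It assumes a toy model whose outputs are linear in the weights (`X @ w`), and it swaps in a plain null-space projection for the paper's differential geometric machinery. The function name `fip_step`, the step size `eps`, and the secondary objective (shrinking the weight norm) are all hypothetical choices made for illustration.

```python
import numpy as np

def fip_step(jacobian, grad_secondary, eps=0.05):
    """Return one eps-norm weight step (illustrative only, not the paper's code).

    The step follows the negative gradient of a secondary objective, but only
    within directions that leave the model outputs unchanged to first order.
    """
    n_weights = jacobian.shape[1]
    # Projector onto the null space of the output Jacobian: moving along
    # these directions does not change the outputs to first order.
    projector = np.eye(n_weights) - np.linalg.pinv(jacobian) @ jacobian
    direction = -projector @ grad_secondary
    norm = np.linalg.norm(direction)
    if norm < 1e-12:
        # No output-preserving direction improves the secondary objective.
        return np.zeros(n_weights)
    return eps * direction / norm

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 10))   # toy setup: 3 inputs, 10 weights
w_t = rng.normal(size=10)      # stands in for the trained weights

# Assumed secondary objective: 0.5 * ||w||^2, whose gradient is w itself.
w = w_t.copy()
for _ in range(20):
    # For this linear toy model the output Jacobian is simply X.
    w = w + fip_step(X, grad_secondary=w)

output_drift = np.linalg.norm(X @ w - X @ w_t)
print("output drift:", output_drift)
print("weight norm before and after:", np.linalg.norm(w_t), np.linalg.norm(w))
```

In this toy case the outputs stay fixed along the whole path because the model is linear in its weights. For a real network the Jacobian changes from point to point, so it would have to be recomputed at each step and the outputs would be preserved only approximately.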

In 2022, with guidance from Julie Schoenfeld, Caltech Entrepreneur In Residence, Raghavan and Thomson started a company called Yurts to further develop the FIP algorithm and deploy machine learning systems at scale to address many different problems. Raghavan co-founded Yurts with industry professionals Ben Van Roo and Jason Schnitzer.

Raghavan is the study's first author. In addition to Raghavan and Thomson, Caltech co-authors are graduate students Surya Narayanan Hari and Shichen Rex Liu, and collaborator Dhruvil Satani. Bahey Tharwat of Alexandria University in Egypt is also a co-author. Thomson is an affiliated faculty member with the Tianqiao and Chrissy Chen Institute for Neuroscience at Caltech.

More information: Guruprasad Raghavan et al, Engineering flexible machine learning systems by traversing functionally invariant paths, Nature Machine Intelligence (2024). DOI: 10.1038/s42256-024-00902-x
Journal information: Nature Machine Intelligence

Provided by California Institute of Technology