DOI: 10.1145/3503161.3548054

Class Gradient Projection For Continual Learning

Published: 10 October 2022

Abstract

Catastrophic forgetting is one of the most critical challenges in Continual Learning (CL). Recent approaches tackle this problem by projecting the gradient update orthogonally to the gradient subspace of existing tasks. While the results are remarkable, these approaches ignore the fact that the calculated gradients are not guaranteed to be orthogonal to the gradient subspace of each class, due to class deviation in tasks, e.g., distinguishing "Man" from "Sea" vs. differentiating "Boy" from "Girl". This strategy may therefore still cause catastrophic forgetting for some classes. In this paper, we propose Class Gradient Projection (CGP), which calculates the gradient subspace from individual classes rather than from tasks. Gradient updates orthogonal to the gradient subspace of existing classes can effectively minimize interference from other classes. To improve generalization and efficiency, we further design a Base Refining (BR) algorithm that combines similar classes and refines class bases dynamically. Moreover, we leverage a contrastive learning method to improve the model's ability to handle unseen tasks. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed approach, which improves on previous methods by 2.0% on the CIFAR-100 dataset. The code is available at https://github.com/zackschen/CGP.
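
To make the core idea concrete, the following is a minimal NumPy sketch of projecting a gradient orthogonally to a subspace built from per-class bases. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how class subspaces are computed, so the sketch assumes a truncated SVD with an energy threshold, and the names class_basis, merge_bases, project_orthogonal and the energy parameter are all hypothetical. Base Refining and the contrastive objective are not modeled; the simple merge step only re-orthonormalizes the stacked bases.

```python
import numpy as np


def class_basis(class_matrix, energy=0.95):
    """Orthonormal basis for one class (assumption: truncated SVD).

    class_matrix has one column per sample of that class. The leading left
    singular vectors are kept until they capture `energy` of the squared
    singular-value mass.
    """
    u, s, _ = np.linalg.svd(class_matrix, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(ratio, energy)) + 1
    return u[:, :k]


def merge_bases(bases, tol=1e-10):
    """Combine per-class bases into one orthonormal basis of their joint span."""
    stacked = np.concatenate(bases, axis=1)
    u, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return u[:, s > tol * s[0]]


def project_orthogonal(grad, basis):
    """Remove the component of `grad` that lies in span(basis)."""
    return grad - basis @ (basis.T @ grad)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "classes", each low-rank in an 8-dimensional space.
    class_a = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 20))
    class_b = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 20))
    basis = merge_bases([class_basis(class_a), class_basis(class_b)])

    grad = rng.standard_normal(8)
    update = project_orthogonal(grad, basis)
    # The projected update should have (numerically) no component
    # along any stored class direction.
    print("max overlap with class subspace:", np.abs(basis.T @ update).max())
```

Because the update has no component inside the stored class subspace, a step along it leaves outputs that depend only on those directions unchanged to first order, which is the intuition behind reducing interference with previously learned classes.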


Cited By

  • (2024) Generating Prompts in Latent Space for Rehearsal-free Continual Learning. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 8913-8922. DOI: 10.1145/3664647.3681003. Online publication date: 28-Oct-2024
  • (2024) Avoiding Catastrophic Forgetting Via Neuronal Decay. 2024 Wave Electronics and its Application in Information and Telecommunication Systems (WECONF), pp. 1-6. DOI: 10.1109/WECONF61770.2024.10564665. Online publication date: 3-Jun-2024
  • (2023) CUCL: Codebook for Unsupervised Continual Learning. Proceedings of the 31st ACM International Conference on Multimedia, pp. 1729-1737. DOI: 10.1145/3581783.3611713. Online publication date: 27-Oct-2023
  • (2023) Progressive Neural Networks for Continuous Classification of Retinal Optical Coherence Tomography Images. 2023 Eleventh International Conference on Advanced Cloud and Big Data (CBD), pp. 156-161. DOI: 10.1109/CBD63341.2023.00036. Online publication date: 18-Dec-2023


Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. continual learning
  2. contrastive learning
  3. gradient projection
  4. lifelong learning

Qualifiers

  • Research-article

Funding Sources

  • Chinese National Science & Technology Pillar Program
  • National Natural Science Foundation of China

Conference

MM '22
Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
