Learning to Forget: Continual Prediction with LSTM

Article Properties
  • Language
    English
  • Publication Date
    2000/10/01
  • Indian UGC (Journal)
  • References
    14
  • Citations
    2,066
  • Felix A. Gers IDSIA, 6900 Lugano, Switzerland
  • Jürgen Schmidhuber IDSIA, 6900 Lugano, Switzerland
  • Fred Cummins IDSIA, 6900 Lugano, Switzerland
Cite
Gers, Felix A., et al. “Learning to Forget: Continual Prediction With LSTM”. Neural Computation, vol. 12, no. 10, 2000, pp. 2451-71, https://doi.org/10.1162/089976600300015015.
Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to Forget: Continual Prediction with LSTM. Neural Computation, 12(10), 2451-2471. https://doi.org/10.1162/089976600300015015
Gers FA, Schmidhuber J, Cummins F. Learning to Forget: Continual Prediction with LSTM. Neural Computation. 2000;12(10):2451-71.
Journal Categories
Medicine
Internal medicine
Neurosciences
Biological psychiatry
Neuropsychiatry
Science
Mathematics
Instruments and machines
Electronic computers
Computer science
Technology
Electrical engineering
Electronics
Nuclear engineering
Electronics
Technology
Mechanical engineering and machinery
Description

How can long short-term memory (LSTM) networks overcome limitations in processing continual input streams? This paper addresses a weakness of LSTM networks: without explicit resets, their internal cell states can grow without bound, eventually causing the network to break down. The solution is a novel, adaptive 'forget gate' that lets LSTM cells learn when to reset their own state, freeing up internal resources. The research revisits benchmark problems on which standard LSTM outperforms other recurrent neural network algorithms, then shows that these algorithms, including LSTM, fail on continual versions of the same problems. With forget gates, LSTM solves these continual tasks in an elegant manner. This innovation enhances the capacity of LSTM networks to handle long, uninterrupted sequences, improving their applicability in real-world scenarios requiring continual prediction. The concept provides a path forward for continual learning in neural networks.
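The mechanism the paper introduces can be illustrated in a few lines of code. The sketch below is a minimal Python/NumPy illustration of an LSTM step with a forget gate in the now-standard formulation; the function name, weight layout, and gate ordering are assumptions made for this example rather than notation taken from the paper.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    # Illustrative LSTM cell step with a forget gate (standard modern formulation,
    # not the paper's exact notation). W has shape (4*n_hid, n_in + n_hid).
    n = h_prev.shape[0]
    z = np.concatenate([x, h_prev])   # current input joined with previous hidden state
    gates = W @ z + b
    f = sigmoid(gates[0*n:1*n])       # forget gate: how much of the old cell state to keep
    i = sigmoid(gates[1*n:2*n])       # input gate: how much of the new candidate to write
    g = np.tanh(gates[2*n:3*n])       # candidate cell update
    o = sigmoid(gates[3*n:4*n])       # output gate
    c = f * c_prev + i * g            # a forget gate near zero acts as a learned reset
    h = o * np.tanh(c)                # hidden state exposed to the rest of the network
    return h, c

# Tiny usage example with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.normal(scale=0.1, size=(4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(10):
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)

In practice the forget-gate bias is often initialized to a positive value so that cells start out remembering by default; this is a common convention in such implementations, not a detail quoted from the paper.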

Published in Neural Computation, this research fits the journal's focus on neural networks and learning algorithms. The development of a novel forget gate for LSTM networks contributes to ongoing advances in recurrent neural network architectures, squarely within the journal's scope.

Citations Analysis
The first work to cite this article was published in 2000, and the most recent citation comes from a 2024 study. The article reached its peak of 442 citations in 2023. It has been cited in 866 different journals, 22% of which are open access. Among citing journals, IEEE Access has cited this research the most, with 97 citations. The chart below illustrates the annual citation trends for this article.
[Chart: citations of this article by year]