research-article

Learning long-term dependencies with gradient descent is difficult

Published: 01 March 1994

Abstract

Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production, or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching on information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.
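The trade-off described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a single recurrent unit with no input, h_t = tanh(w * h_{t-1}), and the function name `latch_gradient` and the values of `w` and `h0` are hypothetical choices made for illustration. With w > 1 the unit has two stable states and can hold (latch) one of them, and the code tracks how strongly the state at step t still depends on the initial state.

```python
import math


def latch_gradient(w: float, h0: float, steps: int) -> list:
    """Return |d h_t / d h_0| for t = 1..steps in h_t = tanh(w * h_{t-1}).

    Illustrative single-unit recurrence (an assumption, not the paper's setup).
    """
    h = h0
    grad = 1.0
    magnitudes = []
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2),
        # so the gradient back to h_0 is the product of these factors.
        grad *= w * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes


if __name__ == "__main__":
    # w = 2.0 gives stable states near +/-0.96, so the unit can latch a sign.
    mags = latch_gradient(w=2.0, h0=0.5, steps=50)
    for t in (1, 10, 25, 50):
        print(f"step {t:2d}: |dh_t/dh_0| = {mags[t - 1]:.3e}")
```

In this toy setting the state settles into a stable value, which is what makes latching robust, but each per-step factor w * (1 - h_t^2) is then below 1, so the gradient with respect to the distant past shrinks roughly geometrically with the number of steps.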

Published In

IEEE Transactions on Neural Networks, Volume 5, Issue 2
March 1994
185 pages

Publisher

IEEE Press

