Mousavi et al., 2017 - Google Patents
Traffic light control using deep policy‐gradient and value‐function‐based reinforcement learning
- Document ID
- 15577450001624371510
- Author
- Mousavi S
- Schukat M
- Howley E
- Publication year
- 2017
- Publication venue
- IET Intelligent Transport Systems
Snippet
Recent advances in combining deep neural network architectures with reinforcement learning (RL) techniques have shown promising potential results in solving complex control problems with high‐dimensional state and action spaces. Inspired by these successes, in …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation, e.g. linear programming, "travelling salesman problem" or "cutting stock problem"
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mousavi et al. | | Traffic light control using deep policy‐gradient and value‐function‐based reinforcement learning |
Wang et al. | | Deep reinforcement learning for transportation network combinatorial optimization: A survey |
Cai et al. | | Proxylessnas: Direct neural architecture search on target task and hardware |
Tong et al. | | Directed graph contrastive learning |
Wang et al. | | Interpretable decision-making for autonomous vehicles at highway on-ramps with latent space reinforcement learning |
CN113362491B (en) | | Vehicle track prediction and driving behavior analysis method |
Wei et al. | | Learning motion rules from real data: Neural network for crowd simulation |
Balhara et al. | | A survey on deep reinforcement learning architectures, applications and emerging trends |
Genders et al. | | Policy analysis of adaptive traffic signal control using reinforcement learning |
CN113326884B (en) | | Efficient learning method and device for large-scale heterograph node representation |
Liu et al. | | Smart city moving target tracking algorithm based on quantum genetic and particle filter |
Wang et al. | | TransWorldNG: Traffic simulation via foundation model |
Liu et al. | | Graph convolution-based deep reinforcement learning for multi-agent decision-making in interactive traffic scenarios |
Kumar et al. | | Adaptive traffic light control using deep reinforcement learning technique |
Huang et al. | | Improving traffic signal control operations using proximal policy optimization |
Jiang et al. | | A general scenario-agnostic reinforcement learning for traffic signal control |
Quek et al. | | Deep Q‐network implementation for simulated autonomous vehicle control |
Xu et al. | | Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey |
Chen et al. | | Deep Q-learning with hybrid quantum neural network on solving maze problems |
Zhang et al. | | A Survey of Generative Techniques for Spatial-Temporal Data Mining |
Liu et al. | | Graph convolution-based deep reinforcement learning for multi-agent decision-making in mixed traffic environments |
Hu et al. | | Dynamic traffic signal control using mean field multi‐agent reinforcement learning in large scale road‐networks |
US20230289563A1 (en) | | Multi-node neural network constructed from pre-trained small networks |
CN116484016B (en) | | Time sequence knowledge graph reasoning method and system based on automatic maintenance of time sequence path |
Gora et al. | | Investigating performance of neural networks and gradient boosting models approximating microscopic traffic simulations in traffic optimization tasks |