Path integral guided policy search
Chebotar et al., 2017
- Document ID
- 2788940950755169416
- Author
- Chebotar Y
- Kalakrishnan M
- Yahya A
- Li A
- Schaal S
- Levine S
- Publication year
- 2017
- Publication venue
- 2017 IEEE International Conference on Robotics and Automation (ICRA)
Snippet
We present a policy search method for learning complex feedback control policies that map from high-dimensional sensory inputs to motor torques, for manipulation tasks with discontinuous contact dynamics. We build on a prior technique called guided policy search …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/004—Artificial life, i.e. computers simulating life
- G06N3/008—Artificial life, i.e. computers simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. robots replicating pets or humans in their appearance or behavior
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/39—Robotics, robotics to robotics hand
- G05B2219/39376—Hierarchical, learning, recognition and skill level and adaptation servo level
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Chebotar et al. | Path integral guided policy search | |
Levine et al. | End-to-end training of deep visuomotor policies | |
Finn et al. | Deep visual foresight for planning robot motion | |
Rahmatizadeh et al. | Vision-based multi-task manipulation for inexpensive robots using end-to-end learning from demonstration | |
Finn et al. | Deep spatial autoencoders for visuomotor learning | |
US10717191B2 (en) | Apparatus and methods for haptic training of robots | |
Rozo et al. | Learning physical collaborative robot behaviors from human demonstrations | |
Antotsiou et al. | Task-oriented hand motion retargeting for dexterous manipulation imitation | |
Billard et al. | Discriminative and adaptive imitation in uni-manual and bi-manual tasks | |
Finn et al. | Learning visual feature spaces for robotic manipulation with deep spatial autoencoders | |
Rahmatizadeh et al. | From virtual demonstration to real-world manipulation using LSTM and MDN | |
Bischoff et al. | Policy search for learning robot control using sparse data | |
Yin et al. | Learning nonlinear dynamical system for movement primitives | |
Khadivar et al. | Adaptive fingers coordination for robust grasp and in-hand manipulation under disturbances and unknown dynamics | |
Xue et al. | Guided optimal control for long-term non-prehensile planar manipulation | |
Kim et al. | Learning and generalization of dynamic movement primitives by hierarchical deep reinforcement learning from demonstration | |
Yang et al. | Real-time motion adaptation using relative distance space representation | |
Liu et al. | A variable impedance skill learning algorithm based on kernelized movement primitives | |
El-Fakdi et al. | Policy gradient based reinforcement learning for real autonomous underwater cable tracking | |
Kubota et al. | Multiple fuzzy state-value functions for human evaluation through interactive trajectory planning of a partner robot | |
Malone et al. | Efficient motion-based task learning for a serial link manipulator | |
Luck et al. | Extracting bimanual synergies with reinforcement learning | |
Ting et al. | Locally Weighted Regression for Control. | |
Antonelo et al. | Modeling multiple autonomous robot behaviors and behavior switching with a single reservoir computing network | |
Mosbach et al. | Learning generalizable tool use with non-rigid grasp-pose registration |