Computer Science, 2021, Vol. 48, Issue (9): 257-263. doi: 10.11896/jsjkx.200700044
• Artificial Intelligence •
WU Shao-bo, FU Qi-ming, CHEN Jian-ping, WU Hong-jie, LU You