Abstract
We present a novel sparsification and value-function approximation method for online reinforcement learning in continuous state and action spaces. Our approach is based on the kernel least-squares temporal difference (KLSTD) learning algorithm. We derive a recursive version and enhance the algorithm with a new sparsification mechanism based on topology maps represented by proximity graphs. Besides speeding up computation, the sparsification mechanism favors data points that minimize the divergence of the target-function gradient, thereby also taking the shape of the target function into account. The performance of our sparsification and approximation method is evaluated on a standard benchmark reinforcement learning problem.
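Since the abstract only summarizes the recursive KLSTD update and the sparsification step, the sketch below illustrates the general idea: a kernel-based value approximator whose dictionary of representative states is grown online by a sparsification test, with the LSTD matrix inverse maintained recursively via Sherman-Morrison. All names (`KernelLSTD`, `rbf`, `tol`, and so on) are our own, and the novelty test here is a crude nearest-kernel placeholder, not the paper's gradient-divergence / proximity-graph criterion.

```python
# Minimal, illustrative sketch of recursive kernel LSTD with online
# dictionary sparsification. NOT the authors' method: the dictionary
# admission test below is a simple kernel-similarity placeholder for
# the paper's proximity-graph / gradient-divergence criterion.
import numpy as np

def rbf(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two state vectors."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return np.exp(-d @ d / (2.0 * sigma ** 2))

class KernelLSTD:
    def __init__(self, gamma=0.95, sigma=1.0, tol=0.3, reg=1e-3):
        self.gamma, self.sigma, self.tol, self.reg = gamma, sigma, tol, reg
        self.dict = []      # sparse dictionary of representative states
        self.A_inv = None   # inverse of the regularized LSTD matrix
        self.b = None       # LSTD right-hand side

    def _phi(self, s):
        """Feature vector: kernel values against every dictionary point."""
        return np.array([rbf(s, d, self.sigma) for d in self.dict])

    def _maybe_add(self, s):
        """Admit s to the dictionary if it is sufficiently novel.
        Placeholder test; the paper instead scores candidate points by
        the divergence of the target-function gradient on a proximity graph."""
        if not self.dict or max(rbf(s, d, self.sigma) for d in self.dict) < 1 - self.tol:
            self.dict.append(np.asarray(s, float))
            n = len(self.dict)
            # Restart the recursion when the feature space grows; a real
            # implementation would grow A_inv and b incrementally instead.
            self.A_inv = np.eye(n) / self.reg
            self.b = np.zeros(n)

    def update(self, s, r, s_next):
        """Sherman-Morrison rank-one update of A^{-1} for one transition."""
        self._maybe_add(s)
        self._maybe_add(s_next)
        phi, phi_next = self._phi(s), self._phi(s_next)
        u = phi
        v = phi - self.gamma * phi_next
        Au = self.A_inv @ u
        self.A_inv -= np.outer(Au, v @ self.A_inv) / (1.0 + v @ Au)
        self.b += r * phi

    def value(self, s):
        """Approximate V(s) = sum_i alpha_i k(s, d_i)."""
        if self.A_inv is None:
            return 0.0
        alpha = self.A_inv @ self.b
        return float(self._phi(s) @ alpha)
```

A toy usage example on a contracting random walk (illustrative only, not the paper's benchmark):

```python
rng = np.random.default_rng(0)
vf = KernelLSTD(gamma=0.9)
s = rng.normal(size=2)
for _ in range(200):
    s_next = 0.8 * s + 0.1 * rng.normal(size=2)
    vf.update(s, r=-np.linalg.norm(s), s_next=s_next)
    s = s_next
print(len(vf.dict), vf.value(np.zeros(2)))
```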
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Jakab, H.S., Csató, L. (2013). Novel Feature Selection and Kernel-Based Value Approximation Method for Reinforcement Learning. In: Mladenov, V., Koprinkova-Hristova, P., Palm, G., Villa, A.E.P., Appollini, B., Kasabov, N. (eds) Artificial Neural Networks and Machine Learning – ICANN 2013. Lecture Notes in Computer Science, vol 8131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40728-4_22
DOI: https://doi.org/10.1007/978-3-642-40728-4_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40727-7
Online ISBN: 978-3-642-40728-4