Gerald Tesauro
Person information
- affiliation: IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA
2020 – today
- 2023
- [i28] Tyler Malloy, Miao Liu, Matthew D. Riemer, Tim Klinger, Gerald Tesauro, Chris R. Sims: Learning in Factored Domains with Information-Constrained Visual Representations. CoRR abs/2303.17508 (2023)
- 2022
- [c68] Marwa Abdulhai, Dong-Ki Kim, Matthew Riemer, Miao Liu, Gerald Tesauro, Jonathan P. How: Context-Specific Representation Abstraction for Deep Option Learning. AAAI 2022: 5959-5967
- [c67] Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How: Influencing Long-Term Behavior in Multiagent Reinforcement Learning. NeurIPS 2022
- [i27] Junkyu Lee, Michael Katz, Don Joven Agravante, Miao Liu, Tim Klinger, Murray Campbell, Shirin Sohrabi, Gerald Tesauro: AI Planning Annotation for Sample Efficient Reinforcement Learning. CoRR abs/2203.00669 (2022)
- [i26] Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How: Influencing Long-Term Behavior in Multiagent Reinforcement Learning. CoRR abs/2203.03535 (2022)
- [i25] Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Gerald Tesauro, Jonathan P. How: Game-Theoretical Perspectives on Active Equilibria: A Preferred Solution Concept over Nash Equilibria. CoRR abs/2210.16175 (2022)
- 2021
- [c66] Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell: Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines. AAAI 2021: 9018-9027
- [c65] Tyler Malloy, Tim Klinger, Miao Liu, Gerald Tesauro, Matthew Riemer, Chris R. Sims: RL Generalization in a Theory of Mind Game Through a Sleep Metaphor (Student Abstract). AAAI 2021: 15841-15842
- [c64] Tyler Malloy, Chris R. Sims, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro: Capacity-Limited Decentralized Actor-Critic for Multi-Agent Games. CoG 2021: 1-8
- [c63] Tyler Malloy, Tim Klinger, Miao Liu, Gerald Tesauro, Matthew Riemer, Chris R. Sims: Modeling Capacity-Limited Decision Making Using a Variational Autoencoder. CogSci 2021
- [c62] Dong-Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan P. How: A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning. ICML 2021: 5541-5550
- [c61] Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro: Efficient Black-Box Planning Using Macro-Actions with Focused Effects. IJCAI 2021: 4024-4031
- [i24] Marwa Abdulhai, Dong-Ki Kim, Matthew Riemer, Miao Liu, Gerald Tesauro, Jonathan P. How: Context-Specific Representation Abstraction for Deep Option Learning. CoRR abs/2109.09876 (2021)
- 2020
- [c60] Matthew Riemer, Ignacio Cases, Clemens Rosenbaum, Miao Liu, Gerald Tesauro: On the Role of Weight Sharing During Deep Option Learning. AAAI 2020: 5519-5526
- [c59] Dong-Ki Kim, Miao Liu, Shayegan Omidshafiei, Sebastian Lopez-Cot, Matthew Riemer, Golnaz Habibi, Gerald Tesauro, Sami Mourad, Murray Campbell, Jonathan P. How: Learning Hierarchical Teaching Policies for Cooperative Agents. AAMAS 2020: 620-628
- [c58] Gang Wang, Songtao Lu, Georgios B. Giannakis, Gerald Tesauro, Jian Sun: Decentralized TD Tracking with Linear Function Approximation and its Finite-Time Analysis. NeurIPS 2020
- [i23] Cameron Allen, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro: Finding Macro-Actions with Disentangled Effects for Efficient Planning with the Goal-Count Heuristic. CoRR abs/2004.13242 (2020)
- [i22] Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell: Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines. CoRR abs/2010.03790 (2020)
- [i21] Tyler Malloy, Chris R. Sims, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro: Deep RL With Information Constrained Policies: Generalization in Continuous Control. CoRR abs/2010.04646 (2020)
- [i20] Dong-Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan P. How: A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning. CoRR abs/2011.00382 (2020)
- [i19] Tyler Malloy, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro, Chris R. Sims: Consolidation via Policy Information Regularization in Deep RL for Multi-Agent Games. CoRR abs/2011.11517 (2020)
2010 – 2019
- 2019
- [c57] Xiaoxiao Guo, Shiyu Chang, Mo Yu, Gerald Tesauro, Murray Campbell: Hybrid Reinforcement Learning with Expert State Sequences. AAAI 2019: 3739-3746
- [c56] Shayegan Omidshafiei, Dong-Ki Kim, Miao Liu, Gerald Tesauro, Matthew Riemer, Christopher Amato, Murray Campbell, Jonathan P. How: Learning to Teach in Cooperative Multiagent Reinforcement Learning. AAAI 2019: 6128-6136
- [c55] Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro: Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference. ICLR (Poster) 2019
- [i18] Dong-Ki Kim, Miao Liu, Shayegan Omidshafiei, Sebastian Lopez-Cot, Matthew Riemer, Golnaz Habibi, Gerald Tesauro, Sami Mourad, Murray Campbell, Jonathan P. How: Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning. CoRR abs/1903.03216 (2019)
- [i17] Xiaoxiao Guo, Shiyu Chang, Mo Yu, Gerald Tesauro, Murray Campbell: Hybrid Reinforcement Learning with Expert State Sequences. CoRR abs/1903.04110 (2019)
- [i16] Matthew Riemer, Ignacio Cases, Clemens Rosenbaum, Miao Liu, Gerald Tesauro: On the Role of Weight Sharing During Deep Option Learning. CoRR abs/1912.13408 (2019)
- 2018
- [j23] Ron Sun, David Silver, Gerald Tesauro, Guang-Bin Huang: Introduction to the special issue on deep reinforcement learning: An editorial. Neural Networks 107: 1-2 (2018)
- [c54] Shuohang Wang, Mo Yu, Xiaoxiao Guo, Zhiguo Wang, Tim Klinger, Wei Zhang, Shiyu Chang, Gerry Tesauro, Bowen Zhou, Jing Jiang: R3: Reinforced Ranker-Reader for Open-Domain Question Answering. AAAI 2018: 5981-5988
- [c53] Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell: Eigenoption Discovery through the Deep Successor Representation. ICLR (Poster) 2018
- [c52] Shuohang Wang, Mo Yu, Jing Jiang, Wei Zhang, Xiaoxiao Guo, Shiyu Chang, Zhiguo Wang, Tim Klinger, Gerald Tesauro, Murray Campbell: Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering. ICLR (Poster) 2018
- [c51] Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Yu Cheng, Gerald Tesauro, Haoyu Wang, Bowen Zhou: Diverse Few-Shot Text Classification with Multiple Metrics. NAACL-HLT 2018: 1206-1215
- [c50] Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogério Schmidt Feris: Dialog-based Interactive Image Retrieval. NeurIPS 2018: 676-686
- [c49] Matthew Riemer, Miao Liu, Gerald Tesauro: Learning Abstract Options. NeurIPS 2018: 10445-10455
- [i15] Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Yu Cheng, Gerald Tesauro, Haoyu Wang, Bowen Zhou: Diverse Few-Shot Text Classification with Multiple Metrics. CoRR abs/1805.07513 (2018)
- [i14] Shayegan Omidshafiei, Dong-Ki Kim, Miao Liu, Gerald Tesauro, Matthew Riemer, Christopher Amato, Murray Campbell, Jonathan P. How: Learning to Teach in Cooperative Multiagent Reinforcement Learning. CoRR abs/1805.07830 (2018)
- [i13] Matthew Riemer, Miao Liu, Gerald Tesauro: Learning Abstract Options. CoRR abs/1810.11583 (2018)
- [i12] Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro: Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference. CoRR abs/1810.11910 (2018)
- 2017
- [j22] Mohan Sridharan, Gerald Tesauro, James A. Hendler: Cognitive Computing. IEEE Intell. Syst. 32(4): 3-4 (2017)
- [c48] Iulian Vlad Serban, Tim Klinger, Gerald Tesauro, Kartik Talamadupula, Bowen Zhou, Yoshua Bengio, Aaron C. Courville: Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation. AAAI 2017: 3288-3294
- [c47] Ruben Rodriguez Torrado, Jesus Rios, Gerald Tesauro: Optimal Sequential Drilling for Hydrocarbon Field Development Planning. AAAI 2017: 4734-4739
- [c46] Xiaoxiao Guo, Tim Klinger, Clemens Rosenbaum, Joseph P. Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerry Tesauro, Satinder Singh: Learning to Query, Reason, and Answer Questions On Ambiguous Texts. ICLR (Poster) 2017
- [i11] Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Gerald Tesauro, Haoyu Wang, Bowen Zhou: Robust Task Clustering for Deep Many-Task Learning. CoRR abs/1708.07918 (2017)
- [i10] Shuohang Wang, Mo Yu, Xiaoxiao Guo, Zhiguo Wang, Tim Klinger, Wei Zhang, Shiyu Chang, Gerald Tesauro, Bowen Zhou, Jing Jiang: R3: Reinforced Reader-Ranker for Open-Domain Question Answering. CoRR abs/1709.00023 (2017)
- [i9] Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell: Eigenoption Discovery through the Deep Successor Representation. CoRR abs/1710.11089 (2017)
- [i8] Shuohang Wang, Mo Yu, Jing Jiang, Wei Zhang, Xiaoxiao Guo, Shiyu Chang, Zhiguo Wang, Tim Klinger, Gerald Tesauro, Murray Campbell: Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering. CoRR abs/1711.05116 (2017)
- [i7] Miao Liu, Marlos C. Machado, Gerald Tesauro, Murray Campbell: The Eigenoption-Critic Framework. CoRR abs/1712.04065 (2017)
- 2016
- [c45] Ashish Sabharwal, Horst Samulowitz, Gerald Tesauro: Selecting Near-Optimal Learners via Incremental Data Allocation. AAAI 2016: 2007-2015
- [i6] Ashish Sabharwal, Horst Samulowitz, Gerald Tesauro: Selecting Near-Optimal Learners via Incremental Data Allocation. CoRR abs/1601.00024 (2016)
- [i5] Sarath Chandar, Sungjin Ahn, Hugo Larochelle, Pascal Vincent, Gerald Tesauro, Yoshua Bengio: Hierarchical Memory Networks. CoRR abs/1605.07427 (2016)
- [i4] Iulian Vlad Serban, Tim Klinger, Gerald Tesauro, Kartik Talamadupula, Bowen Zhou, Yoshua Bengio, Aaron C. Courville: Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation. CoRR abs/1606.00776 (2016)
- 2015
- [j21] Stefano V. Albrecht, André da Motta Salles Barreto, Darius Braziunas, David L. Buckeridge, Heriberto Cuayáhuitl, Nina Dethlefs, Markus Endres, Amir-massoud Farahmand, Mark Fox, Lutz Frommberger, Sam Ganzfried, Yolanda Gil, Sébastien Guillet, Lawrence E. Hunter, Arnav Jhala, Kristian Kersting, George Dimitri Konidaris, Freddy Lécué, Sheila A. McIlraith, Sriraam Natarajan, Zeinab Noorian, David Poole, Rémi Ronfard, Alessandro Saffiotti, Arash Shaban-Nejad, Biplav Srivastava, Gerald Tesauro, Rosario Uceda-Sosa, Guy Van den Broeck, Martijn van Otterlo, Byron C. Wallace, Paul Weng, Jenna Wiens, Jie Zhang: Reports of the AAAI 2014 Conference Workshops. AI Mag. 36(1): 87-98 (2015)
- [c44] Kareem Amin, Satyen Kale, Gerald Tesauro, Deepak S. Turaga: Budgeted Prediction with Expert Advice. AAAI 2015: 2490-2496
- [c43] Alain Biem, Maria Butrico, Mark Feblowitz, Tim Klinger, Yuri Malitsky, Kenney Ng, Adam Perer, Chandra Reddy, Anton Riabov, Horst Samulowitz, Daby M. Sow, Gerald Tesauro, Deepak S. Turaga: Towards Cognitive Automation of Data Science. AAAI 2015: 4268-4269
- 2014
- [i3] Gerald Tesauro, David Gondek, Jonathan Lenchner, James Fan, John M. Prager: Analysis of Watson's Strategies for Playing Jeopardy! CoRR abs/1402.0571 (2014)
- 2013
- [j20] Gerry Tesauro, David Gondek, Jonathan Lenchner, James Fan, John M. Prager: Analysis of Watson's Strategies for Playing Jeopardy! J. Artif. Intell. Res. 47: 205-251 (2013)
- 2012
- [j19] Gerry Tesauro, David Gondek, Jon Lenchner, James Fan, John M. Prager: Simulation, learning, and optimization techniques in Watson's game strategies. IBM J. Res. Dev. 56(3): 16 (2012)
- [c42] Janusz Marecki, Gerald Tesauro, Richard B. Segal: Playing repeated Stackelberg games with unknown opponents. AAMAS 2012: 821-828
- [c41] Joseph P. Bigus, Ching-Hua Chen-Ritzo, Keith Hermiz, Gerald Tesauro, Robert Sorrentino: Applying a framework for healthcare incentives simulation. WSC 2012: 80:1-80:12
- [i2] Gerald Tesauro, V. T. Rajan, Richard B. Segal: Bayesian Inference in Monte-Carlo Tree Search. CoRR abs/1203.3519 (2012)
- [i1] Craig Boutilier, Rajarshi Das, Jeffrey O. Kephart, Gerald Tesauro, William E. Walsh: Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation. CoRR abs/1212.2443 (2012)
- 2010
- [c40] Gerald Tesauro, V. T. Rajan, Richard B. Segal: Bayesian Inference in Monte-Carlo Tree Search. UAI 2010: 580-588
2000 – 2009
- 2009
- [c39] David Silver, Gerald Tesauro: Monte-Carlo simulation balancing. ICML 2009: 945-952
- 2008
- [c38] Rajarshi Das, Jeffrey O. Kephart, Charles Lefurgy, Gerald Tesauro, David W. Levine, Hoi Y. Chan: Autonomic multi-agent management of power and performance in data centers. AAMAS (Industry Track) 2008: 107-114
- [c37] Irina Rish, Gerald Tesauro: Active Collaborative Prediction with Maximum Margin Matrix Factorization. ISAIM 2008
- 2007
- [j18] Gerald Tesauro, Nicholas K. Jong, Rajarshi Das, Mohamed N. Bennani: On the use of hybrid reinforcement learning for autonomic resource allocation. Clust. Comput. 10(3): 287-299 (2007)
- [j17] Gerald Tesauro: Reinforcement Learning in Autonomic Computing: A Manifesto and Case Studies. IEEE Internet Comput. 11(1): 22-30 (2007)
- [c36] Jeffrey O. Kephart, Hoi Y. Chan, Rajarshi Das, David W. Levine, Gerald Tesauro, Freeman L. Rawson III, Charles Lefurgy: Coordinating Multiple Autonomic Managers to Achieve Specified Power-Performance Tradeoffs. ICAC 2007: 24
- [c35] Irina Rish, Gerald Tesauro: Estimating End-to-End Performance by Collaborative Prediction with Active Sampling. Integrated Network Management 2007: 294-303
- [c34] Gerald Tesauro, Rajarshi Das, Hoi Y. Chan, Jeffrey O. Kephart, David W. Levine, Freeman L. Rawson III, Charles Lefurgy: Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning. NIPS 2007: 1497-1504
- [c33] Kilian Q. Weinberger, Gerald Tesauro: Metric Learning for Kernel Regression. AISTATS 2007: 612-619
- 2006
- [c32] Gerald Tesauro, Nicholas K. Jong, Rajarshi Das, Mohamed N. Bennani: Improvement of Systems Management Policies Using Hybrid Reinforcement Learning. ECML 2006: 783-791
- [c31] Gerald Tesauro, Nicholas K. Jong, Rajarshi Das, Mohamed N. Bennani: A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation. ICAC 2006: 65-73
- 2005
- [c30] Relu Patrascu, Craig Boutilier, Rajarshi Das, Jeffrey O. Kephart, Gerald Tesauro, William E. Walsh: New Approaches to Optimization and Utility Elicitation in Autonomic Computing. AAAI 2005: 140-145
- [c29] Gerald Tesauro: Online Resource Allocation Using Decompositional Reinforcement Learning. AAAI 2005: 886-891
- [c28] Gerald Tesauro, Rajarshi Das, William E. Walsh, Jeffrey O. Kephart: Utility-Function-Driven Resource Allocation in Autonomic Systems. ICAC 2005: 342-343
- 2004
- [c27] Gerald Tesauro, David M. Chess, William E. Walsh, Rajarshi Das, Alla Segal, Ian Whalley, Jeffrey O. Kephart, Steve R. White: A Multi-Agent Systems Approach to Autonomic Computing. AAMAS 2004: 464-471
- [c26] William E. Walsh, Gerald Tesauro, Jeffrey O. Kephart, Rajarshi Das: Utility Functions in Autonomic Systems. ICAC 2004: 70-77
- 2003
- [c25] Gerald Tesauro: Extending Q-Learning to General Adaptive Multi-Agent Systems. NIPS 2003: 871-878
- [c24] Cuihong Li, Gerald Tesauro: A strategic decision model for multi-attribute bilateral negotiation with alternating. EC 2003: 208-209
- [c23] James E. Hanson, Gerald Tesauro, Jeffrey O. Kephart, E. C. Snibl: Multi-agent implementation of asymmetric protocol for bilateral negotiations. EC 2003: 224-225
- [c22] Craig Boutilier, Rajarshi Das, Jeffrey O. Kephart, Gerald Tesauro, William E. Walsh: Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation. UAI 2003: 89-97
- 2002
- [j16] Gerald Tesauro, Jeffrey O. Kephart: Pricing in Agent Economies Using Multi-Agent Q-Learning. Auton. Agents Multi Agent Syst. 5(3): 289-304 (2002)
- [j15] Gerald Tesauro: Programming backgammon using self-teaching neural nets. Artif. Intell. 134(1-2): 181-199 (2002)
- [c21] Gerald Tesauro, Jonathan Bredin: Strategic sequential bidding in auctions using dynamic programming. AAMAS 2002: 591-598
- 2001
- [c20] Rajarshi Das, James E. Hanson, Jeffrey O. Kephart, Gerald Tesauro: Agent-Human Interactions in the Continuous Double Auction. IJCAI 2001: 1169-1187
- [c19] Gerald Tesauro: Pricing in Agent Economies Using Neural Networks and Multi-agent Q-Learning. Sequence Learning 2001: 288-307
- [c18] Gerald Tesauro, Rajarshi Das: High-performance bidding agents for the continuous double auction. EC 2001: 206-209
- 2000
- [j14] Gerald Tesauro, Jeffrey O. Kephart: Foresight-based pricing algorithms in agent economies. Decis. Support Syst. 28(1-2): 49-60 (2000)
- [c17] Manu Sridharan, Gerald Tesauro: Multi-Agent Q-Learning and Regression Trees for Automated Pricing Decisions. ICMAS 2000: 447-448
- [c16] Jeffrey O. Kephart, Gerald Tesauro: Pseudo-convergent Q-Learning by Competitive Pricebots. ICML 2000: 463-470
- [c15] Manu Sridharan, Gerald Tesauro: Multi-agent Q-learning and Regression Trees for Automated Pricing Decisions. ICML 2000: 927-934
1990 – 1999
- 1999
- [c14] Amy Greenwald, Jeffrey O. Kephart, Gerald Tesauro: Strategic pricebot dynamics. EC 1999: 58-67
- 1998
- [j13] Gerald Tesauro: Comments on "Co-Evolution in the Successful Learning of Backgammon Strategy". Mach. Learn. 32(3): 241-243 (1998)
- [c13] Gerald Tesauro, Jeffrey O. Kephart: Foresight-based pricing algorithms in an economy of software agents. ICE 1998: 37-44
- 1996
- [c12] Gerald Tesauro, Gregory R. Galperin: On-line Policy Improvement using Monte-Carlo Search. NIPS 1996: 1068-1074
- 1995
- [j12] Gerald Tesauro: Temporal Difference Learning and TD-Gammon. Commun. ACM 38(3): 58-68 (1995)
- [j11] Gerald Tesauro: Temporal Difference Learning and TD-Gammon. J. Int. Comput. Games Assoc. 18(2): 88 (1995)
- [c11] Jeffrey O. Kephart, Gregory B. Sorkin, William C. Arnold, David M. Chess, Gerald Tesauro, Steve R. White: Biologically Inspired Defenses Against Computer Viruses. IJCAI (1) 1995: 985-996
- [e2] Gerald Tesauro, David S. Touretzky, Todd K. Leen: Advances in Neural Information Processing Systems 7, [NIPS Conference, Denver, Colorado, USA, 1994]. MIT Press 1995 [contents]
- 1994
- [j10] Gerald Tesauro: TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play. Neural Comput. 6(2): 215-219 (1994)
- [e1] Jack D. Cowan, Gerald Tesauro, Joshua Alspector: Advances in Neural Information Processing Systems 6, [7th NIPS Conference, Denver, Colorado, USA, 1993]. Morgan Kaufmann 1994, ISBN 1-55860-322-0 [contents]
- 1992
- [j9] Gerald Tesauro: Practical Issues in Temporal Difference Learning. Mach. Learn. 8: 257-277 (1992)
- [j8] David A. Cohn, Gerald Tesauro: How Tight Are the Vapnik-Chervonenkis Bounds? Neural Comput. 4(2): 249-269 (1992)
- [c10] Gerald Tesauro: Temporal Difference Learning of Backgammon Strategy. ML 1992: 451-457
- 1991
- [j7] Jakub Wejchert, Gerald Tesauro: Visualizing processes in neural networks. IBM J. Res. Dev. 35(1): 244-253 (1991)
- [c9] Gerald Tesauro: Practical Issues in Temporal Difference Learning. NIPS 1991: 259-266
- 1990
- [c8] Gerald Tesauro: Neurogammon: a neural-network backgammon program. IJCNN 1990: 33-39
- [c7] David A. Cohn, Gerald Tesauro: Can Neural Networks Do Better Than the Vapnik-Chervonenkis Bounds? NIPS 1990: 911-917
1980 – 1989
- 1989
- [j6] Gerald Tesauro, Terrence J. Sejnowski: A Parallel Network that Learns to Play Backgammon. Artif. Intell. 39(3): 357-390 (1989)
- [j5] Gerald Tesauro: Neurogammon Wins Computer Olympiad. Neural Comput. 1(3): 321-323 (1989)
- [j4] Gerald Tesauro, Yu He, Subutai Ahmad: Asymptotic Convergence of Backpropagation. Neural Comput. 1(3): 382-391 (1989)
- [c6] Jakub Wejchert, Gerald Tesauro: Neural Network Visualization. NIPS 1989: 465-472
- [c5] Subutai Ahmad, Gerald Tesauro, Yu He: Asymptotic Convergence of Backpropagation: Numerical Experiments. NIPS 1989: 606-613
- 1988
- [j3] Gerald Tesauro, Bob Janssens: Scaling Relationships in Back-propagation Learning. Complex Syst. 2(1) (1988)
- [j2] Subutai Ahmad, Gerald Tesauro: A study of scaling and generalization in neural networks. Neural Networks 1(Supplement-1): 3-6 (1988)
- [c4] Gerald Tesauro: Connectionist Learning of Expert Backgammon Evaluations. ML 1988: 200-206
- [c3] Gerald Tesauro: Connectionist Learning of Expert Preferences by Comparison Training. NIPS 1988: 99-106
- [c2] Subutai Ahmad, Gerald Tesauro: Scaling and Generalization in Neural Networks: A Case Study. NIPS 1988: 160-168
- 1987
- [j1] Gerald Tesauro: Scaling Relationships in Back-Propagation Learning: Dependence on Training Set Size. Complex Syst. 1(2) (1987)
- [c1] Gerald Tesauro, Terrence J. Sejnowski: A 'Neural' Network that Learns to Play Backgammon. NIPS 1987: 794-803
last updated on 2024-11-18 20:46 CET by the dblp team
all metadata released as open data under CC0 1.0 license