Zhaohan Guo

Zhaohan Daniel Guo

2020 – today

2024
[c16]
Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Rémi Munos, Mark Rowland, Michal Valko, Daniele Calandriello:
A General Theoretical Paradigm to Understand Learning from Human Preferences. AISTATS 2024: 4447-4455
[c15]
Daniele Calandriello, Zhaohan Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot:
Human Alignment of Large Language Models through Online Preference Optimisation. ICML 2024
[c14]
Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot:
Nash Learning from Human Feedback. ICML 2024
[c13]
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot:
Generalized Preference Optimization: A Unified Approach to Offline Alignment. ICML 2024
[i17]
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot:
Generalized Preference Optimization: A Unified Approach to Offline Alignment. CoRR abs/2402.05749 (2024)
[i16]
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Yuan Cao, Eugene Tarassov, Rémi Munos, Bernardo Ávila Pires, Michal Valko, Yong Cheng, Will Dabney:
Understanding the performance gap between online and offline alignment algorithms. CoRR abs/2405.08448 (2024)
[i15]
Khimya Khetarpal, Zhaohan Daniel Guo, Bernardo Ávila Pires, Yunhao Tang, Clare Lyle, Mark Rowland, Nicolas Heess, Diana Borsa, Arthur Guez, Will Dabney:
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning. CoRR abs/2406.02035 (2024)
2023
[c12]
Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Rémi Munos, Will Dabney, Diana L. Borsa:
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition. ICML 2023: 4009-4034
[c11]
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. ICML 2023: 33632-33656
[i14]
Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Rémi Munos, Will Dabney, Diana L. Borsa:
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition. CoRR abs/2305.00654 (2023)
[i13]
Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot:
Nash Learning from Human Feedback. CoRR abs/2312.00886 (2023)
2022
[c10]
Zhaohan Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. NeurIPS 2022
[i12]
Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. CoRR abs/2206.08332 (2022)
[i11]
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. CoRR abs/2212.03319 (2022)
2021
[i10]
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Ávila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos:
Geometric Entropic Exploration. CoRR abs/2101.02055 (2021)
2020
[c9]
Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, Charles Blundell:
Never Give Up: Learning Directed Exploration Strategies. ICLR 2020
[c8]
Adrià Puigdomènech Badia, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Charles Blundell:
Agent57: Outperforming the Atari Human Benchmark. ICML 2020: 507-517
[c7]
Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. ICML 2020: 3875-3886
[c6]
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. NeurIPS 2020
[i9]
Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, Charles Blundell:
Never Give Up: Learning Directed Exploration Strategies. CoRR abs/2002.06038 (2020)
[i8]
Adrià Puigdomènech Badia, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Daniel Guo, Charles Blundell:
Agent57: Outperforming the Atari Human Benchmark. CoRR abs/2003.13350 (2020)
[i7]
Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. CoRR abs/2004.14646 (2020)
[i6]
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. CoRR abs/2006.07733 (2020)

2010 – 2019

2019
[i5]
Zhaohan Daniel Guo, Emma Brunskill:
Directed Exploration for Reinforcement Learning. CoRR abs/1906.07805 (2019)
2018
[i4]
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Toby Pohlen, Rémi Munos:
Neural Predictive Belief Representations. CoRR abs/1811.06407 (2018)
2017
[c5]
Zhaohan Guo, Philip S. Thomas, Emma Brunskill:
Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation. NIPS 2017: 2492-2501
[i3]
Zhaohan Daniel Guo, Philip S. Thomas, Emma Brunskill:
Using Options for Long-Horizon Off-Policy Evaluation. CoRR abs/1703.03453 (2017)
[i2]
Zhaohan Daniel Guo, Emma Brunskill:
Sample Efficient Feature Selection for Factored MDPs. CoRR abs/1703.03454 (2017)
2016
[c4]
Zhaohan Daniel Guo, Shayan Doroudi, Emma Brunskill:
A PAC RL Algorithm for Episodic POMDPs. AISTATS 2016: 510-518
[c3]
Yao Liu, Zhaohan Guo, Emma Brunskill:
PAC Continuous State Online Multitask Reinforcement Learning with Identification. AAMAS 2016: 438-446
[i1]
Zhaohan Daniel Guo, Shayan Doroudi, Emma Brunskill:
A PAC RL Algorithm for Episodic POMDPs. CoRR abs/1605.08062 (2016)
2015
[c2]
Zhaohan Guo, Emma Brunskill:
Concurrent PAC RL. AAAI 2015: 2624-2630
2014
[c1]
Zhaohan Daniel Guo, Gökhan Tür, Wen-tau Yih, Geoffrey Zweig:
Joint semantic utterance classification and slot filling with recursive neural networks. SLT 2014: 554-559
