Pierre Ménard
2020 – today
- 2024
- [c26] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard:
  Demonstration-Regularized RL. ICLR 2024
- [i34] Pierre Perrault, Denis Belomestny, Pierre Ménard, Éric Moulines, Alexey Naumov, Daniil Tiapkin, Michal Valko:
  A New Bound on the Cumulant Generating Function of Dirichlet Processes. CoRR abs/2409.18621 (2024)
- 2023
- [c25] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko:
  Adapting to game trees in zero-sum imperfect information games. ICML 2023: 10093-10135
- [c24] Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
  Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175
- [c23] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Ménard:
  Fast Rates for Maximum Entropy Exploration. ICML 2023: 34161-34221
- [c22] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard:
  Model-free Posterior Sampling via Learning Rate Randomization. NeurIPS 2023
- [i33] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Ménard:
  Fast Rates for Maximum Entropy Exploration. CoRR abs/2303.08059 (2023)
- [i32] Mariana Vargas Vieyra, Pierre Ménard:
  Learning Generative Models with Goal-conditioned Reinforcement Learning. CoRR abs/2303.14811 (2023)
- [i31] Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
  Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023)
- [i30] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko:
  Local and adaptive mirror descents in extensive-form games. CoRR abs/2309.00656 (2023)
- [i29] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard:
  Demonstration-Regularized RL. CoRR abs/2310.17303 (2023)
- [i28] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard:
  Model-free Posterior Sampling via Learning Rate Randomization. CoRR abs/2310.18186 (2023)
- 2022
- [j2] Aurélien Garivier, Hédi Hadiji, Pierre Ménard, Gilles Stoltz:
  KL-UCB-Switch: Optimal Regret Bounds for Stochastic Bandits from Both a Distribution-Dependent and a Distribution-Free Viewpoints. J. Mach. Learn. Res. 23: 179:1-179:66 (2022)
- [c21] Jean Tarbouriech, Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Michal Valko, Alessandro Lazaric:
  Adaptive Multi-Goal Exploration. AISTATS 2022: 7349-7383
- [c20] Daniil Tiapkin, Denis Belomestny, Eric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Ménard:
  From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses. ICML 2022: 21380-21431
- [c19] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard:
  Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees. NeurIPS 2022
- [i27] Daniil Tiapkin, Denis Belomestny, Eric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Ménard:
  From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses. CoRR abs/2205.07704 (2022)
- [i26] Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári:
  KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022)
- [i25] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard:
  Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees. CoRR abs/2209.14414 (2022)
- [i24] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko:
  Adapting to game trees in zero-sum imperfect information games. CoRR abs/2212.12567 (2022)
- 2021
- [c18] Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko:
  A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces. AISTATS 2021: 3538-3546
- [c17] Omar Darwiche Domingues, Pierre Ménard, Emilie Kaufmann, Michal Valko:
  Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited. ALT 2021: 578-598
- [c16] Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko:
  Adaptive Reward-Free Exploration. ALT 2021: 865-891
- [c15] James Cheshire, Pierre Ménard, Alexandra Carpentier:
  Problem Dependent View on Structured Thresholding Bandit Problems. ICML 2021: 1846-1854
- [c14] Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko:
  Kernel-Based Reinforcement Learning: A Finite-Time Analysis. ICML 2021: 2783-2792
- [c13] Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Emilie Kaufmann, Edouard Leurent, Michal Valko:
  Fast active learning for pure exploration in reinforcement learning. ICML 2021: 7599-7608
- [c12] Pierre Ménard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko:
  UCB Momentum Q-learning: Correcting the bias without forgetting. ICML 2021: 7609-7618
- [c11] Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard:
  Indexed Minimum Empirical Divergence for Unimodal Bandits. NeurIPS 2021: 7346-7356
- [c10] Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko:
  Learning in two-player zero-sum partially observable Markov games with perfect recall. NeurIPS 2021: 11987-11998
- [c9] Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier:
  Bandits with many optimal arms. NeurIPS 2021: 22457-22469
- [i23] Pierre Ménard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko:
  UCB Momentum Q-learning: Correcting the bias without forgetting. CoRR abs/2103.01312 (2021)
- [i22] Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier:
  Bandits with many optimal arms. CoRR abs/2103.12452 (2021)
- [i21] Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko:
  Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall. CoRR abs/2106.06279 (2021)
- [i20] James Cheshire, Pierre Ménard, Alexandra Carpentier:
  Problem Dependent View on Structured Thresholding Bandit Problems. CoRR abs/2106.10166 (2021)
- [i19] Jean Tarbouriech, Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Michal Valko, Alessandro Lazaric:
  Adaptive Multi-Goal Exploration. CoRR abs/2111.12045 (2021)
- [i18] Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard:
  Indexed Minimum Empirical Divergence for Unimodal Bandits. CoRR abs/2112.01452 (2021)
- 2020
- [c8] Xuedong Shang, Rianne de Heide, Pierre Ménard, Emilie Kaufmann, Michal Valko:
  Fixed-confidence guarantees for Bayesian best-arm identification. AISTATS 2020: 1823-1832
- [c7] Julien Seznec, Pierre Ménard, Alessandro Lazaric, Michal Valko:
  A single algorithm for both restless and rested rotting bandits. AISTATS 2020: 3784-3794
- [c6] James Cheshire, Pierre Ménard, Alexandra Carpentier:
  The Influence of Shape Constraints on the Thresholding Bandit Problem. COLT 2020: 1228-1275
- [c5] Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko:
  Gamification of Pure Exploration for Linear Bandits. ICML 2020: 2432-2442
- [c4] Anders Jonsson, Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko:
  Planning in Markov Decision Processes with Gap-Dependent Sample Complexity. NeurIPS 2020
- [i17] Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko:
  Regret Bounds for Kernel-Based Reinforcement Learning. CoRR abs/2004.05599 (2020)
- [i16] Anders Jonsson, Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko:
  Planning in Markov Decision Processes with Gap-Dependent Sample Complexity. CoRR abs/2006.05879 (2020)
- [i15] Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko:
  Adaptive Reward-Free Exploration. CoRR abs/2006.06294 (2020)
- [i14] James Cheshire, Pierre Ménard, Alexandra Carpentier:
  The Influence of Shape Constraints on the Thresholding Bandit Problem. CoRR abs/2006.10006 (2020)
- [i13] Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard:
  Forced-exploration free Strategies for Unimodal Bandits. CoRR abs/2006.16569 (2020)
- [i12] Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko:
  Gamification of Pure Exploration for Linear Bandits. CoRR abs/2007.00953 (2020)
- [i11] Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard:
  Optimal Strategies for Graph-Structured Bandits. CoRR abs/2007.03224 (2020)
- [i10] Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko:
  A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces. CoRR abs/2007.05078 (2020)
- [i9] Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Emilie Kaufmann, Edouard Leurent, Michal Valko:
  Fast active learning for pure exploration in reinforcement learning. CoRR abs/2007.13442 (2020)
- [i8] Omar Darwiche Domingues, Pierre Ménard, Emilie Kaufmann, Michal Valko:
  Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited. CoRR abs/2010.03531 (2020)
2010 – 2019
- 2019
- [j1] Aurélien Garivier, Pierre Ménard, Gilles Stoltz:
  Explore First, Exploit Next: The True Shape of Regret in Bandit Problems. Math. Oper. Res. 44(2): 377-399 (2019)
- [c3] Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Ménard, Rémi Munos, Michal Valko:
  Planning in entropy-regularized Markov decision processes and games. NeurIPS 2019: 12383-12392
- [c2] Rémy Degenne, Wouter M. Koolen, Pierre Ménard:
  Non-Asymptotic Pure Exploration by Solving Games. NeurIPS 2019: 14465-14474
- [i7] Pierre Ménard:
  Gradient Ascent for Active Exploration in Bandit Problems. CoRR abs/1905.08165 (2019)
- [i6] Rémy Degenne, Wouter M. Koolen, Pierre Ménard:
  Non-Asymptotic Pure Exploration by Solving Games. CoRR abs/1906.10431 (2019)
- [i5] Xuedong Shang, Rianne de Heide, Emilie Kaufmann, Pierre Ménard, Michal Valko:
  Fixed-Confidence Guarantees for Bayesian Best-Arm Identification. CoRR abs/1910.10945 (2019)
- 2018
- [i4] Aurélien Garivier, Hédi Hadiji, Pierre Ménard, Gilles Stoltz:
  KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints. CoRR abs/1805.05071 (2018)
- 2017
- [c1] Pierre Ménard, Aurélien Garivier:
  A minimax and asymptotically optimal algorithm for stochastic bandits. ALT 2017: 223-237
- [i3] Sébastien Gerchinovitz, Pierre Ménard, Gilles Stoltz:
  Fano's inequality for random variables. CoRR abs/1702.05985 (2017)
- [i2] Pierre Ménard, Aurélien Garivier:
  A minimax and asymptotically optimal algorithm for stochastic bandits. CoRR abs/1702.07211 (2017)
- 2016
- [i1] Aurélien Garivier, Pierre Ménard, Gilles Stoltz:
  Explore First, Exploit Next: The True Shape of Regret in Bandit Problems. CoRR abs/1602.07182 (2016)
last updated on 2024-10-22 20:15 CEST by the dblp team
all metadata released as open data under CC0 1.0 license