Yannis Flet-Berliac
2020 – today
- 2024
- [c10] Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Bill Wu, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. EMNLP 2024: 21353-21370
- [c9] Raphaël Boige, Yannis Flet-Berliac, Lars C. P. M. Quaedvlieg, Arthur Flajolet, Guillaume Richard, Thomas Pierrot:
PASTA: Pretrained Action-State Transformer Agents. RLC 2024: 1511-1532
- [i13] Allen Nie, Yash Chandak, Christina J. Yuan, Anirudhan Badrinath, Yannis Flet-Berliac, Emma Brunskill:
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators. CoRR abs/2405.17708 (2024)
- [i12] Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. CoRR abs/2406.19185 (2024)
- [i11] Nathan Grinsztajn, Yannis Flet-Berliac, Mohammad Gheshlaghi Azar, Florian Strub, Bill Wu, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist:
Averaging log-likelihoods in direct alignment. CoRR abs/2406.19188 (2024)
- 2023
- [c8] Kefan Dong, Yannis Flet-Berliac, Allen Nie, Emma Brunskill:
Model-Based Offline Reinforcement Learning with Local Misspecification. AAAI 2023: 7423-7431
- [c7] Anirudhan Badrinath, Yannis Flet-Berliac, Allen Nie, Emma Brunskill:
Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets. NeurIPS 2023
- [i10] Kefan Dong, Yannis Flet-Berliac, Allen Nie, Emma Brunskill:
Model-based Offline Reinforcement Learning with Local Misspecification. CoRR abs/2301.11426 (2023)
- [i9] Anirudhan Badrinath, Yannis Flet-Berliac, Allen Nie, Emma Brunskill:
Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets. CoRR abs/2306.14069 (2023)
- [i8] Raphaël Boige, Yannis Flet-Berliac, Arthur Flajolet, Guillaume Richard, Thomas Pierrot:
PASTA: Pretrained Action-State Transformer Agents. CoRR abs/2307.10936 (2023)
- 2022
- [c6] Allen Nie, Yannis Flet-Berliac, Deon R. Jordan, William Steenbergen, Emma Brunskill:
Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data. NeurIPS 2022
- [c5] Yao Liu, Yannis Flet-Berliac, Emma Brunskill:
Offline policy optimization with eligible actions. UAI 2022: 1253-1263
- [i7] Yannis Flet-Berliac, Debabrota Basu:
SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics. CoRR abs/2204.09424 (2022)
- [i6] Yao Liu, Yannis Flet-Berliac, Emma Brunskill:
Offline Policy Optimization with Eligible Actions. CoRR abs/2207.00632 (2022)
- [i5] Allen Nie, Yannis Flet-Berliac, Deon R. Jordan, William Steenbergen, Emma Brunskill:
Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data. CoRR abs/2210.08642 (2022)
- 2021
- [b1] Yannis Flet-Berliac:
Sample-Efficient Deep Reinforcement Learning for Control, Exploration and Safety. (Apprentissage par renforcement profond éfficace pour le contrôle, l'exploration et la sûreté). University of Lille, France, 2021
- [c4] Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
Adversarially Guided Actor-Critic. ICLR 2021
- [c3] Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux:
Learning Value Functions in Deep Policy Gradients using Residual Variance. ICLR 2021
- [i4] Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
Adversarially Guided Actor-Critic. CoRR abs/2102.04376 (2021)
- 2020
- [c2] Yannis Flet-Berliac, Philippe Preux:
Only Relevant Information Matters: Filtering Out Noisy Samples To Boost RL. IJCAI 2020: 2711-2717
- [i3] Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux:
Is Standard Deviation the New Standard? Revisiting the Critic in Deep Policy Gradients. CoRR abs/2010.04440 (2020)
2010 – 2019
- 2019
- [i2] Yannis Flet-Berliac, Philippe Preux:
Samples are not all useful: Denoising policy gradient updates using variance. CoRR abs/1904.04025 (2019)
- [i1] Yannis Flet-Berliac, Philippe Preux:
High-Dimensional Control Using Generalized Auxiliary Tasks. CoRR abs/1909.11939 (2019)
- 2017
- [c1] Benjamin Johansen, Yannis Paul Raymond Flet-Berliac, Maciej Jan Korzepa, Per Sandholm, Niels Henrik Pontoppidan, Michael Kai Petersen, Jakob Eg Larsen:
Hearables in Hearing Care: Discovering Usage Patterns Through IoT Devices. HCI (9) 2017: 39-49
last updated on 2024-11-15 19:35 CET by the dblp team
all metadata released as open data under CC0 1.0 license