Masatoshi Uehara
2020 – today
- 2024
- [j4] Nathan Kallus, Xiaojie Mao, Masatoshi Uehara: Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond. J. Mach. Learn. Res. 25: 16:1-16:59 (2024)
- [c28] Jakub Grudzien Kuba, Masatoshi Uehara, Sergey Levine, Pieter Abbeel: Functional Graphical Models: Structure Enables Offline Data-Driven Optimization. AISTATS 2024: 2449-2457
- [c27] Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee: Provable Reward-Agnostic Preference-Based Reinforcement Learning. ICLR 2024
- [c26] Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun: Provable Offline Preference-Based Reinforcement Learning. ICLR 2024
- [c25] Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M. Tseng, Sergey Levine, Tommaso Biancalani: Feedback Efficient Online Fine-Tuning of Diffusion Models. ICML 2024
- [i43] Jakub Grudzien Kuba, Masatoshi Uehara, Pieter Abbeel, Sergey Levine: Functional Graphical Models: Structure Enables Offline Data-Driven Optimization. CoRR abs/2401.05442 (2024)
- [i42] Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M. Tseng, Tommaso Biancalani, Sergey Levine: Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control. CoRR abs/2402.15194 (2024)
- [i41] Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M. Tseng, Sergey Levine, Tommaso Biancalani: Feedback Efficient Online Fine-Tuning of Diffusion Models. CoRR abs/2402.16359 (2024)
- [i40] Zihao Li, Hui Lan, Vasilis Syrgkanis, Mengdi Wang, Masatoshi Uehara: Regularized DeepIV with Model Selection. CoRR abs/2403.04236 (2024)
- [i39] Masatoshi Uehara, Yulai Zhao, Ehsan Hajiramezanali, Gabriele Scalia, Gökcen Eraslan, Avantika Lal, Sergey Levine, Tommaso Biancalani: Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models. CoRR abs/2405.19673 (2024)
- [i38] Yulai Zhao, Masatoshi Uehara, Gabriele Scalia, Tommaso Biancalani, Sergey Levine, Ehsan Hajiramezanali: Adding Conditional Control to Diffusion Models with Reinforcement Learning. CoRR abs/2406.12120 (2024)
- [i37] Masatoshi Uehara, Yulai Zhao, Tommaso Biancalani, Sergey Levine: Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review. CoRR abs/2407.13734 (2024)
- [i36] Xiner Li, Yulai Zhao, Chenyu Wang, Gabriele Scalia, Gökcen Eraslan, Surag Nair, Tommaso Biancalani, Aviv Regev, Sergey Levine, Masatoshi Uehara: Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding. CoRR abs/2408.08252 (2024)
- 2023
- [c24] Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara: Inference on Strongly Identified Functionals of Weakly Identified Functions. COLT 2023: 2265
- [c23] Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara: Minimax Instrumental Variable Regression and L2 Convergence Guarantees without Identification or Closedness. COLT 2023: 2291-2318
- [c22] Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee: PAC Reinforcement Learning for Predictive State Representations. ICLR 2023
- [c21] Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun: Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings. ICML 2023: 34615-34641
- [c20] Runzhe Wu, Masatoshi Uehara, Wen Sun: Distributional Offline Policy Evaluation with Predictive Error Guarantees. ICML 2023: 37685-37712
- [c19] Haruka Kiyohara, Masatoshi Uehara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito: Off-Policy Evaluation of Ranking Policies under Diverse User Behavior. KDD 2023: 1154-1163
- [c18] Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun: Future-Dependent Value-Based Off-Policy Evaluation in POMDPs. NeurIPS 2023
- [c17] Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun: Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage. NeurIPS 2023
- [i35] Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun: Refined Value-Based Offline RL under Realizability and Partial Coverage. CoRR abs/2302.02392 (2023)
- [i34] Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara: Minimax Instrumental Variable Regression and L2 Convergence Guarantees without Identification or Closedness. CoRR abs/2302.05404 (2023)
- [i33] Runzhe Wu, Masatoshi Uehara, Wen Sun: Distributional Offline Policy Evaluation with Predictive Error Guarantees. CoRR abs/2302.09456 (2023)
- [i32] Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun: Provable Offline Reinforcement Learning with Human Feedback. CoRR abs/2305.14816 (2023)
- [i31] Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee: How to Query Human Feedback Efficiently in RL? CoRR abs/2305.18505 (2023)
- [i30] Haruka Kiyohara, Masatoshi Uehara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito: Off-Policy Evaluation of Ranking Policies under Diverse User Behavior. CoRR abs/2306.15098 (2023)
- [i29] Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara: Source Condition Double Robust Inference on Functionals of Inverse Problems. CoRR abs/2307.13793 (2023)
- 2022
- [j3] Nathan Kallus, Masatoshi Uehara: Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning. Oper. Res. 70(6): 3282-3302 (2022)
- [c16] Masatoshi Uehara, Wen Sun: Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage. ICLR 2022
- [c15] Masatoshi Uehara, Xuezhou Zhang, Wen Sun: Representation Learning for Online and Offline RL in Low-rank MDPs. ICLR 2022
- [c14] Chengchun Shi, Masatoshi Uehara, Jiawei Huang, Nan Jiang: A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes. ICML 2022: 20057-20094
- [c13] Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun: Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach. ICML 2022: 26517-26547
- [c12] Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun: Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems. NeurIPS 2022
- [i28] Masahiro Kato, Kaito Ariu, Masaaki Imaizumi, Masatoshi Uehara, Masahiro Nomura, Chao Qin: Optimal Fixed-Budget Best Arm Identification using the Augmented Inverse Probability Weighting Estimator in Two-Armed Gaussian Bandits with Unknown Variances. CoRR abs/2201.04469 (2022)
- [i27] Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun: Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach. CoRR abs/2202.00063 (2022)
- [i26] Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun: Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems. CoRR abs/2206.12020 (2022)
- [i25] Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun: Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings. CoRR abs/2206.12081 (2022)
- [i24] Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee: PAC Reinforcement Learning for Predictive State Representations. CoRR abs/2207.05738 (2022)
- [i23] Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun: Future-Dependent Value-Based Off-Policy Evaluation in POMDPs. CoRR abs/2207.13081 (2022)
- [i22] Masatoshi Uehara, Chengchun Shi, Nathan Kallus: A Review of Off-Policy Evaluation in Reinforcement Learning. CoRR abs/2212.06355 (2022)
- 2021
- [j2] Takeru Matsuda, Masatoshi Uehara, Aapo Hyvärinen: Information criteria for non-normalized models. J. Mach. Learn. Res. 22: 158:1-158:33 (2021)
- [c11] Yichun Hu, Nathan Kallus, Masatoshi Uehara: Fast Rates for the Regret of Offline Reinforcement Learning. COLT 2021: 2462
- [c10] Nathan Kallus, Yuta Saito, Masatoshi Uehara: Optimal Off-Policy Evaluation from Multiple Logging Policies. ICML 2021: 5247-5256
- [c9] Jonathan D. Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun: Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage. NeurIPS 2021: 965-979
- [i21] Yichun Hu, Nathan Kallus, Masatoshi Uehara: Fast Rates for the Regret of Offline Reinforcement Learning. CoRR abs/2102.00479 (2021)
- [i20] Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie: Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency. CoRR abs/2102.02981 (2021)
- [i19] Nathan Kallus, Xiaojie Mao, Masatoshi Uehara: Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach. CoRR abs/2103.14029 (2021)
- [i18] Jonathan D. Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun: Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage. CoRR abs/2106.03207 (2021)
- [i17] Masatoshi Uehara, Wen Sun: Pessimistic Model-based Offline RL: PAC Bounds and Posterior Sampling under Partial Coverage. CoRR abs/2107.06226 (2021)
- [i16] Masatoshi Uehara, Xuezhou Zhang, Wen Sun: Representation Learning for Online and Offline RL in Low-rank MDPs. CoRR abs/2110.04652 (2021)
- [i15] Chengchun Shi, Masatoshi Uehara, Nan Jiang: A Minimax Learning Approach to Off-Policy Evaluation in Partially Observable Markov Decision Processes. CoRR abs/2111.06784 (2021)
- 2020
- [j1] Nathan Kallus, Masatoshi Uehara: Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes. J. Mach. Learn. Res. 21: 167:1-167:63 (2020)
- [c8] Masatoshi Uehara, Takafumi Kanamori, Takashi Takenouchi, Takeru Matsuda: A Unified Statistically Efficient Estimation Framework for Unnormalized Models. AISTATS 2020: 809-819
- [c7] Masatoshi Uehara, Takeru Matsuda, Jae Kwang Kim: Imputation estimators for unnormalized models with missing data. AISTATS 2020: 831-841
- [c6] Nathan Kallus, Masatoshi Uehara: Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation. ICML 2020: 5078-5088
- [c5] Nathan Kallus, Masatoshi Uehara: Statistically Efficient Off-Policy Policy Gradients. ICML 2020: 5089-5100
- [c4] Masatoshi Uehara, Jiawei Huang, Nan Jiang: Minimax Weight and Q-Function Learning for Off-Policy Evaluation. ICML 2020: 9659-9668
- [c3] Nathan Kallus, Masatoshi Uehara: Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies. NeurIPS 2020
- [c2] Masatoshi Uehara, Masahiro Kato, Shota Yasui: Off-Policy Evaluation and Learning for External Validity under a Covariate Shift. NeurIPS 2020
- [i14] Nathan Kallus, Masatoshi Uehara: Statistically Efficient Off-Policy Policy Gradients. CoRR abs/2002.04014 (2020)
- [i13] Masahiro Kato, Masatoshi Uehara, Shota Yasui: Off-Policy Evaluation and Learning for External Validity under a Covariate Shift. CoRR abs/2002.11642 (2020)
- [i12] Nathan Kallus, Masatoshi Uehara: Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning. CoRR abs/2006.03886 (2020)
- [i11] Nathan Kallus, Masatoshi Uehara: Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies. CoRR abs/2006.03900 (2020)
- [i10] Nathan Kallus, Yuta Saito, Masatoshi Uehara: Optimal Off-Policy Evaluation from Multiple Logging Policies. CoRR abs/2010.11002 (2020)
2010 – 2019
- 2019
- [c1] Nathan Kallus, Masatoshi Uehara: Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning. NeurIPS 2019: 3320-3329
- [i9] Masatoshi Uehara, Takafumi Kanamori, Takashi Takenouchi, Takeru Matsuda: Unified estimation framework for unnormalized models with statistical efficiency. CoRR abs/1901.07710 (2019)
- [i8] Masatoshi Uehara, Takeru Matsuda, Jae Kwang Kim: Imputation estimators for unnormalized models with missing data. CoRR abs/1903.03630 (2019)
- [i7] Takeru Matsuda, Masatoshi Uehara, Aapo Hyvärinen: Information criteria for non-normalized models. CoRR abs/1905.05976 (2019)
- [i6] Nathan Kallus, Masatoshi Uehara: Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning. CoRR abs/1906.03735 (2019)
- [i5] Nathan Kallus, Masatoshi Uehara: Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes. CoRR abs/1908.08526 (2019)
- [i4] Nathan Kallus, Masatoshi Uehara: Efficiently Breaking the Curse of Horizon: Double Reinforcement Learning in Infinite-Horizon Processes. CoRR abs/1909.05850 (2019)
- [i3] Masatoshi Uehara, Nan Jiang: Minimax Weight and Q-Function Learning for Off-Policy Evaluation. CoRR abs/1910.12809 (2019)
- [i2] Nathan Kallus, Xiaojie Mao, Masatoshi Uehara: Localized Debiased Machine Learning: Efficient Estimation of Quantile Treatment Effects, Conditional Value at Risk, and Beyond. CoRR abs/1912.12945 (2019)
- 2018
- [i1] Masatoshi Uehara, Takeru Matsuda, Fumiyasu Komaki: Analysis of Noise Contrastive Estimation from the Perspective of Asymptotic Variance. CoRR abs/1808.07983 (2018)