
Rohin Shah

2020 – today

2024
[i25]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-00745
János Kramár, Tom Lieberum, Rohin Shah, Neel Nanda:
AtP*: An efficient and scalable method for localizing LLM behaviour to components. CoRR abs/2403.00745 (2024)
[i24]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-13793
Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, Heidi Howard, Tom Lieberum, Ramana Kumar, Maria Abi Raad, Albert Webson, Lewis Ho, Sharon Lin, Sebastian Farquhar, Marcus Hutter, Grégoire Delétang, Anian Ruoss, Seliem El-Sayed, Sasha Brown, Anca D. Dragan, Rohin Shah, Allan Dafoe, Toby Shevlane:
Evaluating Frontier Models for Dangerous Capabilities. CoRR abs/2403.13793 (2024)
[i23]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-16014
Senthooran Rajamanoharan, Arthur Conmy, Lewis Smith, Tom Lieberum, Vikrant Varma, János Kramár, Rohin Shah, Neel Nanda:
Improving Dictionary Learning with Gated Sparse Autoencoders. CoRR abs/2404.16014 (2024)
[i22]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2407-04622
Zachary Kenton, Noah Y. Siegel, János Kramár, Jonah Brown-Cohen, Samuel Albanie, Jannis Bulian, Rishabh Agarwal, David Lindner, Yunhao Tang, Noah D. Goodman, Rohin Shah:
On scalable oversight with weak LLMs judging strong LLMs. CoRR abs/2407.04622 (2024)
[i21]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2408-05147
Tom Lieberum, Senthooran Rajamanoharan, Arthur Conmy, Lewis Smith, Nicolas Sonnerat, Vikrant Varma, János Kramár, Anca D. Dragan, Rohin Shah, Neel Nanda:
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2. CoRR abs/2408.05147 (2024)
2023
[c14]
  persistent URL:
  - https://dblp.org/rec/conf/hri/BobuLSBD23
Andreea Bobu, Yi Liu, Rohin Shah, Daniel S. Brown, Anca D. Dragan:
SIRL: Similarity-based Implicit Representation Learning. HRI 2023: 565-574
[c13]
  persistent URL:
  - https://dblp.org/rec/conf/nips/MilaniKRSHS23
Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Rohin Shah:
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks. NeurIPS 2023
[i20]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2301-00810
Andreea Bobu, Yi Liu, Rohin Shah, Daniel S. Brown, Anca D. Dragan:
SIRL: Similarity-based Implicit Representation Learning. CoRR abs/2301.00810 (2023)
[i19]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-13512
Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada P. Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Ellen R. Novoseller, Vinicius G. Goecks, Nicholas R. Waytowich, David Watkins, Josh Miller, Rohin Shah:
Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition. CoRR abs/2303.13512 (2023)
[i18]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-09458
Tom Lieberum, Matthew Rahtz, János Kramár, Neel Nanda, Geoffrey Irving, Rohin Shah, Vladimir Mikulik:
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla. CoRR abs/2307.09458 (2023)
[i17]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-02390
Vikrant Varma, Rohin Shah, Zachary Kenton, János Kramár, Ramana Kumar:
Explaining grokking through circuit efficiency. CoRR abs/2309.02390 (2023)
[i16]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-02405
Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Rohin Shah:
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks. CoRR abs/2312.02405 (2023)
[i15]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-10029
Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah:
Challenges with unsupervised LLM knowledge discovery. CoRR abs/2312.10029 (2023)
2022
[i14]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2204-07123
Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi Kanervisto, Vinicius G. Goecks, Nicholas R. Waytowich, David Watkins-Valls, Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries, Alexandra Souly, Jun Shern Chan, Daniel del Castillo, Tom Lieberum:
Retrospective on the 2021 BASALT Competition on Learning from Human Feedback. CoRR abs/2204.07123 (2022)
[i13]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2205-07886
Xin Chen, Sam Toyer, Cody Wild, Scott Emmons, Ian Fischer, Kuang-Huei Lee, Neel Alex, Steven H. Wang, Ping Luo, Stuart Russell, Pieter Abbeel, Rohin Shah:
An Empirical Investigation of Representation Learning for Imitation. CoRR abs/2205.07886 (2022)
[i12]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2210-01790
Rohin Shah, Vikrant Varma, Ramana Kumar, Mary Phuong, Victoria Krakovna, Jonathan Uesato, Zac Kenton:
Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals. CoRR abs/2210.01790 (2022)
2021
[c12]
  persistent URL:
  - https://dblp.org/rec/conf/atal/KnottCDCHDS21
Paul Knott, Micah Carroll, Sam Devlin, Kamil Ciosek, Katja Hofmann, Anca D. Dragan, Rohin Shah:
Evaluating the Robustness of Collaborative Agents. AAMAS 2021: 1560-1562
[c11]
  persistent URL:
  - https://dblp.org/rec/conf/iclr/LindnerSAD21
David Lindner, Rohin Shah, Pieter Abbeel, Anca D. Dragan:
Learning What To Do by Simulating the Past. ICLR 2021
[c10]
  persistent URL:
  - https://dblp.org/rec/conf/nips/ChenCTWEFLAWL0A21
Cynthia Chen, Xin Chen, Sam Toyer, Cody Wild, Scott Emmons, Ian Fischer, Kuang-Huei Lee, Neel Alex, Steven H. Wang, Ping Luo, Stuart Russell, Pieter Abbeel, Rohin Shah:
An Empirical Investigation of Representation Learning for Imitation. NeurIPS Datasets and Benchmarks 2021
[c9]
  persistent URL:
  - https://dblp.org/rec/conf/nips/MilaniKRSHMGCSZYLGHLMLRHMIHKLCKPSNGWWM21
Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada P. Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Lucas Cazzonelli, Cedric Kulbach, Nicholas Popovic, Marvin Schweizer, Ellen R. Novoseller, Vinicius G. Goecks, Nicholas R. Waytowich, David Watkins, Josh Miller, Rohin Shah:
Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition. NeurIPS (Competition and Demos) 2021: 171-188
[c8]
  persistent URL:
  - https://dblp.org/rec/conf/nips/ShahWWMKGWWPMGF21
Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi Kanervisto, Vinicius G. Goecks, Nicholas R. Waytowich, David Watkins-Valls, Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries, Alexandra Souly, Jun Shern Chan, Daniel del Castillo, Tom Lieberum:
Retrospective on the 2021 MineRL BASALT Competition on Learning from Human Feedback. NeurIPS (Competition and Demos) 2021: 259-272
[c7]
  persistent URL:
  - https://dblp.org/rec/conf/nips/TurnerSSCT21
Alexander Matt Turner, Logan Smith, Rohin Shah, Andrew Critch, Prasad Tadepalli:
Optimal Policies Tend To Seek Power. NeurIPS 2021: 23063-23074
[i11]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2101-05507
Paul Knott, Micah Carroll, Sam Devlin, Kamil Ciosek, Katja Hofmann, Anca D. Dragan, Rohin Shah:
Evaluating the Robustness of Collaborative Agents. CoRR abs/2101.05507 (2021)
[i10]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2101-07691
Rachel Freedman, Rohin Shah, Anca D. Dragan:
Choice Set Misspecification in Reward Inference. CoRR abs/2101.07691 (2021)
[i9]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2103-12142
Dmitrii Krasheninnikov, Rohin Shah, Herke van Hoof:
Combining Reward Information from Multiple Sources. CoRR abs/2103.12142 (2021)
[i8]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2104-03946
David Lindner, Rohin Shah, Pieter Abbeel, Anca D. Dragan:
Learning What To Do by Simulating the Past. CoRR abs/2104.03946 (2021)
[i7]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2107-01969
Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William H. Guss, Sharada P. Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca D. Dragan:
The MineRL BASALT Competition on Learning from Human Feedback. CoRR abs/2107.01969 (2021)
2020
[b1]
  persistent URL:
  - https://dblp.org/rec/phd/us/Shah20
Rohin Shah:
Extracting and Using Preference Information from the State of the World. University of California, Berkeley, USA, 2020
[c6]
  persistent URL:
  - https://dblp.org/rec/conf/ijcai/FreedmanSD20
Rachel Freedman, Rohin Shah, Anca D. Dragan:
Choice Set Misspecification in Reward Inference. AISafety@IJCAI 2020
[c5]
  persistent URL:
  - https://dblp.org/rec/conf/nips/ToyerSCR20
Sam Toyer, Rohin Shah, Andrew Critch, Stuart Russell:
The MAGICAL Benchmark for Robust Imitation. NeurIPS 2020
[i6]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2011-00401
Sam Toyer, Rohin Shah, Andrew Critch, Stuart Russell:
The MAGICAL Benchmark for Robust Imitation. CoRR abs/2011.00401 (2020)

2010 – 2019

2019
[c4]
  persistent URL:
  - https://dblp.org/rec/conf/iclr/ShahKAAD19
Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca D. Dragan:
Preferences Implicit in the State of the World. ICLR (Poster) 2019
[c3]
  persistent URL:
  - https://dblp.org/rec/conf/icml/ShahGAD19
Rohin Shah, Noah Gundotra, Pieter Abbeel, Anca D. Dragan:
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference. ICML 2019: 5670-5679
[c2]
  persistent URL:
  - https://dblp.org/rec/conf/nips/CarrollSHGSAD19
Micah Carroll, Rohin Shah, Mark K. Ho, Tom Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca D. Dragan:
On the Utility of Learning about Humans for Human-AI Coordination. NeurIPS 2019: 5175-5186
[i5]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1902-04198
Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca D. Dragan:
Preferences Implicit in the State of the World. CoRR abs/1902.04198 (2019)
[i4]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1906-09624
Rohin Shah, Noah Gundotra, Pieter Abbeel, Anca D. Dragan:
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference. CoRR abs/1906.09624 (2019)
[i3]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1910-05789
Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca D. Dragan:
On the Utility of Learning about Humans for Human-AI Coordination. CoRR abs/1910.05789 (2019)
2018
[i2]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1809-03060
Sören Mindermann, Rohin Shah, Adam Gleave, Dylan Hadfield-Menell:
Active Inverse Reward Design. CoRR abs/1809.03060 (2018)
2016
[i1]
  persistent URL:
  - https://dblp.org/rec/journals/corr/ShahTB16
Rohin Shah, Emina Torlak, Rastislav Bodík:
SIMPL: A DSL for Automatic Specialization of Inference Algorithms. CoRR abs/1604.04729 (2016)
2014
[c1]
  persistent URL:
  - https://dblp.org/rec/conf/pldi/PhothilimthanaJSTCB14
Phitchaya Mangpo Phothilimthana, Tikhon Jelvis, Rohin Shah, Nishant Totla, Sarah E. Chasins, Rastislav Bodík:
Chlorophyll: synthesis-aided compiler for low-power spatial architectures. PLDI 2014: 396-407
