Retrospective on the 2021 MineRL BASALT Competition on Learning from Human Feedback

Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi Kanervisto, Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries, Alexandra Souly, Jun Shern Chan, Daniel del Castillo, Tom Lieberum

Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, PMLR 176:259-272, 2022.

Abstract

We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks. Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft, and allowed participants to use any approach they wanted to build agents that could accomplish the tasks. Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types. The three winning teams implemented significantly different approaches while achieving similar performance. Interestingly, their approaches performed well on different tasks, validating our choice of tasks to include in the competition. While the outcomes validated the design of our competition, we did not get as many participants and submissions as our sister competition, MineRL Diamond. We speculate about the causes of this problem and suggest improvements for future iterations of the competition.

Cite this Paper

BibTeX


@InProceedings{pmlr-v176-shah22a,
  title = 	 {Retrospective on the 2021 MineRL BASALT Competition on Learning from Human Feedback},
  author =       {Shah, Rohin and Wang, Steven H. and Wild, Cody and Milani, Stephanie and Kanervisto, Anssi and Goecks, Vinicius G. and Waytowich, Nicholas and Watkins-Valls, David and Prakash, Bharat and Mills, Edmund and Garg, Divyansh and Fries, Alexander and Souly, Alexandra and Chan, Jun Shern and del Castillo, Daniel and Lieberum, Tom},
  booktitle = 	 {Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track},
  pages = 	 {259--272},
  year = 	 {2022},
  editor = 	 {Kiela, Douwe and Ciccone, Marco and Caputo, Barbara},
  volume = 	 {176},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--14 Dec},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v176/shah22a/shah22a.pdf},
  url = 	 {https://proceedings.mlr.press/v176/shah22a.html},
  abstract = 	 {We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks. Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft, and allowed participants to use any approach they wanted to build agents that could accomplish the tasks. Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types. The three winning teams implemented significantly different approaches while achieving similar performance. Interestingly, their approaches performed well on different tasks, validating our choice of tasks to include in the competition. While the outcomes validated the design of our competition, we did not get as many participants and submissions as our sister competition, MineRL Diamond. We speculate about the causes of this problem and suggest improvements for future iterations of the competition.}
}

Endnote

%0 Conference Paper
%T Retrospective on the 2021 MineRL BASALT Competition on Learning from Human Feedback
%A Rohin Shah
%A Steven H. Wang
%A Cody Wild
%A Stephanie Milani
%A Anssi Kanervisto
%A Vinicius G. Goecks
%A Nicholas Waytowich
%A David Watkins-Valls
%A Bharat Prakash
%A Edmund Mills
%A Divyansh Garg
%A Alexander Fries
%A Alexandra Souly
%A Jun Shern Chan
%A Daniel del Castillo
%A Tom Lieberum
%B Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track
%C Proceedings of Machine Learning Research
%D 2022
%E Douwe Kiela
%E Marco Ciccone
%E Barbara Caputo
%F pmlr-v176-shah22a
%I PMLR
%P 259--272
%U https://proceedings.mlr.press/v176/shah22a.html
%V 176
%X We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks. Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft, and allowed participants to use any approach they wanted to build agents that could accomplish the tasks. Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types. The three winning teams implemented significantly different approaches while achieving similar performance. Interestingly, their approaches performed well on different tasks, validating our choice of tasks to include in the competition. While the outcomes validated the design of our competition, we did not get as many participants and submissions as our sister competition, MineRL Diamond. We speculate about the causes of this problem and suggest improvements for future iterations of the competition.

APA


Shah, R., Wang, S.H., Wild, C., Milani, S., Kanervisto, A., Goecks, V.G., Waytowich, N., Watkins-Valls, D., Prakash, B., Mills, E., Garg, D., Fries, A., Souly, A., Chan, J.S., del Castillo, D. & Lieberum, T. (2022). Retrospective on the 2021 MineRL BASALT Competition on Learning from Human Feedback. Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, in Proceedings of Machine Learning Research 176:259-272. Available from https://proceedings.mlr.press/v176/shah22a.html.
