Computer Science > Robotics

arXiv:2310.11604 (cs)

[Submitted on 17 Oct 2023 (v1), last revised 17 Jun 2024 (this version, v2)]

Title: Language Models as Zero-Shot Trajectory Generators

Authors: Teyun Kwon, Norman Di Palo, Edward Johns

Abstract: Large Language Models (LLMs) have recently shown promise as high-level planners for robots when given access to a selection of low-level skills. However, it is often assumed that LLMs do not possess sufficient knowledge to be used for the low-level trajectories themselves. In this work, we address this assumption thoroughly, and investigate if an LLM (GPT-4) can directly predict a dense sequence of end-effector poses for manipulation tasks, when given access to only object detection and segmentation vision models. We designed a single, task-agnostic prompt, without any in-context examples, motion primitives, or external trajectory optimisers. Then we studied how well it can perform across 30 real-world language-based tasks, such as "open the bottle cap" and "wipe the plate with the sponge", and we investigated which design choices in this prompt are the most important. Our conclusions raise the assumed limit of LLMs for robotics, and we reveal for the first time that LLMs do indeed possess an understanding of low-level robot control sufficient for a range of common tasks, and that they can additionally detect failures and then re-plan trajectories accordingly. Videos, prompts, and code are available at: this https URL.

Comments:	Published in IEEE Robotics and Automation Letters (Volume: 9, Issue: 7, July 2024, Pages: 6728-6735); 10 pages, 12 figures
Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
Cite as:	arXiv:2310.11604 [cs.RO]
	(or arXiv:2310.11604v2 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2310.11604
Journal reference:	IEEE Robotics and Automation Letters (Volume: 9, Issue: 7, July 2024, Pages: 6728-6735)
Related DOI:	https://doi.org/10.1109/LRA.2024.3410155

Submission history

From: Teyun Kwon
[v1] Tue, 17 Oct 2023 21:57:36 UTC (35,026 KB)
[v2] Mon, 17 Jun 2024 23:57:03 UTC (1,842 KB)
