Computer Science > Computation and Language

arXiv:2401.04471 (cs)

[Submitted on 9 Jan 2024]

Title:TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models

Authors:Xue Zhang, Xiangyu Shi, Xinyue Lou, Rui Qi, Yufeng Chen, Jinan Xu, Wenjuan Han

Abstract:Large language models (LLMs) and multimodal large language models (MLLMs) have shown excellent general capabilities, even exhibiting adaptability in many professional domains such as law, economics, transportation, and medicine. Currently, many domain-specific benchmarks have been proposed to verify the performance of (M)LLMs in specific fields. Among various domains, transportation plays a crucial role in modern society as it impacts the economy, the environment, and the quality of life for billions of people. However, it is unclear how much traffic knowledge (M)LLMs possess and whether they can reliably perform transportation-related tasks. To address this gap, we propose TransportationGames, a carefully designed and thorough evaluation benchmark for assessing (M)LLMs in the transportation domain. By comprehensively considering the applications in real-world scenarios and referring to the first three levels in Bloom's Taxonomy, we test the performance of various (M)LLMs in memorizing, understanding, and applying transportation knowledge by the selected tasks. The experimental results show that although some models perform well in some tasks, there is still much room for improvement overall. We hope the release of TransportationGames can serve as a foundation for future research, thereby accelerating the implementation and application of (M)LLMs in the transportation domain.

Comments:	Work in Progress
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2401.04471 [cs.CL]
	(or arXiv:2401.04471v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.04471

Submission history

From: Xue Zhang [view email]
[v1] Tue, 9 Jan 2024 10:20:29 UTC (10,743 KB)

Computer Science > Computation and Language

Title:TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computation and Language

Title:TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators