Computer Science > Computation and Language

arXiv:2407.10671 (cs)

[Submitted on 15 Jul 2024 (v1), last revised 10 Sep 2024 (this version, v4)]

Title: Qwen2 Technical Report

Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning.
The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover, Qwen2 demonstrates robust multilingual capabilities, proficient in approximately 30 languages, spanning English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach.
To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face and ModelScope, and the supplementary materials including example code on GitHub. These platforms also include resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors.

Comments:	26 pages, 1 figure
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2407.10671 [cs.CL]
	(or arXiv:2407.10671v4 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.10671

Submission history

From: Xuancheng Ren
[v1] Mon, 15 Jul 2024 12:35:42 UTC (167 KB)
[v2] Tue, 16 Jul 2024 16:29:30 UTC (167 KB)
[v3] Thu, 18 Jul 2024 02:39:24 UTC (167 KB)
[v4] Tue, 10 Sep 2024 13:25:53 UTC (169 KB)
