Computer Science > Computation and Language

arXiv:2307.13365 (cs)

[Submitted on 25 Jul 2023 (v1), last revised 27 Jul 2023 (this version, v2)]

Title: Empower Your Model with Longer and Better Context Comprehension

Authors: Yifei Gao, Lei Wang, Jun Fang, Longhua Hu, Jun Cheng

Abstract: With the recent emergence of numerous Large Language Models (LLMs), the application of AI has entered a new era. Irrespective of these models' capacity and structure, there is a growing demand for relatively small LLMs that can comprehend longer and more complex contexts. Models often hit an upper limit when processing sequences of sentences that extend beyond their comprehension capacity, which results in off-topic or even chaotic responses. While several recent works attempt to address this issue in various ways, they rarely focus on "why models are unable to compensate or strengthen their capabilities on their own". In this paper, we thoroughly investigate the nature of information transfer within LLMs and propose a novel technique called Attention Transition. This technique enables models to achieve longer and better context comprehension with minimal additional training and minimal impact on generation fluency. Our experiments are conducted on the challenging XSum dataset using the LLaMa-7b model, with context lengths ranging from 800 to 1900 tokens. Results, as evaluated by GPT-4, demonstrate substantial improvements over the original generation results.

Comments:	LLM for long context comprehension
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
MSC classes:	68T07, 68T50
Cite as:	arXiv:2307.13365 [cs.CL]
	(or arXiv:2307.13365v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2307.13365

Submission history

From: Yifei Gao
[v1] Tue, 25 Jul 2023 09:34:42 UTC (1,602 KB)
[v2] Thu, 27 Jul 2023 10:17:18 UTC (1,602 KB)
