Mohtashami et al., 2023 - Google Patents
Landmark attention: Random-access infinite context length for transformers
- Document ID
- 7495208219211273813
- Authors
  - Mohtashami A
  - Jaggi M
- Publication year
  - 2023
- Publication venue
  - arXiv preprint arXiv:2305.16300
Snippet
While Transformers have shown remarkable success in natural language processing, their attention mechanism's large memory requirements have limited their ability to handle longer contexts. Prior approaches, such as recurrent memory or retrieval-based augmentation …
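The title describes the paper's core idea: the context is split into blocks, each represented by a landmark token, and a query first scores the landmarks to select a few relevant blocks before attending to the tokens inside them. The sketch below is a minimal, simplified illustration of that block-selection pattern, not the authors' implementation; the function name `landmark_topk_attention`, the mean-key landmark, the block size, and the top-k value are all assumptions made for the example (the paper trains dedicated landmark tokens rather than using a mean).

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def landmark_topk_attention(q, keys, values, block_size=4, top_k=2):
    """Illustrative sketch of landmark-style attention for one query.

    The key/value sequence is split into blocks of `block_size`; each
    block's landmark key is taken here as the mean of its keys (a
    simplification). The query attends only to tokens in the `top_k`
    blocks whose landmarks score highest, so per-query cost stays near
    O(top_k * block_size) instead of O(sequence length).
    """
    n, d = keys.shape
    n_blocks = n // block_size
    blocks_k = keys[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    blocks_v = values[: n_blocks * block_size].reshape(n_blocks, block_size, d)

    # Score each block by its landmark key (here: the mean key).
    landmarks = blocks_k.mean(axis=1)             # (n_blocks, d)
    block_scores = landmarks @ q / np.sqrt(d)     # (n_blocks,)
    chosen = np.argsort(block_scores)[-top_k:]    # indices of top-k blocks

    # Ordinary softmax attention, restricted to the chosen blocks.
    k_sel = blocks_k[chosen].reshape(-1, d)
    v_sel = blocks_v[chosen].reshape(-1, d)
    w = softmax(k_sel @ q / np.sqrt(d))
    return w @ v_sel

rng = np.random.default_rng(0)
q = rng.standard_normal(8)
keys = rng.standard_normal((16, 8))
values = rng.standard_normal((16, 8))
out = landmark_topk_attention(q, keys, values)
print(out.shape)  # (8,)
```

Because only the selected blocks' keys and values are touched after the landmark scoring step, memory traffic per query grows with `top_k * block_size` rather than with the full context length, which is the property the snippet's memory argument points at.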
Classifications
- G06F17/30619 — Indexing structures (information retrieval of unstructured textual data)
- G06F17/30946 — Indexing structures (database functions independent of the retrieved data type)
- G06F17/30286 — Information retrieval in structured data stores
- G06F17/30861 — Retrieval from the Internet, e.g. browsers
- G06F17/30067 — File systems; file servers
- G06F17/20 — Handling natural language data
- G06F17/10 — Complex mathematical operations
- G06N99/005 — Learning machines, i.e. computers in which a programme is changed according to experience gained by the machine itself during a complete run
- G06N5/022 — Knowledge engineering; knowledge acquisition
- G06F11/00 — Error detection; error correction; monitoring
- G06F12/02 — Addressing or allocation; relocation
- G06F9/06 — Programme control using stored programme, i.e. using internal store of processing equipment to receive and retain programme
Similar Documents
Publication | Title
---|---
Mohtashami et al. | Landmark attention: Random-access infinite context length for transformers
Mohtashami et al. | Random-access infinite context length for transformers
Fournier et al. | A practical survey on faster and lighter transformers
US11947494B2 (en) | Organizing prime data elements using a tree data structure
Jadhav et al. | Extractive summarization with SWAP-NET: Sentences and words from alternating pointer networks
US11030997B2 (en) | Slim embedding layers for recurrent neural language models
EP3238344B1 (en) | Lossless reduction of data by deriving data from prime data elements resident in a content-associative sieve
Jiang et al. | RIN: Reformulation inference network for context-aware query suggestion
Cormode et al. | Set cover algorithms for very large datasets
US8738547B2 (en) | System and methods for finding hidden topics of documents and preference ranking documents
Dorier et al. | Omnisc'IO: A grammar-based approach to spatial and temporal I/O patterns prediction
US11363296B2 (en) | Lossless reduction of data by using a prime data sieve and performing multidimensional search and content-associative retrieval on data that has been losslessly reduced using a prime data sieve
CN106777006B (en) | Parallel hyper-network classification method based on Spark
Huang et al. | Advancing transformer architecture in long-context large language models: A comprehensive survey
EP3311494B1 (en) | Performing multidimensional search, content-associative retrieval, and keyword-based search and retrieval on data that has been losslessly reduced using a prime data sieve
WO2021231255A1 (en) | Exploiting locality of prime data for efficient retrieval of data that has been losslessly reduced using a prime data sieve
CN112579870A (en) | Training method, device and equipment for search matching model, and storage medium
US20220129638A1 (en) | Systems and methods for machine-learned prediction of semantic similarity between documents
EP3387647B1 (en) | Reduction of audio data and data stored on a block processing storage system
US20210209190A1 (en) | Method and apparatus for processing matrix data through relaxed pruning
He et al. | An efficient and robust semantic hashing framework for similar text search
Everaert et al. | GIO: Gradient information optimization for training dataset selection
Khassanov et al. | Enriching rare word representations in neural language models by embedding matrix augmentation
Wang et al. | CTA: Hardware-software co-design for compressed token attention mechanism
Cohen et al. | InDi: Informative and diverse sampling for dense retrieval