Mohtashami et al., 2023 - Google Patents
Landmark attention: Random-access infinite context length for transformers
- Document ID
- 7495208219211273813
- Authors
  - Mohtashami A
  - Jaggi M
- Publication year
  - 2023
- Publication venue
  - arXiv preprint arXiv:2305.16300
Snippet
While Transformers have shown remarkable success in natural language processing, their attention mechanism's large memory requirements have limited their ability to handle longer contexts. Prior approaches, such as recurrent memory or retrieval-based augmentation …
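The title describes the paper's core idea: the context is split into blocks, each represented by a landmark token, and a query first scores the landmarks to select a few relevant blocks before attending to the tokens inside them. The sketch below is a minimal, simplified illustration of that block-selection pattern, not the authors' implementation; the function name `landmark_topk_attention`, the mean-key landmark, the block size, and the top-k value are all assumptions made for the example (the paper trains dedicated landmark tokens rather than using a mean).

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def landmark_topk_attention(q, keys, values, block_size=4, top_k=2):
    """Illustrative sketch of landmark-style attention for one query.

    The key/value sequence is split into blocks of `block_size`; each
    block's landmark key is taken here as the mean of its keys (a
    simplification). The query attends only to tokens in the `top_k`
    blocks whose landmarks score highest, so per-query cost stays near
    O(top_k * block_size) instead of O(sequence length).
    """
    n, d = keys.shape
    n_blocks = n // block_size
    blocks_k = keys[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    blocks_v = values[: n_blocks * block_size].reshape(n_blocks, block_size, d)

    # Score each block by its landmark key (here: the mean key).
    landmarks = blocks_k.mean(axis=1)             # (n_blocks, d)
    block_scores = landmarks @ q / np.sqrt(d)     # (n_blocks,)
    chosen = np.argsort(block_scores)[-top_k:]    # indices of top-k blocks

    # Ordinary softmax attention, restricted to the chosen blocks.
    k_sel = blocks_k[chosen].reshape(-1, d)
    v_sel = blocks_v[chosen].reshape(-1, d)
    w = softmax(k_sel @ q / np.sqrt(d))
    return w @ v_sel

rng = np.random.default_rng(0)
q = rng.standard_normal(8)
keys = rng.standard_normal((16, 8))
values = rng.standard_normal((16, 8))
out = landmark_topk_attention(q, keys, values)
print(out.shape)  # (8,)
```

Because only the selected blocks' keys and values are touched after the landmark scoring step, memory traffic per query grows with `top_k * block_size` rather than with the full context length, which is the property the snippet's memory argument points at.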
Classifications
- G06F17/30619 — Indexing structures (information retrieval of unstructured textual data)
- G06F17/30946 — Indexing structures (database functions independent of the retrieved data type)
- G06F17/30286 — Information retrieval in structured data stores
- G06F17/30861 — Retrieval from the Internet, e.g. browsers
- G06F17/30067 — File systems; file servers
- G06F17/20 — Handling natural language data
- G06F17/10 — Complex mathematical operations
- G06N99/005 — Learning machines, i.e. computers in which a programme is changed according to experience gained by the machine itself during a complete run
- G06N5/022 — Knowledge engineering; knowledge acquisition
- G06F11/00 — Error detection; error correction; monitoring
- G06F12/02 — Addressing or allocation; relocation
- G06F9/06 — Programme control using stored programme, i.e. using internal store of processing equipment to receive and retain programme
Similar Documents
Publication | Title
---|---
Mohtashami et al. | Landmark attention: Random-access infinite context length for transformers
Mohtashami et al. | Random-access infinite context length for transformers
Fournier et al. | A practical survey on faster and lighter transformers
US11947494B2 (en) | Organizing prime data elements using a tree data structure
Jadhav et al. | Extractive summarization with SWAP-NET: Sentences and words from alternating pointer networks
US11030997B2 (en) | Slim embedding layers for recurrent neural language models
EP3238344B1 (en) | Lossless reduction of data by deriving data from prime data elements resident in a content-associative sieve
Jiang et al. | RIN: Reformulation inference network for context-aware query suggestion
Cormode et al. | Set cover algorithms for very large datasets
US8738547B2 (en) | System and methods for finding hidden topics of documents and preference ranking documents
Dorier et al. | Omnisc'IO: A grammar-based approach to spatial and temporal I/O patterns prediction
US11363296B2 (en) | Lossless reduction of data by using a prime data sieve and performing multidimensional search and content-associative retrieval on data that has been losslessly reduced using a prime data sieve
CN106777006B (en) | Parallel hyper-network classification method based on Spark
Huang et al. | Advancing transformer architecture in long-context large language models: A comprehensive survey
EP3311494B1 (en) | Performing multidimensional search, content-associative retrieval, and keyword-based search and retrieval on data that has been losslessly reduced using a prime data sieve
WO2021231255A1 (en) | Exploiting locality of prime data for efficient retrieval of data that has been losslessly reduced using a prime data sieve
CN112579870A (en) | Training method, device and equipment for search matching model, and storage medium
US20220129638A1 (en) | Systems and methods for machine-learned prediction of semantic similarity between documents
EP3387647B1 (en) | Reduction of audio data and data stored on a block processing storage system
US20210209190A1 (en) | Method and apparatus for processing matrix data through relaxed pruning
He et al. | An efficient and robust semantic hashing framework for similar text search
Everaert et al. | GIO: Gradient information optimization for training dataset selection
Khassanov et al. | Enriching rare word representations in neural language models by embedding matrix augmentation
Wang et al. | CTA: Hardware-software co-design for compressed token attention mechanism
Cohen et al. | InDi: Informative and diverse sampling for dense retrieval