Nothing Special   »   [go: up one dir, main page]

Mohtashami et al., 2023 - Google Patents

Landmark attention: Random-access infinite context length for transformers

Mohtashami et al., 2023

View PDF
Document ID
7495208219211273813
Author
Mohtashami A
Jaggi M
Publication year
Publication venue
arXiv preprint arXiv:2305.16300

External Links

Snippet

While Transformers have shown remarkable success in natural language processing, their attention mechanism's large memory requirements have limited their ability to handle longer contexts. Prior approaches, such as recurrent memory or retrieval-based augmentation …
Continue reading at arxiv.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30613Indexing
    • G06F17/30619Indexing indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30946Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861Retrieval from the Internet, e.g. browsers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067File systems; File servers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/02Knowledge representation
    • G06N5/022Knowledge engineering, knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme

Similar Documents

Publication Publication Date Title
Mohtashami et al. Landmark attention: Random-access infinite context length for transformers
Mohtashami et al. Random-access infinite context length for transformers
Fournier et al. A practical survey on faster and lighter transformers
US11947494B2 (en) Organizing prime data elements using a tree data structure
Jadhav et al. Extractive summarization with swap-net: Sentences and words from alternating pointer networks
US11030997B2 (en) Slim embedding layers for recurrent neural language models
EP3238344B1 (en) Lossless reduction of data by deriving data from prime data elements resident in a content-associative sieve
Jiang et al. RIN: Reformulation inference network for context-aware query suggestion
Cormode et al. Set cover algorithms for very large datasets
US8738547B2 (en) System and methods for finding hidden topics of documents and preference ranking documents
Dorier et al. Omnisc'io: a grammar-based approach to spatial and temporal i/o patterns prediction
US11363296B2 (en) Lossless reduction of data by using a prime data sieve and performing multidimensional search and content-associative retrieval on data that has been losslessly reduced using a prime data sieve
CN106777006B (en) Parallel hyper-network classification method based on Spark
Huang et al. Advancing transformer architecture in long-context large language models: A comprehensive survey
EP3311494B1 (en) Performing multidimensional search, content-associative retrieval, and keyword-based search and retrieval on data that has been losslessly reduced using a prime data sieve
WO2021231255A1 (en) Exploiting locality of prime data for efficient retrieval of data that has been losslessly reduced using a prime data sieve
CN112579870A (en) Training method, device and equipment for searching matching model and storage medium
US20220129638A1 (en) Systems and Methods for Machine-Learned Prediction of Semantic Similarity Between Documents
EP3387647B1 (en) Reduction of audio data and data stored on a block processing storage system
US20210209190A1 (en) Method and apparatus for processing matrix data through relaxed pruning
He et al. An efficient and robust semantic hashing framework for similar text search
Everaert et al. Gio: Gradient information optimization for training dataset selection
Khassanov et al. Enriching rare word representations in neural language models by embedding matrix augmentation
Wang et al. Cta: Hardware-software co-design for compressed token attention mechanism
Cohen et al. InDi: Informative and Diverse Sampling for Dense Retrieval