Starred repositories
Transformer based on a variant of attention with linear complexity with respect to sequence length
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
AddressSanitizer, ThreadSanitizer, MemorySanitizer
PyTorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5)
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
Unified KV Cache Compression Methods for Auto-Regressive Models
A beautiful, simple, clean, and responsive Jekyll theme for academics
An unofficial PyTorch implementation of "Efficient Infinite Context Transformers with Infini-attention"
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
This repository contains the code for the paper "TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax"
Official repository of InLine attention (NeurIPS 2024)
Set of tools to assess and improve LLM security.
Official repository of FLatten Transformer (ICCV2023)
PyTorch extensions for high performance and large scale training.
PyTorch implementation of the Llama 3.2 1B architecture, barebones + nuggets of wisdom
Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer
A toy compiler for NumPy array expressions that uses e-graphs and MLIR
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
Toying with displaying dispatch dependency in IREE
siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems
Tile primitives for speedy kernels
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Generate SQL from TableGen code — part of the tutorial "How to Write a TableGen Backend" from the 2021 LLVM Developers' Meeting