🐙 Implements Flash Attention with attention sinks for gpt-oss-20b; includes test.py. Work in progress: backward pass, variable-length (varlen) support, and syncing with the community version to return only softmax_lse.
Topics: flash, attention, sink, with, flash-attention, attention-sink, flash-with, flash-sink, attention-with, with-sink, flash-attention-with, flash-attention-sink, flash-with-sink, attention-with-sink, flash-attention-with-sink
Updated Oct 21, 2025 - Python
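
For readers unfamiliar with the "sink" idea mentioned in the description, here is a minimal pure-Python reference sketch. It is not this repository's kernel or API. It assumes the gpt-oss-style formulation in which a per-head sink logit joins the softmax denominator but contributes no value vector. The function name `attention_with_sink`, its arguments, and the choice to fold the sink into the returned log-sum-exp are all assumptions made for illustration.

```python
import math


def attention_with_sink(q, k, v, sink, causal=True):
    """Single-head reference attention with a sink logit (hypothetical helper).

    q: list of query vectors; k, v: lists of key/value vectors.
    sink: scalar logit that enters the softmax denominator only.
    Returns (out, lse); lse includes the sink term, which is an assumption.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    offset = len(k) - len(q)  # align the last query with the last key
    out, lse = [], []
    for i, qi in enumerate(q):
        # Number of keys this query may attend to.
        n_visible = i + offset + 1 if causal else len(k)
        scores = [
            scale * sum(a * b for a, b in zip(qi, k[j]))
            for j in range(max(n_visible, 0))
        ]
        # Subtract the running maximum (sink included) for numerical stability.
        m = max(scores + [sink])
        weights = [math.exp(s - m) for s in scores]
        # The sink adds mass to the denominator but has no value vector.
        denom = sum(weights) + math.exp(sink - m)
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights)) / denom
            for c in range(len(v[0]))
        ])
        lse.append(m + math.log(denom))
    return out, lse
```

With a zero query, two keys, all-ones values, and `sink=0.0`, the three logits tie, so the output is 2/3 rather than 1: the sink absorbs one third of the probability mass. Setting `sink=float("-inf")` recovers ordinary softmax attention.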