FlashAttention-2 Surpassed, 0.33 TB/s Linear O(n) Attention (Alphabase by SCS)

by GeometryKernel