Speculative KV coding: losslessly compressing KV cache by up to ~4×

by kkm