FairyFuse: Multiplication-Free LLM Inference on CPUs via Fused Ternary Kernels

by PaulHoule